The time-dependent census tract attributes include a socioeconomic status (SES) index, the seven tract-level SES attributes used to construct the index, and additional attributes. The attributes and the index are estimated at multiple time points using a series of American Community Survey (ACS) 5-year estimates covering 2006 to 2019. The census tract geography for these data is the one used in the 2006-10 ACS. At each time point, an SES index is constructed from these attributes using factor analysis (Yu et al. 2014).
The selection of attributes for constructing the SES index is based on Yost et al. (2001). Below are the tables and formulas used to calculate the base variables that go into the SES score. The tables are from the 2015-2019 American Community Survey (ACS) 5-year estimates. Similar tables were used from the 2006-2010, 2007-2011, 2008-2012, 2009-2013, 2010-2014, 2011-2015, 2012-2016, 2013-2017, and 2014-2018 ACS.
Median Household Income
Median household income is taken from table B19013: Median Household Income in the Past 12 Months (in 2019 inflation adjusted dollars). The inflation adjustment differs by data source; for example, 2014-18 ACS median income is in 2018 inflation-adjusted dollars.
Median House Value
Median house value is taken from table B25077: Median Value (Dollars).
Median Gross Rent
Median gross rent is taken from table B25064: Median Gross Rent (Dollars).
Percent Below 150% Poverty
The percent below 150% of poverty line is calculated using table C17002: Ratio of Income to Poverty Level in the Past 12 Months. The formula for this is:
- Below 150% poverty: ((C17002e02 + C17002e03 + C17002e04 + C17002e05) / C17002e01) * 100
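As a minimal sketch of the formula above, the Python function below computes the percentage for one tract. It assumes the C17002 estimates are available in a dictionary keyed by the names used in the formula. The function name, the sample counts, and the choice to return `None` for a tract with a zero denominator are illustrative assumptions, not part of the source method.

```python
def percent_below_150_poverty(row):
    """Percent of a tract's population with income below 150% of the poverty line."""
    total = row["C17002e01"]  # denominator used in the formula
    if not total:
        return None  # assumption: leave tracts with an empty denominator undefined
    # e02 through e05 are the four numerator cells named in the formula
    below = sum(row[f"C17002e{i:02d}"] for i in range(2, 6))
    return below / total * 100


# Hypothetical counts for a single tract
tract = {
    "C17002e01": 4000,
    "C17002e02": 200,
    "C17002e03": 300,
    "C17002e04": 250,
    "C17002e05": 250,
}
print(percent_below_150_poverty(tract))  # 25.0
```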
Education Index
The education index is calculated using table B15002: Sex by Educational Attainment for the Population 25 Years and Over. The percentages of persons with less than a high school education, with high school only, and with more than high school are calculated and then combined as follows:
- Less than HS grad: ((B15002e03 + ... + B15002e10 + B15002e20 + ... + B15002e27) / B15002e01) * 100
- HS only: ((B15002e11 + B15002e28) / B15002e01) * 100
- More than HS grad: ((B15002e12 + ... + B15002e18 + B15002e29 + ... + B15002e35) / B15002e01) * 100
- Education Index: (Less than HS grad * 9) + (HS only * 12) + (More than HS grad * 16)
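The four formulas above can be sketched in Python as follows. This is an illustration under stated assumptions: the ellipses are read as inclusive ranges of consecutive cells (e03 through e10, and so on), the function and constant names are hypothetical, and the sample counts are invented. Because the three inputs are percentages, the index as written falls between 900 and 1600 whenever they sum to 100.

```python
# Cell numbers read from the formulas; "..." is taken to mean an inclusive range.
LESS_THAN_HS = [*range(3, 11), *range(20, 28)]   # e03..e10 and e20..e27
HS_ONLY = [11, 28]
MORE_THAN_HS = [*range(12, 19), *range(29, 36)]  # e12..e18 and e29..e35


def education_index(row):
    """Weighted sum of the three education percentages for one tract."""
    total = row["B15002e01"]
    if not total:
        return None  # assumption: undefined when the denominator is zero

    def pct(cells):
        return sum(row[f"B15002e{i:02d}"] for i in cells) / total * 100

    return pct(LESS_THAN_HS) * 9 + pct(HS_ONLY) * 12 + pct(MORE_THAN_HS) * 16


# Hypothetical tract: 20 persons below HS, 30 HS only, 50 above HS
row = {f"B15002e{i:02d}": 0 for i in range(1, 36)}
row.update({"B15002e01": 100, "B15002e03": 20, "B15002e11": 30, "B15002e12": 50})
print(education_index(row))
```

For this sample the three percentages are 20, 30, and 50, so the index is 20 * 9 + 30 * 12 + 50 * 16.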
Percent Working Class
The percent working class is calculated using table C24010: Sex by Occupation for the Civilian Employed Population 16 Years and Over. The formula for this is:
- Working Class: ((C24010e20 + C24010e24 + C24010e25 + C24010e26 + C24010e27 + C24010e30 + C24010e34 + C24010e56 + C24010e60 + C24010e61 + C24010e62 + C24010e63 + C24010e66 + C24010e70) / C24010e01) * 100
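Because the numerator involves fourteen cells, keeping the cell numbers in one constant makes the calculation easier to check against the formula. The sketch below does that; the names and the sample counts are hypothetical, and returning `None` for a zero denominator is an assumption.

```python
# Numerator cells copied from the working class formula
WORKING_CLASS_CELLS = (20, 24, 25, 26, 27, 30, 34, 56, 60, 61, 62, 63, 66, 70)


def percent_working_class(row):
    """Percent of the civilian employed population in working class occupations."""
    total = row["C24010e01"]
    if not total:
        return None  # assumption: undefined when no one is employed
    working = sum(row[f"C24010e{i:02d}"] for i in WORKING_CLASS_CELLS)
    return working / total * 100


# Hypothetical tract with counts in only two of the fourteen cells
row = {f"C24010e{i:02d}": 0 for i in WORKING_CLASS_CELLS}
row.update({"C24010e01": 2000, "C24010e20": 150, "C24010e56": 350})
print(percent_working_class(row))  # 25.0
```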
Percent Unemployed
The percent of persons ages 16 and over who are unemployed is calculated using table B23025: Employment Status for the Population 16 Years and Over. The percent unemployed is calculated for civilians in the labor force. Persons in the armed forces or not in the labor force are not included in the calculation. The formula used is:
- Unemployed: (B23025e05 / B23025e03) * 100
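A short sketch shows why the denominator matters: the rate is taken over the civilian labor force (B23025e03), not over everyone aged 16 and over (B23025e01). The function name and the counts are hypothetical, and the zero-denominator handling is an assumption.

```python
def percent_unemployed(row):
    """Unemployed civilians as a percent of the civilian labor force."""
    civilian_labor_force = row["B23025e03"]
    if not civilian_labor_force:
        return None  # assumption: undefined when there is no civilian labor force
    return row["B23025e05"] / civilian_labor_force * 100


# Hypothetical tract
row = {
    "B23025e01": 3000,  # population 16 years and over
    "B23025e02": 2050,  # in labor force
    "B23025e03": 2000,  # civilian labor force (denominator)
    "B23025e04": 1750,  # employed civilians
    "B23025e05": 250,   # unemployed civilians (numerator)
    "B23025e06": 50,    # armed forces, excluded
    "B23025e07": 950,   # not in labor force, excluded
}
print(percent_unemployed(row))  # 12.5
```

Dividing by B23025e01 instead would give about 8.3 for the same tract, which understates unemployment among people who are actually in the civilian labor force.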
The following variables are included but are not used in creating the index.
Percent Persons Below Poverty
For each time point, the percent of persons below poverty is calculated for all races; White alone; Black or African American alone; American Indian and Alaska Native alone; Asian alone + Native Hawaiian and Other Pacific Islander alone; Some other race alone + two or more races; White alone (not Hispanic or Latino); and Hispanic or Latino. The counts come from tables B17001 and B17001A-B17001I: Poverty Status in the Past 12 Months by Sex and Age. The formulas used are:
- All races: (B17001e02 / B17001e01) * 100
- White alone: (B17001Ae02 / B17001Ae01) * 100
- Black or African American alone: (B17001Be02 / B17001Be01) * 100
- American Indian and Alaska Native alone: (B17001Ce02 / B17001Ce01) * 100
- Asian alone + Native Hawaiian and Other Pacific Islander alone: ((B17001De02 + B17001Ee02) / (B17001De01 + B17001Ee01)) * 100
- Some other race alone + two or more races: ((B17001Fe02 + B17001Ge02) / (B17001Fe01 + B17001Ge01)) * 100
- White alone (not Hispanic or Latino): (B17001He02 / B17001He01) * 100
- Hispanic or Latino: (B17001Ie02 / B17001Ie01) * 100
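The combined groups pool counts from two tables before dividing, rather than averaging two percentages. The sketch below illustrates this with a mapping from each group to its table suffixes; the mapping name, the function name, and the sample counts are hypothetical.

```python
# Table suffixes per group, following the formulas above ("" is table B17001 itself)
POVERTY_GROUPS = {
    "All races": [""],
    "White alone": ["A"],
    "Black or African American alone": ["B"],
    "American Indian and Alaska Native alone": ["C"],
    "Asian alone + Native Hawaiian and Other Pacific Islander alone": ["D", "E"],
    "Some other race alone + two or more races": ["F", "G"],
    "White alone (not Hispanic or Latino)": ["H"],
    "Hispanic or Latino": ["I"],
}


def percent_below_poverty(row, suffixes):
    """Pool numerators and denominators across tables, then take the percent."""
    below = sum(row[f"B17001{s}e02"] for s in suffixes)
    total = sum(row[f"B17001{s}e01"] for s in suffixes)
    if not total:
        return None  # assumption: undefined when the group has no population
    return below / total * 100


# Hypothetical counts for the two tables behind one combined group
row = {"B17001De01": 200, "B17001De02": 30, "B17001Ee01": 50, "B17001Ee02": 10}
group = "Asian alone + Native Hawaiian and Other Pacific Islander alone"
print(percent_below_poverty(row, POVERTY_GROUPS[group]))
```

Here the pooled value is 40 out of 250 persons, whereas the simple mean of the two separate percentages (15 and 20) would be 17.5.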
NAACCR Poverty Indicator
This variable is created from the percent of persons below poverty for all races above. The variable has four categories: less than 5 percent, 5 percent to less than 10 percent, 10 percent to less than 20 percent and 20 percent or more.
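The four categories can be expressed as a simple threshold function. This is a sketch only: the function name is hypothetical, the returned labels repeat the wording above rather than any coded values, and passing through `None` for a missing percentage is an assumption.

```python
def naaccr_poverty_category(pct_below_poverty):
    """Map the all-races percent below poverty to one of the four categories."""
    if pct_below_poverty is None:
        return None  # assumption: no category when the percent is undefined
    if pct_below_poverty < 5:
        return "less than 5 percent"
    if pct_below_poverty < 10:
        return "5 percent to less than 10 percent"
    if pct_below_poverty < 20:
        return "10 percent to less than 20 percent"
    return "20 percent or more"


print(naaccr_poverty_category(12.4))  # 10 percent to less than 20 percent
```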
2010 Rural-Urban Commuting Area (RUCA) Codes
Two Rural-Urban Commuting Area variables are constructed from the 2010 data available from the U.S. Department of Agriculture Economic Research Service. The data were used to create a variable with 4 categories (urban, large rural, small rural and isolated rural) and a variable with 2 categories (urban and rural). The same 2010 values are used for all years.
2010 Census-based Urban Rural Indicator Codes (URIC)
The Urban Rural Indicator Code (URIC) is an urban-rural classification variable constructed from the 2010 U.S. Census’s percent of the population living in non-urban areas with four categories: 100% urban (All urban), ≥50% but <100% urban (Mostly urban), >0% but <50% urban (Mostly rural), and 100% rural (All rural) tracts. Rural population encompasses all population not included within an urban area. This measure reflects the rural nature of the immediate environment and may be most relevant for studies that focus on behaviors and risk, e.g., cancer prevention and screening studies. The same 2010 values are used for all years.
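The category boundaries above can be sketched as a function of the percent of a tract's population living in urban areas (that is, 100 minus the percent living in non-urban areas). The function name and the use of text labels as return values are assumptions for illustration.

```python
def uric_category(pct_urban):
    """Classify a tract by the percent of its population living in urban areas."""
    if not 0 <= pct_urban <= 100:
        raise ValueError("pct_urban must be between 0 and 100")
    if pct_urban == 100:
        return "All urban"
    if pct_urban >= 50:
        return "Mostly urban"
    if pct_urban > 0:
        return "Mostly rural"
    return "All rural"


print(uric_category(50))    # Mostly urban
print(uric_category(49.9))  # Mostly rural
```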
Cancer Reporting Zone
Cancer reporting zone is a new field identifying custom geographic areas that have similar populations and are large enough to support stable cancer rates and minimize suppression. This field is defined for cancer registries that have defined zones for their catchment areas. A crosswalk from 2010 census tract to cancer reporting zone is available for these areas and is used to assign the cancer reporting zone based on the geocoded census tract. The field value uniquely identifies a cancer reporting zone across registries; it consists of the 2-digit state FIPS code followed by an 8-character cancer reporting zone code unique to a registry’s catchment area.
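As an illustration of the identifier layout described above, the sketch below splits a 10-character value into its state FIPS prefix and zone code. The function name and the sample value are hypothetical, and the internal format of the 8-character zone code is not assumed.

```python
def split_zone_id(zone_id):
    """Split a cancer reporting zone ID into (state FIPS, 8-character zone code)."""
    if len(zone_id) != 10:
        raise ValueError("expected 2-digit state FIPS plus 8-character zone code")
    state_fips, zone_code = zone_id[:2], zone_id[2:]
    if not state_fips.isdigit():
        raise ValueError("state FIPS prefix must be two digits")
    return state_fips, zone_code


# Hypothetical identifier; real zone codes are defined by each registry
print(split_zone_id("06ZONE0001"))  # ('06', 'ZONE0001')
```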
Census Tract Required for Cancer Reporting Zone
This variable can be used in conjunction with census tract certainty when selecting cases for cancer reporting zone analyses. The selection includes cases that have high tract-level certainty (1, 2, 6) and cases that have low tract-level certainty but do not need a tract to assign a cancer reporting zone.
CDC Standardized Sub-County Geographies
The Centers for Disease Control and Prevention’s National Environmental Public Health Tracking Program created standardized sub-county geographies that are comparable over time, place, and outcomes. These standardized sub-county geographies use census tracts as the foundation, have a hierarchical structure, and generally nest within county boundaries. They were created using the Geographic Aggregation Tool, which merges areas based on the nearest population-weighted centroid until a specified population threshold is reached. There are three sub-county variables, one for each population threshold: 5,000, 20,000, and 50,000. The same values are used for all years.
Persistent Poverty Census Tracts
Persistent poverty census tracts are defined by the U.S. Department of Agriculture as areas where 20 percent or more of the residents were poor as measured by each of the 1980, 1990, and 2000 censuses and the 2007-11 American Community Survey 5-year average. This definition was used to create a 2-level variable (census tract classified as persistent poverty or non-persistent poverty). The same variable is used for all years.
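The definition requires the 20 percent threshold to be met at every one of the four measurement points, not just on average. A minimal sketch, with a hypothetical function name and invented rates:

```python
def is_persistent_poverty(poverty_rates):
    """True if the tract was at or above 20 percent poor at all four time points.

    poverty_rates: percent poor in the 1980, 1990, and 2000 censuses and the
    2007-11 ACS, in that order.
    """
    if len(poverty_rates) != 4:
        raise ValueError("expected four poverty rates")
    return all(rate >= 20 for rate in poverty_rates)


print(is_persistent_poverty([24.1, 22.0, 20.0, 26.3]))  # True
print(is_persistent_poverty([24.1, 19.9, 21.5, 26.3]))  # False
```

The second tract averages above 20 percent but falls below the threshold at one time point, so it is classified as non-persistent poverty.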
CDC/ATSDR Social Vulnerability Index
The Agency for Toxic Substances and Disease Registry (ATSDR) uses data from fifteen U.S. census variables to develop the Social Vulnerability Index (SVI). The overall index is created from four related themes: Socioeconomic Status, Household Composition, Race/Ethnicity/Language, and Housing/Transportation. The 2018 state-based and U.S.-based values for the four themes and the overall index are used for all years.
Congressional Districts
Congressional districts are identified by a 2-character numeric state FIPS code and a 2-character district code. This generates congressional district IDs that are unique across the U.S. The district code used by the Census Bureau for at-large states is 00. The District of Columbia and the Commonwealth of Puerto Rico have the district code 98, which identifies their status with respect to representation in Congress:
- 01 to 53—Congressional district codes
- 00—At large (single district for state)
- 98—Nonvoting delegate
In Connecticut, Illinois, and Michigan, the state participant did not assign congressional districts to cover all of the state or equivalent area. The code “ZZ” has been assigned to areas with no congressional district defined (usually large water bodies). Within each state, these unassigned areas are treated as a single congressional district for purposes of data presentation.
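The coding scheme described above can be sketched as a small parser that splits a 4-character congressional district ID and interprets the district code. The function name and the returned labels are hypothetical; the code values (00, 01 to 53, 98, ZZ) are the ones listed in this section.

```python
def describe_district(cd_id):
    """Split a congressional district ID into (state FIPS, district code meaning)."""
    if len(cd_id) != 4 or not cd_id[:2].isdigit():
        raise ValueError("expected 2-digit state FIPS plus 2-character district code")
    state_fips, district = cd_id[:2], cd_id[2:]
    if district == "00":
        kind = "at large"
    elif district == "98":
        kind = "nonvoting delegate"
    elif district == "ZZ":
        kind = "no congressional district defined"
    elif district.isdigit() and 1 <= int(district) <= 53:
        kind = "congressional district"
    else:
        raise ValueError(f"unrecognized district code: {district}")
    return state_fips, kind


print(describe_district("0612"))  # ('06', 'congressional district')
print(describe_district("1198"))  # ('11', 'nonvoting delegate')
print(describe_district("09ZZ"))  # ('09', 'no congressional district defined')
```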
References
Yost K, Perkins C, Cohen R, Morris C, Wright W. Socioeconomic status and breast cancer incidence in California for different race/ethnic groups. Cancer Causes Control. 2001;12(8):703-711. doi:10.1023/a:1011240019516
Yu M, Tatalovich Z, Gibson JT, Cronin KA. Using a composite index of socioeconomic status to investigate health disparities while protecting the confidentiality of cancer registry data. Cancer Causes Control. 2014;25(1):81-92.
Werner AK, Strosnider H, Kassinger C, Shin M; Sub-County Data Project Workgroup. Lessons Learned From the Environmental Public Health Tracking Sub-County Data Pilot Project. J Public Health Manag Pract. 2018;24(5):E20-E27. doi:10.1097/PHH.0000000000000686
Werner AK, Strosnider HM. Developing a surveillance system of sub-county data: Finding suitable population thresholds for geographic aggregations. Spat Spatiotemporal Epidemiol. 2020;33:100339. doi:10.1016/j.sste.2020.100339