Attention: SEER is not taking any new requests for specialized databases. We will be launching a new data request system soon which will include updated databases. Please check back later to submit a new request.
The specialized Census Tract-level SES and Rurality Database (2006-2018) has five census tract-level attributes: two socioeconomic status (SES) quintiles, two rurality variables, and persistent poverty.
SEER also offers combined databases with census tract-level SES/Rurality fields and additional data items, such as watchful waiting in prostate cancers and HPV status in head and neck cancers. For information on these databases, refer to the list of Specialized Databases.
The database is available in Case Listing, Frequency, Rate, and Survival sessions in SEER*Stat for the November 2020 data submission.
- This database includes all tumor records for 2006-2018.
- This database is identical to the SEER Research Plus database other than the additional SES/Rurality fields.
- There are no geographic identifiers included in this database due to confidentiality concerns.
- It does not include Alaska Native Tumor Registry data.
Variable Definitions and Analytical Considerations
Two census tract-level SES quintile variables are provided for assessing SES differences in cancer incidence and survival. The first, referred to as the “Yost State-based quintile”, facilitates the comparison of cancer patients of different SES within the same state. The second, referred to as the “Yost US-based quintile”, facilitates SES comparison across the entire US.
Both quintiles are constructed with the same two-step approach using census tract-level American Community Survey (ACS) 5-year estimates; the only difference is that the state-based quintile is constructed separately for each state from state-specific data, whereas the US-based quintile is constructed once for the entire US. In the first step, composite SES scores (also referred to as the SES index) are estimated for census tracts by a factor analysis of seven variables that measure different aspects of a tract's SES (Yu et al., 2014). The variables, chosen following Yost et al. (2001), are: median household income, median house value, median rent, percent below 150% of the poverty line, Education Index (Liu et al., 1998), percent working class, and percent unemployed. The same variables are used to construct the NCI's county-level time-dependent SES index, and their definitions are given in the Time-Dependent County-level Attributes section. In the second step, census tracts are categorized into SES quintiles with equal populations in each quintile, either within a state (State-based) or across the entire US (US-based). The first quintile (the group with the lowest SES) covers the 20th percentile and below, and the fifth quintile (the group with the highest SES) covers the 80th percentile and above.
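The second step can be sketched in a few lines of Python. This is a minimal illustration, not the NCI's implementation: it assumes the composite SES scores from the factor analysis are already available, and the function name, the input layout, and the rule of placing a tract by the midpoint of its cumulative population share are all assumptions made for the example.

```python
def assign_population_quintiles(tracts):
    """Assign SES quintiles so each quintile holds roughly equal population.

    tracts: list of (tract_id, ses_score, population) tuples (hypothetical layout).
    Returns a dict mapping tract_id to a quintile from 1 (lowest SES) to 5.
    """
    total = sum(population for _, _, population in tracts)
    if total <= 0:
        raise ValueError("total population must be positive")

    # Order tracts from lowest to highest SES score.
    ordered = sorted(tracts, key=lambda tract: tract[1])

    quintiles = {}
    cumulative = 0
    for tract_id, _score, population in ordered:
        # Assumption: place a tract by the midpoint of its cumulative
        # population share, so a tract straddling a cut point goes to the
        # quintile that contains most of its residents.
        midpoint_share = (cumulative + population / 2) / total
        quintiles[tract_id] = min(int(midpoint_share * 5) + 1, 5)
        cumulative += population
    return quintiles


# State-based quintiles would apply this to each state's tracts separately;
# US-based quintiles would apply it once to all tracts.
example = [("A", -1.2, 100), ("B", -0.5, 100), ("C", 0.0, 100),
           ("D", 0.7, 100), ("E", 1.5, 100)]
print(assign_population_quintiles(example))
```

Because the cut points are defined on cumulative population rather than on the number of tracts, a few very populous tracts can fill a quintile on their own.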
After SES quintiles are generated from each set of ACS 5-year estimates, they are linked to tumor cases at the census tract level by matching the ACS survey period with the tumor diagnosis year. Tumors diagnosed in 2006-2007 are linked to SES quintiles calculated from ACS 2006-2010 data. Tumors diagnosed in each year from 2008 through 2017 are linked to SES quintiles based on the ACS 5-year period centered on the diagnosis year: 2006-2010, 2007-2011, 2008-2012, 2009-2013, 2010-2014, 2011-2015, 2012-2016, 2013-2017, 2014-2018, and 2015-2019, respectively. Finally, tumors diagnosed in 2018 are linked to the quintiles estimated from ACS 2015-2019 data. All SES quintiles are defined using the Decennial Census 2010 census tract boundaries.
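The linkage rule above amounts to centering the ACS 5-year period on the diagnosis year and clamping at the earliest and latest periods used. A small sketch follows; the function name is hypothetical and only the year ranges come from the text.

```python
def acs_period_for_diagnosis_year(year):
    """Return the (start, end) ACS 5-year period linked to a diagnosis year."""
    if not 2006 <= year <= 2018:
        raise ValueError("the database covers diagnosis years 2006-2018")
    # Center the 5-year period on the diagnosis year, then clamp to the
    # earliest (2006-2010) and latest (2015-2019) periods listed in the text.
    start = min(max(year - 2, 2006), 2015)
    return (start, start + 4)


for diagnosis_year in (2006, 2007, 2008, 2012, 2017, 2018):
    print(diagnosis_year, acs_period_for_diagnosis_year(diagnosis_year))
```

With this rule, diagnosis years 2006, 2007, and 2008 all map to ACS 2006-2010, and 2017 and 2018 both map to ACS 2015-2019.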
Two census tract-level rurality variables are provided to facilitate analyses of urban/rural differences in cancer incidence and survival (Moss et al., 2019). The first is based on the U.S. Department of Agriculture (USDA)’s Rural Urban Commuting Area (RUCA) codes and has two categories: Urban area commuting focused (codes 1.0, 1.1, 2.0, 2.1, 3.0, 4.1, 5.1, 7.1, 8.1, and 10.1) and Not urban area commuting focused (all other codes). The second, referred to as the Urban Rural Indicator Code (URIC), is based on the Census Bureau’s percent of the population living in non-urban areas and has four categories: 100% urban (All urban), ≥50% but <100% urban (Mostly urban), >0% but <50% urban (Mostly rural), and 100% rural (All rural) tracts. The two-category RUCA measure is the one most commonly used in health research papers that use RUCA-based measures. The four-category Census-based measure reflects the rural nature of the immediate environment and may be most relevant for studies that focus on behaviors and risk. It can be collapsed into two- or three-category versions in several ways and thus gives the researcher a good deal of flexibility. These measures are also compatible with the rurality measures available with the NAACCR Cancer in North America database. For both rurality variables, the same 2010 values, defined using the Decennial Census 2010 census tract boundaries, are used for all years.
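The two classifications can be written out as simple lookups. The sketch below uses the code list and cut points given above; the function names and the choice to express URIC in terms of percent urban are assumptions made for illustration.

```python
# RUCA codes listed in the text as "Urban area commuting focused".
URBAN_COMMUTING_RUCA_CODES = frozenset(
    {"1.0", "1.1", "2.0", "2.1", "3.0", "4.1", "5.1", "7.1", "8.1", "10.1"}
)


def ruca_category(ruca_code):
    """Collapse a RUCA code into the two-category measure."""
    # Normalize to one decimal place so 1, "1", and "1.0" are treated alike.
    normalized = f"{float(ruca_code):.1f}"
    if normalized in URBAN_COMMUTING_RUCA_CODES:
        return "Urban area commuting focused"
    return "Not urban area commuting focused"


def uric_category(percent_urban):
    """Map a tract's percent urban population to the four URIC categories."""
    if not 0 <= percent_urban <= 100:
        raise ValueError("percent_urban must be between 0 and 100")
    if percent_urban == 100:
        return "All urban"
    if percent_urban >= 50:
        return "Mostly urban"
    if percent_urban > 0:
        return "Mostly rural"
    return "All rural"


print(ruca_category("4.1"), "|", ruca_category("4.0"))
print(uric_category(100), "|", uric_category(62.5), "|", uric_category(0))
```

Collapsing URIC to two categories could, for example, merge "All urban" with "Mostly urban" and "Mostly rural" with "All rural"; that grouping is one of several possible choices, not one prescribed by the source.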
The persistent poverty variable identifies census tracts as persistently poor if 20% or more of the population has lived below the poverty level for a period spanning about 30 years, based on the 1990 and 2000 decennial censuses and the 2007-11 and 2015-19 American Community Survey 5-year estimates. It was developed by the National Cancer Institute in collaboration with the US Department of Agriculture, Economic Research Service (ERS). This variable has two levels: census tracts classified as persistent poverty or non-persistent poverty. The same variable is used for all years.
Note that census tracts may have inflated poverty rates, and thus be more likely to be identified as persistent poverty, if postsecondary undergraduate students make up a significant portion of the resident population in poverty. Postsecondary students tend to report low incomes, and their poverty is uniquely situational compared with that of other groups in poverty. At the same time, a portion of the student population may come from poor families, and their poverty is potentially more chronic. Given this complexity, please use caution in interpreting the results.
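One way to read this definition is that a tract qualifies only if its poverty rate is 20% or more at each of the four measurement points. The sketch below encodes that reading; the function name, the dictionary keys, and the requirement that all four points be present are assumptions, and the actual NCI/ERS procedure may differ in detail.

```python
# The four data sources named in the text (labels are illustrative).
MEASUREMENT_POINTS = ("1990", "2000", "2007-11", "2015-19")


def is_persistent_poverty(poverty_rates, threshold=20.0):
    """Return True if the poverty rate meets the threshold at every point.

    poverty_rates: dict mapping each measurement point to the percent of the
    tract population below the poverty level (hypothetical input layout).
    """
    missing = [point for point in MEASUREMENT_POINTS if point not in poverty_rates]
    if missing:
        raise ValueError(f"missing poverty rates for: {missing}")
    return all(poverty_rates[point] >= threshold for point in MEASUREMENT_POINTS)


tract = {"1990": 24.1, "2000": 21.3, "2007-11": 20.0, "2015-19": 26.8}
print(is_persistent_poverty(tract))
```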
For detailed information about Persistent Poverty, refer to USDA ERS - Rural Poverty & Well-Being.
Bridged Single-race Population Denominators for Census Tracts
The SEER census tract SES incidence database supports the calculation of incidence rates by census tract SES quintile (or tertile), race/ethnicity, single year of diagnosis from 2006-2018, 5-year age grouping (i.e., 0-4 years, 5-9, 10-14, ..., 80-84, and 85 and older), and gender. The race/ethnicity categories are Non-Hispanic (NH) White, NH-Black, NH-American Indian and Alaska Native (AIAN), NH-Asian Pacific Islander (API), and Hispanic.
The population denominator estimates are produced by Woods & Poole Economics, Inc. (W&P) based on a hybrid regression, demographic, and proportional model jointly developed by the NCI, W&P, and the North American Association of Central Cancer Registries (NAACCR) with support from NCI through a contract. When tracts are collapsed to counties, the estimates match the Census Bureau’s Vintage 2019 bridged single-race population estimates for 2010-2018 and the intercensal population estimates for 2006-2009. Uncertainty in these estimates is not reflected. Caution should be exercised in using these estimates, especially when the sample is small.
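To show how the denominators enter a rate calculation, here is a minimal sketch that sums case counts and population estimates over age groups within each SES quintile and reports a crude rate per 100,000. The record layout and function name are hypothetical, the numbers are made up for illustration, and the result is a crude rate rather than an age-adjusted one.

```python
from collections import defaultdict


def crude_rates_by_quintile(records, per=100_000):
    """Compute crude incidence rates per `per` population for each quintile.

    records: iterable of (quintile, age_group, cases, population) tuples
    (hypothetical layout; population is the tract-based denominator).
    """
    cases = defaultdict(int)
    population = defaultdict(int)
    for quintile, _age_group, n_cases, n_population in records:
        cases[quintile] += n_cases
        population[quintile] += n_population
    # Skip quintiles with no population to avoid dividing by zero.
    return {
        quintile: cases[quintile] * per / population[quintile]
        for quintile in cases
        if population[quintile] > 0
    }


# Made-up counts, for illustration only.
records = [
    (1, "0-4", 2, 10_000),
    (1, "5-9", 3, 15_000),
    (5, "0-4", 1, 20_000),
]
print(crude_rates_by_quintile(records))
```

With small case counts such as these, rates are unstable, which is one reason for the caution noted above.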
Linked to the Specialized Census Tract SES/Rurality Database are population estimates for census tracts in SEER 18 areas excluding Alaska. Estimates for U.S. tracts in SEER and non-SEER areas are also available to anyone who requests them and agrees to certain standard data use conditions. Refer to U.S. Census Tract Population Data to learn more about the methods used and how to submit a request.
References
Moss JL, Stinchcomb DG, Yu M. Providing higher resolution indicators of rurality in the Surveillance, Epidemiology, and End Results (SEER) database: Implications for patient privacy and research. Cancer Epidemiol Biomarkers Prev. 2019 Jun 14. [Epub ahead of print]
Yu M, Tatalovich Z, Gibson JT, Cronin KA. Using a composite index of socioeconomic status to investigate health disparities while protecting the confidentiality of cancer registry data. Cancer Causes Control. 2014 Jan;25(1):81-92.
Yost K, Perkins C, Cohen R, Morris C, Wright W. Socioeconomic status and breast cancer incidence in California for different race/ethnic groups. Cancer Causes Control. 2001 Oct;12(8):703-11.
Liu L, Deapen D, Bernstein L. Socioeconomic status and cancers of the female breast and reproductive organs: a comparison across racial/ethnic populations in Los Angeles County, California (United States). Cancer Causes Control. 1998 Aug;9(4):369-80.