In previous years, statistics for race included Hispanics, except for the category of Non-Hispanic White. Starting with the November 2021 data submission, released in April 2022, race and ethnicity are reported in five mutually exclusive categories:
- Non-Hispanic White
- Non-Hispanic Black
- Non-Hispanic Asian/Pacific Islander (API)
- Non-Hispanic American Indian/Alaska Native (AI/AN)
- Hispanic
This change in reporting resulted in a small change for most groups but a larger increase in rates for AI/AN. The larger increase in rates for Non-Hispanic AI/AN reflects the removal of Hispanic AI/AN, a group that had very low incidence rates. These low rates for Hispanic AI/AN may be influenced by several factors, including how missing race is assigned by the Census and misclassification of race in the cancer data, both of which introduce some uncertainty into the Hispanic-by-race population estimates. Reporting incidence in the five mutually exclusive categories is consistent with mortality reporting from the National Center for Health Statistics (NCHS) and presents a clearer picture of risk in the AI/AN population. SEER does not recommend producing rates for Hispanic AI/AN or Hispanic API.
Spanish-Hispanic-Latino Ethnicity
Incidence data for Hispanics are based on the NAACCR Hispanic Identification Algorithm (NHIA). SEER no longer excludes cases from the Alaska Native Tumor Registry when producing Hispanic and Non-Hispanic incidence rates.
For state exclusions that SEER uses when producing Hispanic (and Non-Hispanic) mortality rates, see Policy for Calculating Hispanic Mortality.
American Indian/Alaska Native Statistics
When producing statistics using SEER incidence data for American Indians/Alaska Natives, SEER frequently includes only cases from a Purchased/Referred Care Delivery Area (PRCDA). We also recommend limiting these statistics to Non-Hispanic AI/AN.
In SEER incidence and NCHS mortality databases, the PRCDA 2020 variable is used starting with data through 2020. Refer to the information on Purchased/Referred Care Delivery Area (PRCDA) for variables used in previous submissions of data.
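The two restrictions described above (PRCDA residence and Non-Hispanic origin) can be sketched as a simple filter. This is only an illustration: the `Case` record and its field names `race_and_origin` and `in_prcda` are hypothetical and do not correspond to actual SEER variable names or codes.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical label for illustration; not an actual SEER code value.
NH_AIAN = "Non-Hispanic American Indian/Alaska Native"


@dataclass
class Case:
    race_and_origin: str  # hypothetical field holding the merged race/origin label
    in_prcda: bool        # hypothetical flag: case resides in a PRCDA county


def select_aian_cases(cases: Iterable[Case]) -> List[Case]:
    """Keep only Non-Hispanic AI/AN cases that fall inside a PRCDA."""
    return [c for c in cases if c.in_prcda and c.race_and_origin == NH_AIAN]


cases = [
    Case(NH_AIAN, True),
    Case(NH_AIAN, False),            # dropped: outside a PRCDA
    Case("Hispanic (All races)", True),  # dropped: not Non-Hispanic AI/AN
]
selected = select_aian_cases(cases)
```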
Race/Ethnicity Variable Definitions in SEER Data
Race and origin (recommended by SEER)
Starting with the November 2022 data submission, SEER includes a system-supplied merged variable, "Race and origin (recommended by SEER)". It includes the five mutually exclusive race and ethnicity categories SEER uses for reporting cancer statistics.
Algorithms for Creating Variable Definitions
The following describes the algorithms for creating the race and origin recode variables in the SEER incidence and U.S. mortality data.
Race Recode
- We recoded detailed race information into four major categories to make it compatible with the available annual population estimates used as denominators for the rates: White, Black, American Indian/Alaska Native, and Asian/Pacific Islander.
- For some years, both the SEER incidence and NCHS mortality data had a code available for "all other races". Because every race was already represented by another code, the "all other races" code was not needed. These cases are now coded as "unknown" race.
- The race recodes in the SEER incidence data are created from the Race1 and Indian Health Service (IHS) Link variables. If Race1 is white, unknown, or other and the IHS Link is positive, then race/ethnicity is set to American Indian/Alaska Native; otherwise, race/ethnicity is set to the Race1 value.
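The IHS Link rule in the last bullet can be sketched in a few lines of Python. This is a minimal illustration, not SEER's implementation: it assumes Race1 has already been collapsed to a text label, and the function name `recode_race`, the parameter `ihs_link_positive`, and the label strings are all hypothetical (the actual data use coded values).

```python
# Hypothetical labels; the real Race1 and IHS Link variables use coded values.
AIAN = "American Indian/Alaska Native"
OVERRIDABLE = {"White", "Unknown", "Other"}


def recode_race(race1: str, ihs_link_positive: bool) -> str:
    """Apply the IHS Link rule: a positive link reassigns White, Unknown,
    or Other to AI/AN; every other case keeps its Race1 value."""
    if ihs_link_positive and race1 in OVERRIDABLE:
        return AIAN
    return race1


examples = [
    recode_race("White", True),    # positive link overrides White
    recode_race("Black", True),    # positive link does not override Black
    recode_race("Unknown", False), # no link, Race1 value is kept
]
```

Note that under this rule a positive IHS Link never changes a case already recorded as Black or Asian/Pacific Islander.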
Origin Recode
Incidence data for Hispanics are based on the NAACCR Hispanic Identification Algorithm (NHIA) and are recoded into two main categories for the Origin Recode NHIA variable: Non-Spanish-Hispanic-Latino and Spanish-Hispanic-Latino.
Race and Origin Recode
From the two fields above, SEER provides the Race and Origin Recode variable with the following values:
- Non-Hispanic White
- Non-Hispanic Black
- Non-Hispanic American Indian/Alaska Native
- Non-Hispanic Asian or Pacific Islander
- Hispanic (All races)
- Non-Hispanic Unknown Race
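One plausible way to derive these values from the Race Recode and Origin Recode fields is sketched below. The combination logic is an assumption inferred from the category names (Hispanic origin takes precedence over race), and the function name `race_and_origin` and the input label strings are hypothetical; the actual variables use coded values.

```python
# Hypothetical mapping from race recode labels to merged category labels.
NON_HISPANIC_LABELS = {
    "White": "Non-Hispanic White",
    "Black": "Non-Hispanic Black",
    "American Indian/Alaska Native": "Non-Hispanic American Indian/Alaska Native",
    "Asian or Pacific Islander": "Non-Hispanic Asian or Pacific Islander",
}


def race_and_origin(race_recode: str, origin_recode: str) -> str:
    """Combine the two recodes into one mutually exclusive category.

    Assumption: any case with Spanish-Hispanic-Latino origin is Hispanic
    regardless of race; other cases are labeled by race, and anything
    not in the four major categories falls into unknown race.
    """
    if origin_recode == "Spanish-Hispanic-Latino":
        return "Hispanic (All races)"
    return NON_HISPANIC_LABELS.get(race_recode, "Non-Hispanic Unknown Race")


hispanic = race_and_origin("White", "Spanish-Hispanic-Latino")
nh_black = race_and_origin("Black", "Non-Spanish-Hispanic-Latino")
nh_unknown = race_and_origin("Unknown", "Non-Spanish-Hispanic-Latino")
```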