Race Recode Changes
For Data through 2003 (November 2005 Submission)
The algorithms for creating the race recode variables in the SEER
incidence and US mortality data were modified starting with the
November 2005 submission of data. All of the variable names
within the SEER*Stat and SEER*Prep software
have been modified for clarity and to avoid compatibility issues
between submissions of data.
For incidence and mortality rate calculations, we have recoded detailed
race information into four major categories to make it
compatible with the annual population estimates used as denominators
for the rates. These categories are:
- White
- Black
- American Indian/Alaskan Native
- Asian or Pacific Islander
The available race codes for the fields in the underlying incidence
and mortality data have changed over the years. For some years,
both the SEER incidence and NCHS mortality data included a code
for “all other races” even though every race was already
represented by another code, so the “all other races” code
was not needed. Cases/deaths were nevertheless coded to this category. In
the past, when creating the race recodes, all cases/deaths with the “all
other races” value were treated as Asian or Pacific Islander. These
cases/deaths are now assigned to a new race code in all of the race
recodes. This code is labeled “Other – unspecified
(1991+)” for incidence data and “Other – unspecified
(1978-1991)” for mortality data. This race category does
not have associated populations and is treated similarly to “unknown”
race in most cases.
If you are interested in reproducing our previous methodology, you
can simply group the "Other – unspecified (year range)"
category with the appropriate category (depending on the race recode
you are working with).
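As a rough illustration of that regrouping, the sketch below folds any "Other – unspecified (year range)" label back into a chosen target category. The function name, the label strings, and the sample records are hypothetical; the actual labels and the appropriate target depend on the race recode and database you are working with.

```python
from collections import Counter

# Assumed label prefix; real recode labels may differ by variable and data type.
OTHER_UNSPECIFIED_PREFIX = "Other - unspecified"


def legacy_race_group(race_recode: str, target: str) -> str:
    """Return `target` for any 'Other - unspecified (...)' label, else the label unchanged."""
    # Normalize an en dash to a hyphen so both spellings of the label match.
    normalized = race_recode.replace("\u2013", "-")
    if normalized.startswith(OTHER_UNSPECIFIED_PREFIX):
        return target
    return race_recode


# Hypothetical records; under the earlier methodology the target was
# Asian or Pacific Islander.
records = [
    "White",
    "Other \u2013 unspecified (1991+)",
    "Asian or Pacific Islander",
    "Black",
]
counts = Counter(legacy_race_group(r, "Asian or Pacific Islander") for r in records)
```

Passing the target as a parameter keeps the sketch usable with recodes whose categories are grouped differently.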
The “Race/ethnicity” variable used to create the race
recodes in the SEER incidence data has been revised. Previously,
this field was simply Race1 from the NAACCR
file format. Now
this field is created from Race1, Race2, and the Indian Health Service
(IHS) Link variable. Race/ethnicity
starts as Race1. If Race1 is white and Race2 is a specified
non-white race, then the value from Race2 is used. After this
check, if Race/ethnicity is still white and there is a positive IHS
Link, then Race/ethnicity is set to American Indian/Alaskan Native.
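The derivation described above can be sketched as a small function. This is only an illustration under stated assumptions: the code values ("01" for white, "03" for American Indian/Alaskan Native) and the set of values treated as "not a specified race" (blank, "88", "99") are placeholders chosen for the example, and the function and parameter names are hypothetical rather than taken from SEER software.

```python
# Assumed code values for illustration only; check the file format
# documentation for the codes that apply to your data.
WHITE = "01"
AIAN = "03"  # American Indian/Alaskan Native
NOT_SPECIFIED = {"", "88", "99"}  # assumed: blank, no further race, unknown


def derive_race_ethnicity(race1: str, race2: str, ihs_link_positive: bool) -> str:
    """Derive Race/ethnicity from Race1, Race2, and the IHS Link result."""
    # Step 1: Race/ethnicity starts as Race1.
    race = race1

    # Step 2: if Race1 is white and Race2 is a specified non-white race,
    # use the value from Race2.
    if race == WHITE and race2 != WHITE and race2 not in NOT_SPECIFIED:
        race = race2

    # Step 3: if the result is still white and the IHS Link is positive,
    # set Race/ethnicity to American Indian/Alaskan Native.
    if race == WHITE and ihs_link_positive:
        race = AIAN

    return race
```

Note the order of the checks: a specified non-white Race2 takes precedence, so the IHS Link only changes records that remain white after the Race2 check.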
Spanish-Hispanic-Latino Ethnicity
Hispanic ethnicity is not mutually exclusive of the White, Black,
Asian/Pacific Islander, and American Indian/Alaska Native race categories.
Incidence data for Hispanics are based on the NAACCR
Hispanic Identification Algorithm (NHIA). When producing statistics
using SEER incidence data for Hispanic ethnicity, we exclude
cases from Hawaii, Seattle, the Alaska Native Registry, and Kentucky.
For state exclusions that SEER uses when producing Hispanic
(and non-Hispanic) mortality rates, see Policy
for Calculating Hispanic Mortality.
Combining Race and Ethnicity in Rate Analyses
Some SEER incidence and mortality databases in SEER*Stat
are now linked to both race (White, Black, AI/AN, API) and Hispanic
origin within the same database. While this provides the ability
to produce rates for the eight combinations of these variables, the SEER
Program does not recommend using all of the combinations. SEER
reports Hispanic and non-Hispanic rates only for all races
combined, white, and non-white.
American Indian/Alaskan Native Statistics
When producing statistics using SEER incidence data for American
Indians/Alaska Natives, SEER includes only cases from Connecticut,
Detroit, Iowa, New Mexico, Seattle, Utah, Atlanta, and the Alaska
Native Registry, and excludes cases diagnosed in 2003.
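The inclusion rule for these statistics can be expressed as a simple filter. The sketch below is illustrative only: the registry names are written as plain strings matching the list above, and the function name and record layout are hypothetical, not the identifiers used in SEER*Stat.

```python
# Registries listed above as contributing to American Indian/Alaska Native
# statistics (names as plain strings; assumed, not SEER*Stat identifiers).
AIAN_REGISTRIES = {
    "Connecticut",
    "Detroit",
    "Iowa",
    "New Mexico",
    "Seattle",
    "Utah",
    "Atlanta",
    "Alaska Native Registry",
}
EXCLUDED_DIAGNOSIS_YEARS = {2003}


def include_in_aian_statistics(registry: str, year_of_diagnosis: int) -> bool:
    """Return True if a case meets the registry and diagnosis-year criteria."""
    return (
        registry in AIAN_REGISTRIES
        and year_of_diagnosis not in EXCLUDED_DIAGNOSIS_YEARS
    )


# Hypothetical cases as (registry, year of diagnosis) pairs.
cases = [("Iowa", 2001), ("Hawaii", 2001), ("Utah", 2003)]
kept = [c for c in cases if include_in_aian_statistics(*c)]
```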