Attention: SEER is not taking any new requests for specialized databases. We will be launching a new data request system soon which will include updated databases. Please check back later to submit a new request.

Human papillomavirus (HPV) infection is a prognostic factor for certain Head and Neck malignant tumors. SEER has collected data on the HPV status of patients with Head and Neck tumors, as defined by the following Collaborative Stage (CS) Data Collection System, version 02.05, schemas: Hypopharynx, Nasopharynx, Oropharynx, Pharyngeal Tonsil, Pharynx Other, Palate Soft, Tongue Base. Currently, data are available for patients diagnosed between 2010 and 2017 in SEER areas. The HPV status information received from SEER registries has been recoded into the following categories: 1) HPV Negative; 2) HPV Positive; 3) Unknown/NA.

Database Details

There are two Head and Neck with HPV Status databases available to request. Both are available in the Case Listing, Frequency, and Survival sessions in SEER*Stat for the November 2020 data submission. Apart from the specialized fields, they are identical to the SEER Research Plus database. The two databases are:

  1. Head and Neck with HPV Status database
    • This database is linked to county-level attributes, including county-level SES, rurality, and demographics.
    • It includes all tumor records from 2000-2018, but the HPV status field is available only for 2010-2017.
    • HPV status is not available for tumors diagnosed in 2018 because CS Site-Specific Factor 10, the data element used to derive HPV status for all seven schemas, was discontinued starting in 2018.
  2. Head and Neck with HPV Status and Census Tract-level SES/Rurality Combined Database
    • Due to concerns over confidentiality, geographic identifiers, such as state, registry identifier, and county, are excluded from the combined database.
    • It does not include Alaska Native Tumor Registry data.
    • It includes all tumor records from 2006-2018, but the HPV status field is available only for 2010-2017.
    • For detailed information about census tract-level SES and rurality variables, refer to the Census Tract-level SES and Rurality Database.

Data Limitations and Analytical Considerations

Incompleteness of the HPV status

SEER registries have continued to improve their collection of HPV status, and the proportion of cases with known status (HPV Positive or HPV Negative) increased significantly between 2010, the first year of data collection, and 2016, the most current year for which data are available. However, the proportion of cases with unknown status remains non-negligible, especially for cases diagnosed in earlier years. For example, HPV status was unknown for 73% of new oropharynx cancer cases in 2010 and 28% in 2016. Because the distribution of HPV status (positive cases as a proportion of positive plus negative cases) is very likely not the same among cases with known status as it is among cases with unknown status, estimates of HPV-positive incidence (rate) or prevalence would be biased.

In addition, because the proportion of cases with unknown HPV status varies over time, trends in HPV-positive incidence would likely be biased. For these reasons, use of the dataset to estimate HPV-positive incidence or prevalence, or the trend in incidence by HPV status, is strongly discouraged.
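
The effect of unknown status can be illustrated with a short calculation. The sketch below is not part of any SEER tool. It uses the unknown-status proportions quoted above for oropharynx cases (73% in 2010, 28% in 2016) together with a purely hypothetical HPV-positive share of 70% among cases with known status, and computes the range in which the HPV-positive share of all cases could lie, depending on how many of the unknown cases are in fact positive. The function name and the 70% value are assumptions made for illustration only.

```python
def hpv_positive_share_bounds(known_positive_share, unknown_fraction):
    """Bound the HPV-positive share among ALL cases.

    known_positive_share: positive / (positive + negative) among cases
        with known status.
    unknown_fraction: fraction of all cases whose status is unknown.
    Returns (lower, upper): lower assumes every unknown case is negative,
    upper assumes every unknown case is positive.
    """
    for name, value in (("known_positive_share", known_positive_share),
                        ("unknown_fraction", unknown_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    # Cases recorded as HPV Positive, as a fraction of all cases.
    positives_among_all = known_positive_share * (1.0 - unknown_fraction)
    lower = positives_among_all
    upper = positives_among_all + unknown_fraction
    return lower, upper


# Hypothetical value, NOT taken from SEER data: it is held constant across
# years so that any change in the result comes only from missing data.
ASSUMED_KNOWN_POSITIVE_SHARE = 0.70

# Unknown-status proportions for oropharynx cases quoted in the text above.
UNKNOWN_FRACTION_BY_YEAR = {2010: 0.73, 2016: 0.28}

bounds_by_year = {
    year: hpv_positive_share_bounds(ASSUMED_KNOWN_POSITIVE_SHARE, unknown)
    for year, unknown in UNKNOWN_FRACTION_BY_YEAR.items()
}

for year, (lower, upper) in bounds_by_year.items():
    print(f"{year}: HPV-positive share of all cases lies between "
          f"{lower:.1%} and {upper:.1%}")
```

Under these assumptions, cases recorded as HPV Positive make up about 19% of all cases in 2010 and about 50% in 2016, even though the share among cases with known status is held fixed. An apparent increase of this kind reflects only the improved completeness of HPV status, which is why incidence and trend estimates based on recorded HPV status are discouraged.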

Limited Follow-up Duration

While NCI releases data back to 2010 to allow for the calculation of 5-year survival using large patient datasets, estimation of trends in survival is not supported because longer follow-up time is required to calculate survival estimates at several time points.