Background

The National Cancer Institute Surveillance Research Program (NCI-SRP) has developed a new quality control tool that allows Surveillance, Epidemiology, and End Results (SEER) registry managers to monitor the quality of their registry data. The Median/Multiple Outlier Testing Method (MMOT) is a benchmarking tool that identifies outliers using multiple hypothesis testing based on the median of the statistic of interest (e.g., proportion unknown). The goal of this method is to identify any outlier data point from a SEER registry for a given diagnosis year. This allows registry managers to investigate issues in data collection, coding, or registry operations that go beyond characteristics attributable to the underlying patient population.

The Median/Multiple Outlier Testing Method (MMOT)

  1. Define the population of interest (e.g., primary site, diagnosis years).
  2. Calculate the statistic of interest (e.g., proportion unknown for summary stage).
  3. Use multiple hypothesis testing to define upper and lower boundaries based on the median; values outside these boundaries are identified as outliers.
    • If a registry's value for a given year falls above the upper boundary, its statistic of interest is higher than that of the other registries.
    • Similarly, if a registry's value for a given year falls below the lower boundary, its statistic of interest is lower than that of the other registries.
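
The three steps above can be sketched in code. The source does not state which test statistic or multiplicity correction MMOT uses, so the sketch below makes its own assumptions: a normal approximation to the binomial centered on the median proportion, with a Bonferroni correction across registries. The function name `flag_outliers`, the `alpha` parameter, and the registry counts are hypothetical and for illustration only; this is not the NCI implementation.

```python
from statistics import NormalDist, median


def flag_outliers(counts, alpha=0.05):
    """Flag registries whose proportion lies outside median-based boundaries.

    counts maps a registry id to (numerator, denominator) for one diagnosis
    year; every denominator is assumed to be greater than zero.
    ASSUMPTION: normal approximation plus Bonferroni correction. The real
    MMOT test may differ.
    """
    props = {reg: num / den for reg, (num, den) in counts.items()}
    med = median(props.values())
    # Bonferroni: split alpha across all registries and both tails.
    z_crit = NormalDist().inv_cdf(1 - alpha / (2 * len(counts)))
    flags = {}
    for reg, (_, den) in counts.items():
        # Boundary width depends on each registry's own case count.
        half_width = z_crit * (med * (1 - med) / den) ** 0.5
        if props[reg] > med + half_width:
            flags[reg] = "above"
        elif props[reg] < med - half_width:
            flags[reg] = "below"
        else:
            flags[reg] = "within"
    return flags


# Made-up (unknown, total) counts for six de-identified registries.
example_counts = {
    "R1": (50, 1000), "R2": (55, 1000), "R3": (48, 1000),
    "R4": (52, 1000), "R5": (200, 1000), "R6": (5, 1000),
}
example_flags = flag_outliers(example_counts)
```

With these illustrative counts, R5 is flagged as above the upper boundary, R6 as below the lower boundary, and the remaining registries fall within the boundaries. Because the boundaries are centered on the median rather than the mean, a single extreme registry does not shift the reference value.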

Data Quality Monitoring Plan

Currently, SEER routinely uses the MMOT tool to evaluate the following data items from the most recent November submission. However, benchmarking analyses are not limited to these data items.

Data Item | Schema | SSDI Recode #
----------|--------|--------------
Breslow Tumor Thickness | Melanoma, Skin | 3817 R
Estrogen Receptor Summary | Breast | 3827 R
Progesterone Receptor Summary | Breast | 3915 R
HER2 Overall Summary | Breast | 3855 R
Gleason Score Clinical | Prostate | 3840 R
Gleason Score Pathological | Prostate | 3841 R
Gleason Patterns Clinical | Prostate | 3838 R
Gleason Patterns Pathological | Prostate | 3839 R
Prostate Specific Antigen (PSA) Lab Value | Prostate | 3920 R
KRAS | Colorectal | 3866 R

Results are posted on the QIE portal with registries de-identified and are also emailed to registry managers. Each registry manager will be emailed their registry's identification number.

Defining Proportion Unknown and Aggressive

Proportions were calculated using the Site-Specific Data Item (SSDI) recode definitions to examine their distribution over time. The table below briefly describes how the proportions unknown and aggressive are calculated. For more detailed information, please refer to the SSDI recode definitions available on the Site-Specific Data Item Recodes page. Selection criteria for each data element analyzed using this tool are provided with the results.

Data Item | Schema | SSDI Recode # | Proportion Unknown: Numerator / Denominator | Proportion Aggressive*: Numerator / Denominator
----------|--------|---------------|---------------------------------------------|------------------------------------------------
Breslow Tumor Thickness | Melanoma, Skin | 3817 R | XX.9 / 0.0-9.7, XX.0, XX.9 | 4.0-9.79, XX.0 / 0.0-9.79, XX.0
Estrogen Receptor Summary | Breast | 3827 R | 7, 9 / 0, 1, 7, 9 | 0 / 0, 1
Progesterone Receptor Summary | Breast | 3915 R | 7, 9 / 0, 1, 7, 9 | 0 / 0, 1
HER2 Overall Summary | Breast | 3855 R | 7, 9 / 0, 1, 7, 9 | 1 / 0, 1
Gleason Score Clinical | Prostate | 3840 R | X9 / 02-10, X7, X9 | 09, 10 / 02-10, X7
Gleason Score Pathological | Prostate | 3841 R | X9 / 02-10, X7, X9 | 09, 10 / 02-10, X7
Gleason Patterns Clinical | Prostate | 3838 R | 19, 29, 39, 49, 59, X6, X9 / 11-59, X6, X7, X9 | 45, 54, 55 / 11-15, 21-25, 31-34, 41-45, 51-55, X7
Gleason Patterns Pathological | Prostate | 3839 R | 19, 29, 39, 49, 59, X6, X9 / 11-59, X6, X7, X9 | 45, 54, 55 / 11-15, 21-25, 31-34, 41-45, 51-55, X7
Prostate Specific Antigen (PSA) Lab Value | Prostate | 3920 R | XXX.2, XXX.3, XXX.7, XXX.9 / 0.1-97.9, XXX.0, XXX.9 | 20.0-97.9, XXX.0 / 0.1-97.9, XXX.0
KRAS | Colorectal | 3866 R | 7, 9 / 0, 5, 7, 9 | 5 / 0, 5
*Cases coded as "N/A" or "Test ordered, results not in chart" are removed.
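
As a worked illustration of the numerator/denominator definitions, the sketch below computes both proportions for Estrogen Receptor Summary (3827 R) from a tally of recode values. The code sets come from the table; the helper name `proportion` and the case counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def proportion(code_counts, numerator_codes, denominator_codes):
    """Return the share of denominator cases whose code is in the numerator set.

    code_counts maps a recode value (as a string) to its number of cases.
    Returns None when the denominator is empty.
    """
    num = sum(code_counts.get(code, 0) for code in numerator_codes)
    den = sum(code_counts.get(code, 0) for code in denominator_codes)
    return num / den if den else None


# Code sets for Estrogen Receptor Summary, taken from the table.
ER_UNKNOWN = ({"7", "9"}, {"0", "1", "7", "9"})
ER_AGGRESSIVE = ({"0"}, {"0", "1"})

# Made-up case counts per recode value for one registry and diagnosis year.
er_counts = {"0": 150, "1": 800, "7": 20, "9": 30}

p_unknown = proportion(er_counts, *ER_UNKNOWN)        # (20 + 30) / 1000
p_aggressive = proportion(er_counts, *ER_AGGRESSIVE)  # 150 / 950
```

Here the proportion unknown is 50/1000 = 0.05, and the proportion aggressive is 150/950, about 0.158. The aggressive denominator excludes codes 7 and 9, so cases with unknown values do not dilute it.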