Delay Adjustment
Every April, the NCI releases cancer statistics based on data submitted to the NCI in November of the previous year. For example, the statistics released in April 2024 are based on data available as of November 2023. Several months can pass between the diagnosis of a case and the time the cancer registry receives complete information on that case. In November 2023, information for cases diagnosed through 2021 is considered sufficiently complete to calculate official statistics, whereas data for diagnosis year 2022 still have a substantial undercount. Even with a lag of nearly two years to increase data completeness, new cases and additional information will continue to reach the cancer registries in the future. A model called the November submission delay model is used to account for this reporting delay.
To obtain more up-to-date information on cancer rates and trends, the model was extended to estimate the undercount of cases at an earlier look at the data. Data from nine months earlier (February 2024, rather than November 2024) were used to determine preliminary estimates of rates and trends through the 2022 diagnosis year. The preliminary estimates presented on the Preliminary Estimates for 2022 page are based on data submitted to the NCI in February, with the February submission delay model applied to those data. For more information about the delay model, visit Cancer Incidence Rates Adjusted for Reporting Delay.
The graph below shows the effect of delay adjustment as compared with observed data for all cancer sites combined. The February 2017 delay-adjusted rate is greater than the February 2017 observed rate because the adjustment makes up for the undercount of cases in the February 2017 submission. The February and November delay-adjusted rates are fairly similar, even though the size of the adjustment is much larger for February. This similarity provides some validation of the February submission delay model. For more information, see Validation.
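As a rough illustration of why the two adjusted rates can agree even when the adjustments differ in size, the sketch below treats delay adjustment as multiplying an observed rate by a delay factor. This multiplicative form, the function name delay_adjusted_rate, and every number are assumptions made for illustration only. They are not SEER values, and the actual model is described on the Cancer Incidence Rates Adjusted for Reporting Delay page.

```python
def delay_adjusted_rate(observed_rate: float, delay_factor: float) -> float:
    """Scale an observed rate by a multiplicative delay factor.

    Assumption for illustration: a factor of 1.0 means no undercount,
    and larger factors mean a larger expected undercount.
    """
    if delay_factor < 1.0:
        raise ValueError("delay factor is expected to be at least 1.0")
    return observed_rate * delay_factor


# Hypothetical rates per 100,000 for the same diagnosis year (not SEER data).
observed_feb = 430.0   # earlier look at the data, larger undercount
observed_nov = 445.0   # later look at the data, smaller undercount

# Hypothetical delay factors: the February factor is the larger of the two.
factor_feb = 1.06
factor_nov = 1.025

adjusted_feb = delay_adjusted_rate(observed_feb, factor_feb)
adjusted_nov = delay_adjusted_rate(observed_nov, factor_nov)

print(f"February: observed {observed_feb:.1f}, adjusted {adjusted_feb:.1f}")
print(f"November: observed {observed_nov:.1f}, adjusted {adjusted_nov:.1f}")
```

In this made-up case the observed rates differ by 15 per 100,000, while the adjusted rates differ by well under 1, which is the pattern the comparison of February and November submissions is meant to check.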
Selected Registries Based on Completeness
To further ensure that rates and trends derived from a February submission are as accurate as possible, a second step limits the analysis to registries that meet a specified completeness criterion. Completeness is a data quality measure used as part of the SEER Data Quality Program (DQP). It is measured by computing an expected count for each registry, estimated by projecting counts of cases from the series of initial November submissions (which occur 22 months after a calendar year ends) for an additional year, and then comparing this expected count with the observed count. The ratio of observed to expected case counts is called the "completeness measure"; the DQP standard is 95% for the subsequent February submission and 98% for the subsequent November submission. Of the full set of SEER 22 registries, 5 registries (Illinois, New York, Idaho, Texas, and Massachusetts) have not been part of SEER long enough to have a sufficient history of February submissions to estimate delay-adjusted rates. For the remaining 17 registries, the completeness ratio is computed, and registries that do not meet the specified criterion are eliminated from consideration. For the February 2024 submission, only registries that met a 95% completeness threshold are included in the preliminary estimates through 2022. While 95% will remain the DQP standard, in future years the threshold for including registries in the preliminary estimates may vary based on further evaluation.
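The selection step described above can be sketched as follows. The text does not say how counts are projected for the additional year, so the straight-line least-squares projection here is an assumption chosen only to make the example concrete. The function names, the registry labels "A" and "B", and all counts are hypothetical.

```python
from typing import Dict, List, Sequence


def project_next_count(counts: Sequence[float]) -> float:
    """Project one additional year from a series of yearly counts.

    Assumption: a least-squares straight line through the series.
    The DQP projection method is not specified in the text.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two yearly counts to project")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    slope = sxy / sxx
    # Evaluate the fitted line at x = n, one year past the last observation.
    return mean_y + slope * (n - mean_x)


def completeness(observed: float, expected: float) -> float:
    """Completeness measure: ratio of observed to expected case counts."""
    return observed / expected


def select_registries(
    november_history: Dict[str, Sequence[float]],
    observed: Dict[str, float],
    threshold: float = 0.95,
) -> List[str]:
    """Return registries whose completeness meets the threshold."""
    selected = []
    for registry, history in november_history.items():
        expected = project_next_count(history)
        if completeness(observed[registry], expected) >= threshold:
            selected.append(registry)
    return selected


# Hypothetical counts from initial November submissions, one per year.
history = {"A": [1000, 1020, 1040], "B": [500, 510, 520]}
# Hypothetical counts observed in the subsequent February submission.
observed_counts = {"A": 1030, "B": 490}

selected = select_registries(history, observed_counts, threshold=0.95)
print(selected)
```

With these made-up counts, registry A has an expected count of 1060 and a completeness of about 97%, so it is kept, while registry B has an expected count of 530 and a completeness of about 92%, so it is excluded at the 95% threshold. Passing threshold as a parameter reflects the note that the inclusion threshold may vary in future years.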