Create a table showing the number of women in the SEER 18 Registries diagnosed with a malignant breast cancer in the 5-year period prior to January 1, 2018. That is, create a table showing January 1, 2018, 5-year limited-duration prevalence (crude percents and counts) for the SEER 18 Registries. Include only one primary for each woman.
Show the results by time prior to prevalence date. Include five one-year discrete time intervals prior to prevalence date: less than or equal to 1 year, greater than 1 but less than or equal to 2 (denoted in reports as >1-2), >2-3, >3-4, and >4-5 years. Since we are only interested in the first malignant breast cancer diagnosis in the entire 5-year period, a woman will count once in the total period and in only one interval.
Include standard errors, confidence intervals, and survival tables in the results.
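The interval scheme described above can be sketched in a few lines of Python. This is only an illustration of the bucketing rule, not SEER*Stat's implementation: the function name `interval_label`, the label strings, and the use of fractional years are assumptions made for this example.

```python
from bisect import bisect_left
from typing import Optional

# Upper bounds (in years prior to the prevalence date) of the five
# discrete, non-overlapping intervals. Names here are hypothetical.
UPPER_BOUNDS = [1, 2, 3, 4, 5]
LABELS = ["<=1", ">1-2", ">2-3", ">3-4", ">4-5"]


def interval_label(years_prior: float) -> Optional[str]:
    """Return the interval for a diagnosis made `years_prior` years before
    the prevalence date, or None if it falls outside the 5-year window."""
    if years_prior < 0:
        return None  # diagnosed after the prevalence date
    # bisect_left places a value equal to a bound in the lower interval,
    # which matches "greater than 1 but less than or equal to 2".
    index = bisect_left(UPPER_BOUNDS, years_prior)
    if index >= len(LABELS):
        return None  # more than 5 years prior
    return LABELS[index]
```

Because each diagnosis maps to exactly one label, a woman counted once in the total period also falls into only one interval.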
Key Points
- The selection of cases differs somewhat in Limited-Duration Prevalence sessions compared to other sessions. The process of selecting records in prevalence sessions involves both the Selection Tab and certain settings on the Statistic Tab. Specifically, selections related to the date of diagnosis are made based on the settings for prevalence date and duration. Both of these are set on the Statistic Tab.
- In this exercise, we will use "First Primary Matching Selection Criteria" as the Multiple Primary Selection option. This will select the first primary cancer that matches the selection criteria on both the Selection and Statistic tabs. That is, the first malignant breast cancer diagnosed for each female within the five-year period prior to the prevalence date (January 1, 2018).
- When calculating prevalence percents, we need a population estimate for the prevalence date. Populations provided with SEER databases are mid-year population estimates. The January 1, 2018 populations are estimated by averaging the 2017 and 2018 populations.
- Our prevalence session for this exercise will include fewer survival cohort variables than usual. Since this is a sex- and site-specific analysis, we do not need to include sex or site as survival cohort variables. Our selections will only include female breast cancer cases for the entire analysis including survival. Therefore, the survival estimates used in this exercise will be appropriate. Likewise, we are only including 5 years of data in the analysis; this period falls into one grouping in the standard year variable used for survival cohorts.
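As a rough illustration of the "First Primary Matching Selection Criteria" idea described in the key points, the sketch below picks, for each woman, the earliest malignant breast tumor diagnosed inside the window defined by the prevalence date and duration. The `Tumor` record, its field names, and the exact boundary handling of the window are assumptions for this example; SEER*Stat applies this logic internally and its rules may differ in detail. Vital status at the prevalence date is not modeled here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable


@dataclass(frozen=True)
class Tumor:  # hypothetical record layout, not a SEER file format
    patient_id: str
    sex: str
    site: str
    malignant: bool
    dx_date: date


def first_matching_primary(
    tumors: Iterable[Tumor], prevalence_date: date, duration_years: int
) -> Dict[str, Tumor]:
    """Return, per patient, the earliest tumor that satisfies both the
    selection statements and the diagnosis window."""
    # Assumes the prevalence date is not February 29.
    window_start = prevalence_date.replace(
        year=prevalence_date.year - duration_years
    )
    first: Dict[str, Tumor] = {}
    for t in tumors:
        matches = (
            t.sex == "Female"
            and t.site == "Breast"
            and t.malignant
            # Assumed boundary rule: start inclusive, prevalence date exclusive.
            and window_start <= t.dx_date < prevalence_date
        )
        if not matches:
            continue
        current = first.get(t.patient_id)
        if current is None or t.dx_date < current.dx_date:
            first[t.patient_id] = t
    return first
```

Note that a woman with an earlier breast cancer diagnosed before the window is still selected here, using her first diagnosis that falls inside the window.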
Step 1: Create a New Prevalence Session
- Start SEER*Stat.
- From the File menu, select New > Limited-Duration Prevalence Session, or use the corresponding button on the toolbar.
Step 2: Data Tab
On the Data Tab:
- Select the database "Incidence - SEER Research Data, 18 Registries, Nov 2020 Sub (2000-2018)".
Step 3: Selection Tab
On the Selection tab:
- Edit the "Race, Sex, Registry, County (Pop, Case Files)" selection statement to read:
{Race, Sex, Year Dx.Sex} = 'Female'
- Edit the "Other (Case Files)" selection statement to read:
{Site and Morphology.Site recode ICD-O-3/WHO 2008} = 'Breast'
- Set the Multiple Primary Selection drop-down list to "First Primary Matching Selection Criteria". Because of the Multiple Primary option used in this exercise, the Malignant Only standard checkbox will have no impact on this analysis; however, we will leave it checked.
- The Exclude Death Certificate Only and Autopsy Only Cases option is always on in Limited-Duration Prevalence sessions. Since SEER*Stat uses the counting method, these cases are never considered prevalent. This option prevents them from being included in the survival calculations.
Step 4: Statistic Tab
On the Statistic tab:
- Set the Prevalence Date to January 1, 2018.
- Limit the Prevalence Duration to 5 years.
- Now your analysis is set up to calculate the number of individuals who were alive on January 1, 2018, and who were diagnosed with cancer during the five years prior to that date.
- Set the Statistic to Crude Percent (counts will be included in the results).
- This option specifies that your results matrix will include the number of prevalent cases on this date (the count), an estimate of the total population on that date, and the proportion of the population that the count represents.
- Mark the checkbox to Display By Time Prior to Prevalence Date.
- Set the Cut-Points to "Number of Years Prior", and enter "1,2,3,4" in the field.
- Select Discrete (Non-Overlapping) Intervals.
- The preceding options set up your analysis to include the five one-year discrete time intervals mentioned in the problem statement. This will allow you to see how many of the prevalent cases on January 1, 2018 were diagnosed within each one-year interval of the five years prior to that date.
- Mark the checkbox to Show Standard Errors and Confidence Intervals, and set the Confidence Limits to "95%".
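The arithmetic behind the Crude Percent statistic can be sketched as follows: estimate the population on the prevalence date by averaging two mid-year population estimates, then express the prevalent count as a percentage of that population. The function names and the numbers in the usage note are illustrative assumptions, not SEER*Stat functions or SEER data, and the sketch does not attempt to reproduce the standard errors or confidence intervals that SEER*Stat reports.

```python
def jan1_population(midyear_before: float, midyear_after: float) -> float:
    """Estimate a January 1 population as the average of the mid-year
    estimates on either side of it (hypothetical helper)."""
    return (midyear_before + midyear_after) / 2


def crude_percent(prevalent_count: float, population: float) -> float:
    """Prevalent cases as a percentage of the estimated population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * prevalent_count / population
```

For example, with made-up mid-year populations of 1,000 and 1,100, the estimated January 1 population is 1,050, and 21 prevalent cases would correspond to a crude percent of 2.0.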
Step 5: Table Tab
No variables are needed in the page, row, or column on the Table tab. The time intervals will be shown in the table rows.
Step 6: Survival Cohorts Tab
Add the following system-supplied variables to the Cohort Variables box:
- Age recode (<60, 60-69, 70+) - based on the age at diagnosis recode variable (Age recode with < 1 year olds)
- Race and origin recode (NHW, NHB, NHAIAN, NHAPI, Hispanic) - no total
Step 7: Output Tab
On the Output tab:
- Enter a title for your matrix, such as:
SEER 18, 5-year Limited Duration Prevalence Estimates
Malignant Breast Cancers, Females Only
First Malignant Breast Cancer Diagnosed Within the Previous 5 Years
By Time Since Diagnosis
- Mark the Survival Summary Used for Lost Cases and Detailed Survival Used for Lost Cases checkboxes. These will add tables to your matrix that display the survival statistics used to adjust for cases lost to follow-up prior to the prevalence date.
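To give a sense of how survival statistics can adjust a count for cases lost to follow-up, here is a minimal sketch of one common formulation of the counting method: cases known to be alive on the prevalence date count fully, and each lost case contributes its estimated probability of surviving from last contact to the prevalence date. This is an assumption about the general approach, not a description of SEER*Stat's exact computation; the function names and inputs are hypothetical.

```python
from typing import Iterable, Tuple


def conditional_survival(s_at_last_contact: float, s_at_prevalence: float) -> float:
    """Probability of surviving to the prevalence date, given survival
    to the date of last contact (ratio of cumulative survival values)."""
    if not 0 < s_at_last_contact <= 1:
        raise ValueError("survival at last contact must be in (0, 1]")
    return s_at_prevalence / s_at_last_contact


def adjusted_prevalent_count(
    known_alive: int, lost_cases: Iterable[Tuple[float, float]]
) -> float:
    """Known-alive cases plus the expected number of lost cases still alive.

    Each lost case is a pair (survival at last contact, survival at the
    prevalence date), taken from the survival table for its cohort.
    """
    expected_alive = sum(conditional_survival(a, b) for a, b in lost_cases)
    return known_alive + expected_alive
```

The cohort variables chosen on the Survival Cohorts tab determine which survival table a lost case would draw its probabilities from.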
Step 8: Execute the Session
- Use the Execute button on the toolbar, or select Execute from the Session menu, to execute the session. You will see two warnings:
- The first one will warn you that your session's Year variable might not be represented in the survival cohort table, and that the results may therefore be misleading. In this case, the Year variable is indeed not represented. Since we are only using five years of data, we have intentionally chosen not to calculate survival by year. Click OK to continue.
- The second warning is displayed because you chose the "First Primary Matching Selection Criteria" option on the Selection tab. It reminds you that, in a Limited-Duration Prevalence session, the "selection criteria" includes not only the statements on the Selection tab, but also the Prevalence Date and Prevalence Duration that you defined on the Statistic tab. Click OK to continue.
- Note that this second warning has a checkbox labeled Do not show this message in future. You can mark this checkbox before clicking OK to prevent this warning from being displayed when you execute future sessions. If you have done so in the past, it will not be displayed now.
- A dialog will display the progress of the job. When the job completes, a new window will open containing the output table or matrix.
Step 9: Check Your Results
Compare your results to this SEER*Stat matrix file: key.prev2a.spm. In this key, additional statistics are shown (you can use Matrix Options to display these statistics).