SEER*Stat Prevalence Exercise 2a
Problem Statement
Create a table showing the number of women in the SEER 17 Registries diagnosed with a
malignant breast cancer in the 5-year period prior to January 1, 2005. That is, create a
table showing January 1, 2005, 5-year limited-duration prevalence (crude percents and
counts) for the SEER 17 Registries. Include only one primary for each woman.
Show the results by time prior to prevalence date. Include five one-year discrete time
intervals prior to prevalence date: less than or equal to 1 year, greater than 1 but less
than or equal to 2 (denoted in reports as >1-2), >2-3, >3-4, and >4-5 years.
Since we are only interested in the first malignant breast cancer diagnosis in the entire
5-year period, a woman will count once in the total period and in only one interval.
Include standard errors, confidence intervals, and survival tables in the results.
Key Points and Reminders
The selection of cases differs somewhat in Limited-Duration Prevalence sessions compared
to other sessions. The process of selecting records in prevalence sessions involves both
the Selection Tab and certain settings on the Statistic Tab. Specifically, selections
related to the date of diagnosis are made based on the settings for prevalence date and
duration. Both of these are set on the Statistic Tab.
The database used in this exercise has "+ Hurricane Katrina Impacted Louisiana Cases" and
"<Katrina/Rita Population Adjustment>" in the name. In prevalence sessions, the Katrina-impacted
cases and Katrina/Rita-adjusted populations in this database have no impact on the analysis, because
the latest available prevalence date in the database is July 1, 2005, which is prior to the hurricanes.
In this exercise, we will use "First Primary Matching Selection Criteria" as the
Multiple Primary Selection option. This will select the first primary cancer that matches
the selection criteria on both the Selection and Statistic tabs. That is, the first
malignant breast cancer diagnosed for each female within the five-year period prior to the
prevalence date (January 1, 2005).
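The selection logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SEER*Stat's implementation: the record fields (`patient_id`, `sex`, `site`, `malignant`, `dx_date`) and the exact handling of the window boundaries are assumptions made for the example.

```python
from datetime import date


def first_matching_primary(records, prevalence_date, duration_years):
    """Return one record per patient: the earliest diagnosis meeting all criteria.

    Field names and boundary handling are hypothetical, chosen for illustration.
    """
    # Start of the look-back window (works for a January 1 prevalence date).
    window_start = prevalence_date.replace(year=prevalence_date.year - duration_years)
    first = {}
    for rec in records:
        matches = (
            rec["sex"] == "Female"
            and rec["site"] == "Breast"
            and rec["malignant"]
            and window_start <= rec["dx_date"] < prevalence_date
        )
        if not matches:
            continue
        best = first.get(rec["patient_id"])
        if best is None or rec["dx_date"] < best["dx_date"]:
            first[rec["patient_id"]] = rec
    return first


# Made-up records: patient A has an earlier breast primary outside the window.
records = [
    {"patient_id": "A", "sex": "Female", "site": "Breast", "malignant": True, "dx_date": date(1998, 6, 1)},
    {"patient_id": "A", "sex": "Female", "site": "Breast", "malignant": True, "dx_date": date(2002, 3, 1)},
    {"patient_id": "B", "sex": "Female", "site": "Breast", "malignant": True, "dx_date": date(2003, 9, 1)},
    {"patient_id": "B", "sex": "Female", "site": "Breast", "malignant": True, "dx_date": date(2001, 2, 1)},
    {"patient_id": "C", "sex": "Female", "site": "Colon", "malignant": True, "dx_date": date(2002, 5, 1)},
]
selected = first_matching_primary(records, date(2005, 1, 1), 5)
```

In this made-up data, patient A is represented by the 2002 diagnosis, because the 1998 diagnosis falls outside the five-year window and so does not match the criteria; patient C is not selected at all.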
When calculating prevalence percents, we need a population estimate for the prevalence
date. Populations provided with SEER databases are mid-year population estimates. The
January 1, 2005 populations are estimated by averaging 2004 and 2005 populations.
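As a small worked example of the arithmetic just described, the sketch below averages two mid-year populations to approximate the January 1 population and then computes a crude percent. The function names and all numbers are hypothetical and are not taken from the SEER database.

```python
def january_1_population(pop_prior_midyear, pop_same_midyear):
    """Approximate the January 1 population as the mean of two mid-year estimates."""
    return (pop_prior_midyear + pop_same_midyear) / 2


def crude_percent(prevalent_count, population):
    """Prevalent cases as a percentage of the population on the prevalence date."""
    return 100.0 * prevalent_count / population


# Hypothetical mid-year populations for 2004 and 2005.
pop_2004 = 1_000_000
pop_2005 = 1_020_000

pop_jan1_2005 = january_1_population(pop_2004, pop_2005)
pct = crude_percent(5_050, pop_jan1_2005)  # 5,050 is a made-up prevalent count
print(pop_jan1_2005, pct)
```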
Our prevalence session for this exercise will include fewer survival cohort variables than usual.
Since this is a sex- and site-specific analysis, we do not need to include sex or site as survival cohort
variables. Our selections will only include female breast cancer cases for the entire
analysis including survival. Therefore, the survival estimates used in this exercise will
be appropriate. Likewise, we are only including 5 years of data in the analysis; this
period falls into one grouping in the standard year variable used for survival
cohorts.
Step 1: Create a New Prevalence Session
- Start SEER*Stat.
- From the File menu, select New > Limited-Duration Prevalence Session, or use the
corresponding button on the toolbar.
Step 2: Data Tab
On the Data Tab:
- Select the database "Incidence - SEER 17 Regs Limited-Use + Hurricane Katrina Impacted Louisiana Cases, Nov 2007 Sub (2000-2005) <Katrina/Rita Population Adjustment>".
Step 3: Selection Tab
On the Selection tab:
- Edit the "Race, Sex, Registry, County (Pop, Case Files)" selection statement to read:
{Race, Sex, Year Dx, Registry, County.Sex} = 'Female'
- Edit the "Other (Case Files)" selection statement to read:
{Site and Morphology.Site rec with Kaposi and mesothelioma} = 'Breast'
- Set the Multiple Primary Selection drop-down list to
"First Primary Matching Selection Criteria". Because of the Multiple Primary Selection option used in this exercise, the Malignant Only standard checkbox will have no impact on this analysis. However, we will leave it checked.
- The Exclude All Death Certificate and Autopsy Only option is always on in
Limited-Duration Prevalence sessions. Since SEER*Stat uses the counting method,
these cases are never considered prevalent. This option prevents them from being
included in the survival calculations.
Step 4: Statistic Tab
On the Statistic tab:
- Set the Prevalence Date to January 1, 2005.
- Limit the Prevalence Duration to 5 years.
- Now your analysis is set up to calculate the number of individuals who were alive on January 1, 2005,
and who were diagnosed with cancer during the five years prior to that date.
- Set the Statistic to Crude Percent (counts will be included in the results).
- This option specifies that your results matrix will include the number of prevalent cases on this date (the count),
an estimate of the total population on that date, and the proportion of the population that the count represents.
- Mark the checkbox to Display By Time Prior to Prevalence Date.
- Set the Cut-Points to "Number of Years Prior", and enter "1,2,3,4" in the field.
- Select Discrete (Non-Overlapping) Intervals.
- The preceding options set up your analysis to include the five one-year discrete time intervals mentioned
in the problem statement. This will allow you to see how many of the prevalent cases on January 1, 2005 had been
diagnosed within each of the five one-year intervals prior to that date (<=1, >1-2, >2-3, >3-4, and >4-5 years).
- Mark the checkbox to Show Standard Errors and Confidence Intervals, and set the
Confidence Limits to "95%".
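To make the effect of the cut-points concrete, here is a minimal Python sketch of how cut-points "1,2,3,4" with a 5-year duration partition time since diagnosis into the five discrete intervals from the problem statement. The function name and sample values are hypothetical; this mirrors the interval definitions in this exercise rather than SEER*Stat's internal code.

```python
from bisect import bisect_left
from collections import Counter


def interval_label(years_prior, cut_points=(1, 2, 3, 4), duration=5):
    """Map time since diagnosis (in years) to a discrete, non-overlapping interval."""
    if not 0 <= years_prior <= duration:
        return None  # outside the limited-duration window
    edges = list(cut_points) + [duration]
    # Index of the first edge that is >= years_prior, so each upper bound is inclusive.
    idx = bisect_left(edges, years_prior)
    if idx == 0:
        return f"<={edges[0]}"
    return f">{edges[idx - 1]}-{edges[idx]}"


# Hypothetical times since diagnosis for six prevalent cases.
sample_years = [0.5, 1.0, 1.2, 3.0, 4.9, 5.0]
counts = Counter(interval_label(y) for y in sample_years)
print(counts)
```

Because the intervals are discrete, each case contributes to exactly one row, and the row counts sum to the 5-year total.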
Step 5: Table Tab
No variables are needed in the page, row, or column on the Table tab. The time intervals will be shown in the table rows.
Step 6: Survival Cohorts Tab
Create the following two new user-defined variables and add them as Cohort Variables on the Survival Cohorts tab:
- Age recode (<60, 60-69, 70+)
- Race recode (W, B, AI, API, Oth/Unspec)
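In SEER*Stat these variables are created through the program's own dialogs, not in code. Purely to illustrate the grouping that the age variable expresses, a minimal Python sketch (with a hypothetical function name) might look like this:

```python
def age_recode(age_at_diagnosis):
    """Collapse age in years into the three groups used as a survival cohort variable."""
    if age_at_diagnosis < 60:
        return "<60"
    if age_at_diagnosis < 70:
        return "60-69"
    return "70+"


# Boundary ages chosen to show where each group starts and ends.
groups = [age_recode(a) for a in (45, 59, 60, 69, 70, 88)]
print(groups)
```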
Step 7: Output Tab
On the Output tab:
- Enter a title for your matrix, such as:
SEER 17, 5-year Limited Duration Prevalence Estimates
Malignant Breast Cancers, Females Only
First Malignant Breast Cancer Diagnosed Within the Previous 5 Years
By Time Since Diagnosis
- Mark the Survival Summary Used for Lost Cases and
Detailed Survival Used for Lost Cases checkboxes.
These will add tables to your matrix that display the survival statistics used
to adjust for cases lost to follow-up prior to the prevalence date.
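One way to picture the adjustment for lost cases is sketched below. It assumes that each case lost to follow-up is credited with the conditional probability of surviving from the time of last contact to the prevalence date, looked up within that case's survival cohort. That assumption, the field names, and the survival values are illustrative only and are not taken from this exercise or from SEER*Stat's documentation.

```python
# Hypothetical cumulative survival by cohort (age group, race) and years since diagnosis.
SURVIVAL = {
    ("<60", "W"): {0: 1.00, 1: 0.90, 2: 0.81, 3: 0.75},
    ("70+", "W"): {0: 1.00, 1: 0.80, 2: 0.64, 3: 0.50},
}


def expected_alive(lost_cases, survival=SURVIVAL):
    """Sum the conditional probabilities that lost cases are alive at the prevalence date."""
    total = 0.0
    for case in lost_cases:
        table = survival[case["cohort"]]
        s_at_loss = table[case["years_followed"]]
        s_at_prevalence = table[case["years_to_prevalence"]]
        # P(alive at prevalence date | alive when last seen)
        total += s_at_prevalence / s_at_loss
    return total


# Two made-up cases, each lost one year after diagnosis.
lost = [
    {"cohort": ("<60", "W"), "years_followed": 1, "years_to_prevalence": 2},
    {"cohort": ("70+", "W"), "years_followed": 1, "years_to_prevalence": 3},
]
print(expected_alive(lost))
```

Under these assumptions, the two lost cases contribute fractional amounts (0.81/0.90 and 0.50/0.80) to the prevalence count instead of being counted as fully alive or dropped.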
Step 8: Execute the Session
- Use the Execute button on the toolbar, or select Execute from the Session menu, to execute
the session. You will see two warnings:
- The first one will warn you that your session's Year variable might not be represented in the
survival cohort table, and that the results may therefore be misleading.
In this case, the Year variable is indeed not represented. Since we are only using five years of data, we
have intentionally chosen not to calculate survival by year.
Click OK to continue.
- The second warning is displayed because you chose the
"First Primary Matching Selection Criteria" option on the Selection tab. It reminds you that, in a
Limited-Duration Prevalence session, the "selection criteria" includes not only the statements on the
Selection tab, but also the Prevalence Date and Prevalence Duration
that you defined on the Statistic tab. Click OK to continue.
- Note that this second warning has a checkbox labeled
Do not show this message in future. You can mark this checkbox before clicking
OK to prevent this warning from being displayed when you execute future sessions.
If you have done so in the past, it will not be displayed now.
- A dialog will display the progress of the job. When the job completes, a new window will open
containing the output table or matrix.
Step 9: Check Your Results
Compare your results to this SEER*Stat matrix file: key.prev2a.spm.
In this key, additional statistics are shown (you can use Matrix Options to display these statistics).