Create lists of tumors, not lists of people, and explore formats of variable values in the output.
Case listing sessions are used to create tables showing the actual values stored in the database. They provide a mechanism for displaying individual cancer records and patient histories.
Exercise
Create a table showing post-menopausal (age 50+) female malignant breast cancer cases diagnosed in Iowa during the 2017-2021 time period. Include patient ID, sequence number, some demographic fields, cancer type fields, and several breast cancer specific fields.
You must have access to the Research Plus data to complete this exercise as written for cancer cases in the Iowa registry. Refer to Comparison of SEER Data Products for more information. If you do not have access to Research Plus data, you can still complete the exercise by using the alternative case listing file which does not limit cases based on registry.
Key Points
- This exercise illustrates the use of SEER*Stat to view variables for individual cases, that is, a list of tumors, not a list of people. There may be multiple tumors listed per person.
- In the results matrix, you will be able to show the variables in a label format or switch to numeric codes (unformatted). If exporting the data for analysis in other software, you may want to use unformatted variables.
- Starting with the 1975-2017 SEER Data, there are two data products available: SEER Research and SEER Research Plus. The Research Plus databases provide access to additional variables, such as geographic region, and require a more rigorous authorization process. Refer to Comparison of SEER Data Products for more information.
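The tumor-versus-person distinction in the first key point can be sketched in a few lines of Python. This is only an illustration: the rows, patient IDs, and sequence labels below are invented and are not SEER data, and the field names are hypothetical.

```python
from collections import Counter

# Hypothetical rows standing in for a case listing: one row per tumor.
# IDs and sequence labels are invented for illustration only.
rows = [
    {"patient_id": "P0001", "sequence": "One primary only"},
    {"patient_id": "P0002", "sequence": "1st of 2 or more primaries"},
    {"patient_id": "P0002", "sequence": "2nd of 2 or more primaries"},
    {"patient_id": "P0003", "sequence": "One primary only"},
]

# Count how many listed tumors belong to each patient.
tumors_per_patient = Counter(row["patient_id"] for row in rows)

n_tumors = len(rows)                 # rows in the listing
n_people = len(tumors_per_patient)   # distinct patients
multiple = sorted(pid for pid, n in tumors_per_patient.items() if n > 1)

print(f"{n_tumors} tumors, {n_people} people, multiple tumors: {multiple}")
```

The row count and the patient count differ whenever a patient has more than one tumor that meets the selection criteria, which is why patient ID and sequence number are both included in the table.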
Instructions
Step 1: Create a New Case Listing Session
- Start SEER*Stat.
- Select the Case Listing button from the New Session menu toolbar.
Step 2: Select a Database
It is extremely important that you select the database as the first step. The correct database must be selected in order to see the correct list of variables in selection statements, table statements, and the dictionary editor. The exercise instructs us to create a list of cases diagnosed in Iowa during the 2017-2021 time period. Iowa cases for this time period are included in all SEER databases.
- On the Select Database dialog, select "Incidence - SEER Research Plus Data, 17 Registries, Nov 2023 Sub (2000-2021)" and press the OK button. (Use the "Incidence - SEER Research Data, 17 Registries, Nov 2023 Sub (2000-2021)" database if you do not have access to the Research Plus data.)
- Press the OK button on the Linked Database Change warning, if it opens.
See SEER Registries for a description of each registry. In the Research Plus databases, you may check the registries contained in a database by viewing the "SEER registry" variable in the dictionary editor. This variable is not available in the Research databases. To view the registries for a database, follow these steps:
- Select the database from the Select Database dialog which opens from the Data sidebar menu Change Selected Database button.
- Open the dictionary by selecting the Dictionary button from the Actions menu.
- Use the "+" to expand the "Race, Sex, Year Dx, Registry, County" variable category.
- Select "SEER registry".
- Double-click "SEER registry" or use the Create button to edit this variable.
- The values of this variable will vary from one database to another. Databases distributed with the SEER*Stat software include "SEER 8", "SEER 17", etc., in the database name to indicate the set of registries included (see Registry Groupings in SEER Data and Statistics).
- Use the Cancel button to exit the Edit Variable dialog and then close the dictionary.
Step 3: Create the Selection Statement
Selection statements reduce the number of records based on specific variables. If no selection statements are made, all records in the database will be included in the table. Consider each element of the exercise statement:
- post-menopausal (age 50+) - requires a selection based on age at diagnosis;
- female - requires a selection based on sex;
- malignant - requires a selection based on behavior;
- breast cancer cases - requires a selection based on cancer site;
- diagnosed in Iowa - requires a selection based on SEER registry;
- during the 2017-2021 time period - requires a selection based on year of diagnosis.
The remainder of the exercise statement asks for patient ID, sequence number, demographic fields, cancer type fields, and breast cancer specific fields in the table. We will specify that these variables are to be shown for each record in the Table options.
- Click Selection from the sidebar menu.
- Because it is a very common selection, a checkbox is provided in the Select Only box to include only malignant behavior cases in your analysis. Make sure that the Malignant Behavior option is checked for this problem. The Known Age option in the Select Only box is irrelevant for this session because we will exclude cases based on age in the selection statement.
- Use the Edit button to open the Case Selection dialog and press the New Line button. The Case Selection Line dialog opens.
- In the Variable box, use the "+" to expand the "Age at Diagnosis" category and select "Age recode with <1 year olds".
- Moving to the center of the window, check to see that "is = to" is selected as the Operator.
- Scroll through the items in the Values box until you find "50-54 years". Select all age groups from "50-54 years" through "85+ years".
- At this time, the following should appear in the Selection Statement box at the bottom of the window:
{Age at Diagnosis.Age recode with <1 year olds} = '50-54 years','55-59 years','60-64 years','65-69 years','70-74 years','75-79 years','80-84 years','85+ years'
- Press the OK button. The statement line is listed in the Case Selection dialog.
- Make sure the "And" logical operator is selected and press the New Line button to add a second line to the selection statement. The Case Selection Line dialog opens.
- In the Variable box, use the "+" to expand the "Race, Sex, Year Dx, Registry, County" category and select "Sex".
- Make sure the Operator is "is = to" and select the "Female" value to create the {Race, Sex, Year Dx, Registry, County.Sex} = ' Female' Selection Line Statement. Press the OK button.
- Add each of the following statements in turn. The syntax shown below is {Variable category.Variable name} = 'Variable Value'. (Exclude the final statement line if you are not using the Research Plus database.)
- And {Race, Sex, Year Dx, Registry, County.Year of diagnosis} = '2017','2018','2019','2020','2021'
- And {Site and Morphology.Site recode ICD-O-3/WHO 2008} = ' Breast'
- And {Race, Sex, Year Dx, Registry, County.SEER registry (with CA and GA as whole states)} = 'Iowa'
- Select the OK button to close the Case Selection window.
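The logic of the finished selection can be sketched as a simple record filter. This is a rough Python illustration, not how SEER*Stat works internally: the field names and sample records are hypothetical. It assumes that lines joined with "And" must all be true and that a line listing several values matches any one of them.

```python
# Values copied from the selection lines in this step.
AGE_50_PLUS = {
    "50-54 years", "55-59 years", "60-64 years", "65-69 years",
    "70-74 years", "75-79 years", "80-84 years", "85+ years",
}
DX_YEARS = {"2017", "2018", "2019", "2020", "2021"}


def is_selected(rec):
    """Return True when a record passes every line of the selection."""
    return (
        rec["behavior"] == "Malignant"        # Malignant Behavior checkbox
        and rec["age_group"] in AGE_50_PLUS   # any listed age group matches
        and rec["sex"] == "Female"
        and rec["year_dx"] in DX_YEARS        # any listed year matches
        and rec["site"] == "Breast"
        and rec["registry"] == "Iowa"
    )


# Hypothetical records: the first passes, each of the others fails one line.
base = {
    "behavior": "Malignant", "age_group": "60-64 years", "sex": "Female",
    "year_dx": "2019", "site": "Breast", "registry": "Iowa",
}
records = [
    dict(base, id=1),
    dict(base, id=2, age_group="45-49 years"),  # under 50
    dict(base, id=3, year_dx="2016"),           # outside 2017-2021
    dict(base, id=4, registry="Utah"),          # not Iowa
]

selected_ids = [r["id"] for r in records if is_selected(r)]
print(selected_ids)
```

If you are using the Research database, the registry condition would be dropped, which is why the alternative session does not limit cases by registry.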
Some geographic variables are available in the Research Plus databases but are not available for use in the case listing sessions. To select cases in Iowa, you used the "SEER registry (with CA and GA as whole states)" variable. Other geographic variables in the Race, Sex, Year Dx, Registry, County category cannot be used in case listing sessions. If you try to execute a session using one of these variables, you will get a SEER*Stat warning. Refer to Comparison of SEER Data Products for more information.
Step 4: Set Table Variables
The Selection and Table options are often confused. Use the Table options to choose the variables that appear in the output table, and use the Selection options to reduce the number of records based on specific variables. This exercise specifies that the table should include several variables: patient ID, age, race, year of diagnosis, etc. Age is used in both the Selection and Table options. The Selection options are used to select ages 50+; the Table options are used to specify that the value of age for each record is to be included in the results matrix.
- Select Table from the sidebar menu.
- The variables are listed in categories in the Available Variables box at the bottom of the screen.
- Use the "+" to expand a category, select a variable, and press the Column button to add each of the following variables as a column Display Variable. Note that the category is listed to the left of the period. The variable name is listed to the right of the period. This syntax is used in SEER*Stat reports that show variable names.
- Other.Patient ID
- Multiple Primary Fields.Sequence number
- Race and Age (case data only).Age recode with single ages and 90+
- Race, Sex, Year Dx, Registry, County.Race recode (W, B, AI, API)
- Race, Sex, Year Dx, Registry, County.Origin recode NHIA (Hispanic, Non-Hisp)
- Race, Sex, Year Dx, Registry, County.Race and origin recode (NHW, NHB, NHAIAN, NHAPI, Hispanic)
- Race, Sex, Year Dx, Registry, County.Year of diagnosis
- Site and Morphology.Primary Site - labeled
- Site and Morphology.Histologic Type ICD-O-3
- Site and Morphology.ICD-O-3 Hist/behav, malignant
- Site and Morphology.Laterality
- Extent of Disease.Breast Subtype (2010+)
- Extent of Disease.ER Status Recode Breast Cancer (1990+)
- Extent of Disease.PR Status Recode Breast Cancer (1990+)
- Extent of Disease.Derived HER2 Recode (2010+)
- Use the Find... button if you do not know the name of a variable or you do not know its category. You can search the variable list by format/grouping label (e.g., "malignant") or variable name (e.g., "histologic"). Type at least three characters in the Search Text box and any results containing that text will appear as you type.
- In the Research Plus data there is an age field with single ages through 99 and 100+, but this field cannot be included in case listings, so we use the field with 90+ combined.
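The "Category.Variable name" notation and the name search can be mimicked with a short Python sketch. The helper names `split_name` and `find` are hypothetical. The sketch assumes the category itself contains no period, so it splits on the first one, and it searches variable names only, whereas the Find... dialog also searches format and grouping labels.

```python
def split_name(full_name):
    """Split 'Category.Variable name' at the first period."""
    category, _, variable = full_name.partition(".")
    return category, variable


def find(names, text):
    """Case-insensitive search of variable names; needs 3+ characters."""
    if len(text) < 3:
        return []
    needle = text.lower()
    return [n for n in names if needle in split_name(n)[1].lower()]


# A few of the display variables listed in this step.
names = [
    "Other.Patient ID",
    "Site and Morphology.Histologic Type ICD-O-3",
    "Site and Morphology.Laterality",
    "Extent of Disease.Breast Subtype (2010+)",
]

print(split_name(names[1]))
print(find(names, "hist"))
```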
Step 5: Specify a Title
- Select Output from the sidebar menu.
- Enter the following title:
Female Malignant Breast Cancer Cases in Iowa
2017-2021 Diagnoses, Ages 50+
Case Listing Exercise 1a
Step 6: Execute SEER*Stat and Save the Matrix
- Review each option set and verify that the settings are correct.
- Execute the session.
- Press the OK button on any variable warnings, if they open. A new window will open containing the output table or matrix.
- Use the Save As command on the File menu to save the matrix. Enter "Case Listing Exercise 1a" as the filename. SEER*Stat will assign the "slm" extension to indicate that this is a "SEER*Stat Case Listing Matrix" file.
Step 7: Change the Variable Display Format
If exporting the data for analysis in other software, you may want to use unformatted variable codes rather than the default label format.
- Right-click on a column header in the matrix results. A menu of column options opens.
- Select Display As and then Unformatted to view the column values as numeric codes.
Results shown in the SEER*Stat matrix window cannot be edited. You can print the matrix, export the results to a text file, and copy-and-paste data into other applications. See Results Matrix for more information about the SEER*Stat matrix and its features.
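The difference between the label format and the unformatted display can be pictured as a lookup from stored codes to labels. The following Python sketch uses an invented code table. The actual codes and labels for each variable come from the database dictionary and may differ.

```python
# Illustrative code-to-label table (hypothetical, not taken from SEER).
LATERALITY_LABELS = {1: "Right", 2: "Left"}


def render(code, formatted=True):
    """Show a stored code as its label, or as the raw code when unformatted."""
    if not formatted:
        return str(code)
    return LATERALITY_LABELS.get(code, f"(no label for {code})")


column = [1, 2, 2]  # stored values for three hypothetical records

as_labels = [render(c) for c in column]
as_codes = [render(c, formatted=False) for c in column]

print(as_labels)
print(as_codes)
```

Codes are usually easier to work with in other software because they are short and do not depend on label wording.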
Step 8: Check the Results
- Compare your results by executing this SEER*Stat session file: Case Listing Exercise 1a Session. (Use Case Listing Exercise 1a Session ALT if you do not have access to the Research Plus data.) Case listing examples are provided as session files instead of results matrices to ensure they may only be opened by those who have signed a SEER data use agreement and have permission to view individual records.