
Frequency Exercise 2: Cancer Site and Customizing Results


This exercise introduces three SEER variables related to cancer type and shows how to customize the results matrix.

The site recode variables are derived from the "Primary site" and "Histologic type ICD-O-3" variables after the registries submit the data to SEER (see Site Recode). These variables are added to the SEER data as a convenience and are used in most SEER publications to define the cancer of interest.

For example, each cancer site available in SEER*Explorer corresponds to a value of a site recode variable.

Exercise

Create a table showing frequencies by primary site for cases with Site recode ICD-O-3/WHO 2008 = lung and bronchus. Include only malignant cases diagnosed in the SEER 22 registries from 2000 through 2021 and exclude cases with unknown age.

Key Points

  • This exercise introduces three variables related to cancer type in the SEER databases (primary site, site recode, histology).
  • In this exercise, you will calculate frequencies for cases diagnosed between 2000 and 2021, the only years for which all 22 registries have cases in the database.
  • You will set primary site as a display variable to create a table showing frequencies for the variable's individual values.
  • The matrix Actions menu provides a variety of options for modifying the layout of the output table. This exercise uses the Hide Zero Count Rows option to suppress the display of primary site codes not relevant to lung and bronchus cancer.

Instructions

Step 1:  Create a New Frequency Session and Select a Database

  1. Start SEER*Stat.
  2. Select the Frequency button from the New Session menu.
  3. On the Select Database dialog, select "Incidence - SEER Research Limited-Field Data, 22 Registries, Nov 2023 Sub (2000-2021)" and press the OK button.
  4. Press the OK button on the Linked Database Change warning, if it opens.

Databases distributed with SEER*Stat use names designed to describe the data. The various parts of this exercise's database name indicate the following:

  • Incidence - The database contains cancer incidence data.
  • SEER Research Limited-Field Data - This indicates the database type and which variables are included, as described in the data dictionary available on SEER*Stat Database Details.
  • 22 Registries - The database contains data for the "SEER 22" registries as defined in SEER Registry Groupings for Analyses.
  • Nov 2023 Sub - The data was submitted to the SEER program by the registries in November 2023.
  • (2000-2021) - These are the years of diagnosis for the cases included in the database.
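
The naming convention described above can be illustrated with a short Python sketch that splits this exercise's database name into its parts. This is not part of SEER*Stat; the function name `parse_database_name` and the dictionary keys are hypothetical, and the pattern assumes a name laid out exactly like the one used in this exercise.

```python
import re

# Hypothetical helper, not a SEER*Stat feature: it assumes the name follows the
# layout "<data type> - <database type>, <N> Registries, <Mon YYYY> Sub (<first>-<last>)".
NAME_PATTERN = re.compile(
    r"^(?P<data_type>[^-]+?) - "
    r"(?P<database_type>.+?), "
    r"(?P<registries>\d+ Registries), "
    r"(?P<submission>\w+ \d{4} Sub) "
    r"\((?P<first_year>\d{4})-(?P<last_year>\d{4})\)$"
)


def parse_database_name(name: str) -> dict:
    """Split a database name into the parts listed in the bullets above."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized database name: {name!r}")
    parts = match.groupdict()
    # Years of diagnosis are easier to work with as integers.
    parts["first_year"] = int(parts["first_year"])
    parts["last_year"] = int(parts["last_year"])
    return parts


parts = parse_database_name(
    "Incidence - SEER Research Limited-Field Data, 22 Registries, "
    "Nov 2023 Sub (2000-2021)"
)
for key, value in parts.items():
    print(f"{key}: {value}")
```

Names that follow a different layout would raise a `ValueError` rather than being guessed at.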

Step 2:  Choose the Statistics to Display

  1. Select Statistic from the sidebar menu.
  2. In the Percentages box, select Column.

Step 3:  Define the Analysis Cohort

In this exercise, we want the frequency of malignant lung and bronchus cancer cases diagnosed from 2000 through 2021. We do not want to include every cancer site in the database, so we need a selection statement based on site and behavior. It is very common to select only malignant behavior when analyzing cancer data. We also want to exclude cases with unknown age. The selected database contains cases diagnosed in the SEER 22 registries; therefore, selections based on registry are not necessary.

  1. Choose Selection from the sidebar menu.
  2. The Select Only box provides a shortcut for commonly used selections. Make sure that the Malignant Behavior and Known Age options are checked.
  3. Click Edit to open the Case Selection dialog.
  4. Select the New Line button to open the Case Selection Line dialog. Available variables are listed in categories in the Variable box on the top left of the screen.
  5. In the Variable box, use the "+" to expand the "Site and Morphology" category.
  6. Select "Site recode ICD-O-3/WHO 2008".
  7. Moving to the center of the window, check that "is = to" is selected as the Operator.
  8. Scroll through the items in the Values box until you find and select "Lung and Bronchus".
  9. The following should appear in the Selection Statement box at the bottom of the window: {Site and Morphology.Site recode ICD-O-3/WHO 2008} = '    Lung and Bronchus'
  10. Use the OK button to close the Case Selection Line dialog. The one-line selection statement is shown on the Case Selection dialog where additional lines could be added if necessary.
  11. Select the OK button to close the Case Selection dialog.

The exercise asks for a table showing the frequency of primary site for malignant cases with site recode = lung and bronchus. It is possible to select lung and bronchus cases by creating a selection statement that uses primary site and histology. However, it is much easier to use one of the site recode variables if you do not want to include hematopoietic diseases (such as lymphomas and leukemias) in the table.
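
The logic of the Step 3 selection can be sketched in Python as a filter over case records. This is only an illustration of how the three conditions combine; SEER*Stat applies the selection internally. The `Case` class, its field names, and the sample records are all hypothetical and are not SEER data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Case:
    # Hypothetical record layout; field names are illustrative, not SEER*Stat's.
    site_recode: str
    behavior: str
    age: Optional[int]  # None stands for unknown age
    primary_site: str


def in_cohort(case: Case) -> bool:
    """Return True when a case meets all three Step 3 conditions."""
    return (
        case.behavior == "Malignant"                  # Select Only: Malignant Behavior
        and case.age is not None                      # Select Only: Known Age
        and case.site_recode == "Lung and Bronchus"   # the selection statement line
    )


def select_cohort(cases: List[Case]) -> List[Case]:
    """Keep only the cases that belong to the analysis cohort."""
    return [case for case in cases if in_cohort(case)]


# Invented records for illustration only.
toy_cases = [
    Case("Lung and Bronchus", "Malignant", 67, "C34.1"),
    Case("Lung and Bronchus", "Malignant", None, "C34.9"),  # unknown age
    Case("Lung and Bronchus", "In situ", 59, "C34.0"),      # not malignant
    Case("Breast", "Malignant", 52, "C50.9"),               # different site
]
print([case.primary_site for case in select_cohort(toy_cases)])
```

All three conditions are joined with "and", so a case is dropped as soon as it fails any one of them.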

Step 4:  View the Primary Site Variable Groupings

  1. Select Table from the sidebar menu.
  2. Use the "+" to expand the "Site and Morphology" category in the Available Variables box.
  3. Select "Primary Site - labeled".
  4. Open the Dictionary editor to view the groupings for "Primary site". As shown in a previous exercise, the dictionary can be opened by selecting the Dictionary button from the Actions menu or by double-clicking a variable listed in the Available Variables box or the Display Variables box.
  5. If it is not already selected in the Dictionary dialog, select the "Primary Site - labeled" variable from the "Site and Morphology" category.
  6. The Create button will be enabled when a variable is selected. Use the Create button to open the Edit Variable dialog. The values of the "Primary Site - labeled" variable correspond to ICD-O-3 codes. There is one grouping for each value of the primary site variable.
  7. Click the Cancel button to close the Edit Variable dialog without making any changes.
  8. Click the Close button to close the dictionary.

Step 5:  Set the Row Variable

  1. Use the "+" to expand the "Site and Morphology" category in the Available Variables box.
  2. Select "Primary Site - labeled".
  3. Click the Row button. "Primary Site - labeled" should now be listed as a row variable in the Display Variables box.

Step 6:  Specify a Title

  1. Select Output from the sidebar menu.
  2. Enter the following Title:

Primary Site Frequencies for Malignant Lung and Bronchus (Site recode ICD-O-3/WHO 2008)
SEER 22 Registries, 2000-2021
Frequency Exercise 2

Step 7:  Create the Matrix and Hide Extraneous Rows

The large number of rows in the matrix table makes it difficult to review the relevant frequencies. SEER*Stat has an option that suppresses the display of rows that have a value of zero in every column.

  1. Select the Execute button from the Actions menu. A dialog will display the progress of the job. When the job completes, a SEER*Stat matrix window will open containing the results. The results matrix contains one line for each value of the primary site variable.
  2. Select Options from the Actions menu to open the Matrix Options dialog.
  3. In the Options box, check Hide Zero Count Rows.
  4. Click the OK button. The size of the table will be reduced to seven rows: one row for each of the primary site ICD-O-3 codes for lung and bronchus, and one for totals.
  5. Compare your results to this SEER*Stat matrix file: Exercise Matrix 2 Results (sfm, 66.2 KB).

  • The Matrix Options dialog gives you the opportunity to correct typographical errors in the title. Corrections that you make to the title in the matrix options will appear in the matrix and printed output. However, sessions extracted from the matrix will retain the original, unedited title.
  • Learn about the other matrix options.
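
As a rough illustration of what the matrix shows, the Python sketch below builds a one-column frequency table with column percentages and optionally drops rows whose count is zero, which is the effect of Hide Zero Count Rows. It is a minimal sketch, not SEER*Stat's implementation; the function name `frequency_table` and the input values are hypothetical, and the counts are invented rather than taken from SEER data.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

Row = Tuple[str, int, float]  # (label, count, column percent)


def frequency_table(
    values: Iterable[str],
    all_categories: Sequence[str],
    hide_zero_rows: bool = False,
) -> List[Row]:
    """Count each category and express it as a percentage of the column total."""
    counts = Counter(values)
    total = sum(counts[category] for category in all_categories)
    rows: List[Row] = []
    for category in all_categories:
        count = counts[category]
        if hide_zero_rows and count == 0:
            continue  # suppress rows with no cases
        percent = 100.0 * count / total if total else 0.0
        rows.append((category, count, percent))
    # The totals row is always kept, as in the exercise's seven-row result.
    rows.append(("Total", total, 100.0 if total else 0.0))
    return rows


# Invented primary site codes for a handful of cases (illustration only).
primary_sites = ["C34.1", "C34.1", "C34.9", "C34.0"]
# The row variable lists every primary site value, most of which have no cases.
categories = ["C18.0", "C34.0", "C34.1", "C34.9", "C50.9"]

for label, count, percent in frequency_table(primary_sites, categories, hide_zero_rows=True):
    print(f"{label:<6} {count:>3} {percent:6.1f}%")
```

With `hide_zero_rows=False`, the same call returns one row per category plus the totals row, which corresponds to the long matrix produced before the option is checked.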