In SEER*Stat, a variable is a dictionary entry that contains format groupings assigned to data values (the values of data fields in the tumor record). Variables provide a mechanism for grouping individual data values into one statistic; for example, "Ages 65+" is a grouping commonly used for analyses by age. A unique dictionary of variables is associated with each SEER*Stat database. The following terms are used for the four types of SEER*Stat variables:

  • Standard Variable - Standard variables are the variables distributed in a database; they cannot be modified or deleted in SEER*Stat. They are created when the SEER*Stat database is built by processing the text data via SEER*Prep. A default set of format groupings is defined for most standard variables. For example, the year of diagnosis variables in the SEER 9 databases are distributed with one grouping for all years combined (1975-2018) and separate groupings for each individual year. Some variables, like histologic type, do not have any default formats, just a range of unlabeled values.
  • User-defined Variable - A user-defined variable is a variable that you define based on a single existing variable, typically a Standard Variable. For example, the "1975-2018" grouping for year of diagnosis would be inappropriate if you selected only 2000-2018 for your analyses. You would need to create a user-defined version based on year of diagnosis with separate groupings for the individual years (2000, 2001,...,2018) and, if desired, a single grouping for "2000-2018" combined.
  • Merged Variable - A Merged Variable is, in essence, a type of user-defined variable whose format groupings can be based on 2 or more variables. For example, it may be used to create groupings that show cancer in specific age-sex combinations: "Women <50", "Women 50+", "Men <65", "Men 65+".
  • Calculated Variable - A calculated variable is a variable that is not coded in the database; that is, the field is not on the tumor record. SEER*Stat determines its values based on the values of other variables and system specifications. For example, Age at Prevalence Date is a calculated variable used in Limited-Duration Prevalence sessions. Its values are determined from the prevalence date selected for the analysis and either the date of birth or the date of diagnosis and age at diagnosis. User-defined variables based on calculated variables are displayed in a separate folder labeled “Calculated.”
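
The idea of format groupings can be illustrated outside of SEER*Stat with a short Python sketch. This is not SEER*Stat code or its file format; the record fields (year_dx, sex, age), the Grouping class, and the labels_for function are hypothetical names chosen only for illustration. The sketch models a user-defined variable (year of diagnosis restricted to 2000-2018) and a merged variable (the age-sex combinations listed above) as lists of labeled groupings, and shows that one data value can fall into more than one grouping.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# A hypothetical tumor record: field name -> value (not the SEER record layout).
Record = Dict[str, Any]


@dataclass(frozen=True)
class Grouping:
    """One labeled format grouping: a label plus a rule selecting data values."""
    label: str
    matches: Callable[[Record], bool]


def labels_for(record: Record, variable: List[Grouping]) -> List[str]:
    """Return the label of every grouping in the variable that the record falls into."""
    return [g.label for g in variable if g.matches(record)]


# User-defined variable based on ONE field: a combined "2000-2018" grouping
# followed by one grouping per individual year.
year_dx_2000_2018: List[Grouping] = [
    Grouping("2000-2018", lambda r: 2000 <= r["year_dx"] <= 2018)
]
year_dx_2000_2018 += [
    # y=y binds the current year so each lambda keeps its own value.
    Grouping(str(y), lambda r, y=y: r["year_dx"] == y)
    for y in range(2000, 2019)
]

# Merged variable based on TWO fields (sex and age).
age_sex: List[Grouping] = [
    Grouping("Women <50", lambda r: r["sex"] == "F" and r["age"] < 50),
    Grouping("Women 50+", lambda r: r["sex"] == "F" and r["age"] >= 50),
    Grouping("Men <65", lambda r: r["sex"] == "M" and r["age"] < 65),
    Grouping("Men 65+", lambda r: r["sex"] == "M" and r["age"] >= 65),
]

if __name__ == "__main__":
    record = {"year_dx": 2005, "sex": "F", "age": 47}
    print(labels_for(record, year_dx_2000_2018))
    print(labels_for(record, age_sex))
```

In this sketch a record diagnosed in 2005 matches both the combined grouping and its individual-year grouping, while a record diagnosed outside 2000-2018 matches none, which mirrors why the default "1975-2018" grouping does not suit an analysis limited to 2000-2018.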

Using SEER*Stat Variables

The following tutorials will help you learn to use the different variable types in SEER*Stat sessions.

Standard Variables:
  • Frequency Exercise 1a
  • Rate Exercise 1a
User-defined Variables:
  • Frequency Exercise 1b
  • Rate Exercise 1b
Merged Variables:
  • Rate Exercise 3
Calculated Variables:
  • Prevalence Exercise 1