SEER*Stat Rate Exercise 1b: Avoiding a Common Mistake
Create a matrix showing incidence rates for malignant lung cancer stratified by sex and year of diagnosis. As in Rate Exercise 1a, the rates should be for persons age 65 and older diagnosed from 2010 through 2014 in the SEER 18 Registries. Age-adjust the rates using the 2000 U.S. standard population.
This exercise is the same as Rate Exercise 1a except that the statistics should be shown by year of diagnosis in addition to sex.
- A unique dictionary is associated with each SEER*Stat database. The dictionary contains pre-defined variables (known as "standard variables") which describe the data fields in the database. The standard variables are formatted for common use. For example, the year of diagnosis variable typically includes one grouping for all years combined and a separate grouping for each individual year. That format is appropriate when analyzing all years of data in the database. For this exercise, you will need to create a user-defined version of the year of diagnosis variable.
- This packet contains two sample solutions -- one in which the standard "Year of Diagnosis" variable was used, and one in which a user-defined variable with the correct year groupings was used. Note the mislabeled results in the all years row of the "with.error" matrix.
- SEER*Stat provides a convenient mechanism for calculating statistics similar to those previously calculated. You can use the work that you did in Rate Exercise 1a as a starting point.
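For readers who want to see what "age-adjust" means arithmetically, the direct method weights each age-specific rate by that age group's share of a standard population. The following is a minimal sketch of that arithmetic, not SEER*Stat's implementation. The function name `age_adjusted_rate`, the case counts, the populations, and the weights are all hypothetical values chosen only for illustration; in particular, the weights are not the 2000 U.S. standard population.

```python
def age_adjusted_rate(cases, populations, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    The weights are rescaled to sum to 1 over the age groups supplied.
    """
    total_weight = sum(std_weights)
    rate = 0.0
    for count, pop, weight in zip(cases, populations, std_weights):
        age_specific_rate = count / pop
        rate += age_specific_rate * (weight / total_weight)
    return rate * per


# Hypothetical inputs for three age groups (65-74, 75-84, 85+).
cases = [300, 400, 150]
populations = [100_000, 80_000, 30_000]
std_weights = [50, 35, 15]  # hypothetical; NOT the 2000 U.S. standard

rate = age_adjusted_rate(cases, populations, std_weights)
print(f"{rate:.1f} per 100,000")
```

With these made-up inputs the age-specific rates are 300, 500, and 500 per 100,000, so the weighted result lands between them.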
Step 1: Open Exercise 1a's Matrix and Extract the Session
- Open the file saved in exercise 1a. The filename should be "rate exercise 1a.sim".
- If you did not save the output for exercise 1a you may open our version of Rate Exercise 1a.
- SEER*Stat matrix files include the session information used to generate the table. This information serves as documentation for the results and provides a convenient method for generating similar statistics.
- From the Matrix menu, select Retrieve Session.
- Two windows should now be open. Close the matrix window containing the results calculated in exercise 1a. You should now have one window labeled "Rate Session-x" where x is the number of rate session windows that you have created since starting SEER*Stat.
SEER*Stat is a Multiple Document Interface (MDI) application. MDI applications allow you to work with more than one document at a time. The fact that SEER*Stat is an MDI application means that you can have any number and combination of session and matrix windows open at the same time.
Step 2: Set Table Variables (Table Tab)
- On the Table Tab, move the sex variable from the row to the column dimension. To do this, select "Sex" in the Display Variables box and use the Move Down button to move the variable to the column dimension.
- Add "Year of diagnosis" from the "Race, Sex, Year Dx, Registry, County" category as the row variable.
- The variables you are selecting are the standard database variables and their pre-defined groupings will appear in the output matrix. Refer to Working with Variables in the SEER*Stat help system for more information.
Step 3: Execute SEER*Stat
- Use the Execute button or select Execute from the Session menu to execute the session.
- Examine the output carefully. You may compare your results to this SEER*Stat matrix file: Results Matrix for Exercise 1b (with error).
- Remember that you calculated rates for persons diagnosed from 2010 through 2014. The data for cases diagnosed from "2000-2009" were excluded from the analysis on the Selection Tab.
- The "Year of diagnosis" variable defined in the SEER database includes groupings (labels or formats) for "2000-2014" combined and for the individual years from "2000" through "2014".
- When you use the SEER standard variable "Year of diagnosis" in a table statement, your table will include output for each of these groupings (2000-2014 combined, 2000, 2001, ..., 2014), which is not appropriate for this exercise.
- The rows labeled 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, and 2009 are correct but extraneous. You could use the "Hide Zero Count Rows" feature in Matrix options to suppress the display of these rows. The row labeled "2000-2014", however, is incorrect: because cases diagnosed from 2000 through 2009 were excluded on the Selection Tab, that row actually shows the rates for 2010-2014 only.
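The mislabeling described above can be reproduced outside SEER*Stat with a toy tabulation. This is a hedged sketch only: the case list is invented, counts stand in for rates, and the dictionary of groupings merely imitates how the standard variable pairs labels with years.

```python
# Hypothetical case records: one diagnosis year per case.
case_years = [2003, 2008, 2010, 2011, 2011, 2013, 2014]

# Selection: keep only cases diagnosed 2010 through 2014,
# mirroring the restriction made on the Selection Tab.
selected = [y for y in case_years if 2010 <= y <= 2014]

# Standard-style groupings: one combined label plus one label per year.
groupings = {"2000-2014": set(range(2000, 2015))}
groupings.update({str(y): {y} for y in range(2000, 2015)})

# Tabulate the selected cases under every grouping label.
counts = {
    label: sum(1 for y in selected if y in years)
    for label, years in groupings.items()
}

# The combined row is labeled "2000-2014" but holds only 2010-2014 cases.
print("2000-2014:", counts["2000-2014"], "of", len(case_years), "cases in the data")
# Single-year rows before 2010 are empty: correct, but extraneous.
print("2005:", counts["2005"])
```

The label promises all fifteen years, yet the selection has already removed 2000 through 2009, so the label and the contents disagree.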
Step 4: Return to the Session
- If you did not close the Session window after executing SEER*Stat, the Session will still be open and you may close the matrix for this exercise.
- If you closed the Session window, select Retrieve Session from the Matrix menu.
Step 5: Open the Data Dictionary
For this exercise, you need to create a user-defined variable based on the "Year of diagnosis" standard variable.
- There are several ways to open the SEER*Stat dictionary editor. Open the dictionary now by:
- selecting Dictionary from the File menu;
- using the Dictionary button on the toolbar; or
- double-clicking on the "Year of diagnosis" variable listed in the Available Variables box or the Display Variables box on the Table Tab.
- The method used to open the data dictionary is strictly a matter of personal preference. When the data dictionary is opened by double-clicking a variable, that variable is highlighted in the Dictionary window. That can save one step if you are creating a user-defined variable based on the selected variable.
Step 6: Edit the "Year of diagnosis" Variable
- The Dictionary window should now be open.
- If it is not already selected, select the "Year of diagnosis" variable from the "Race, Sex, Year Dx, Registry, County" category.
- Use the Create button to open the Edit Variable window. You will be creating a new variable by editing "Year of diagnosis" and saving the revised variable with a new name.
These are the main features and controls of the Edit Variable window:
- Name - Every variable in the dictionary must have a unique name.
- Groupings - A grouping is a group of values with an associated label. When you click on a label in the Groupings box, the values associated with the label will be displayed in the Values box. Groupings are essentially format statements that allow you to label individual values or groups of values. Throughout these exercises you will be adding and deleting groupings to create tables.
- Values - All values occurring in the database for the variable are listed. The values for most variables will be listed with descriptive labels. The list of values cannot be changed; it is determined when the database is created. Select an individual value or a set of values when creating a grouping. The groupings allow you to define the format of the variable in the SEER*Stat matrix output.
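One way to picture the relationship between Name, Groupings, and Values is as a small data structure. The sketch below is an illustration only and says nothing about how SEER*Stat stores its dictionary; the class `Variable` and its method `add_grouping` are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str                 # must be unique within the dictionary
    values: tuple             # fixed when the database is created
    groupings: dict = field(default_factory=dict)  # label -> set of values

    def add_grouping(self, label, members):
        """Attach a label to a set of values drawn from the fixed value list."""
        members = set(members)
        unknown = members - set(self.values)
        if unknown:
            raise ValueError(f"values not in the database: {sorted(unknown)}")
        self.groupings[label] = members


# A stand-in for the standard year variable: one combined grouping
# plus one grouping per individual year.
year = Variable("Year of diagnosis", tuple(range(2000, 2015)))
year.add_grouping("2000-2014", range(2000, 2015))
for y in year.values:
    year.add_grouping(str(y), [y])

print(len(year.groupings), "groupings over", len(year.values), "values")
```

The point of the design is that groupings can be added, deleted, or relabeled freely, while the underlying value list stays fixed.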
Step 7: Create a New Year of Diagnosis Variable
- Edit the Name field and give the variable this name: "Year of diagnosis (2010-2014)".
- You will develop your own naming conventions as you become more experienced with SEER*Stat. You will find that some variables are generic and can be used for a variety of sessions. By using meaningful variable names you will easily be able to identify the variables in your data dictionary.
- Select the "2000" grouping. Hold down the Shift key and select the "2009" grouping. 2000 through 2009 will be selected.
- Use the Delete button below the Groupings box to delete the selected groupings.
- Select the "2000-2014" grouping. Notice that the values associated with this grouping are highlighted in the Values box.
- Click on the "2010" value in the Values box. Hold down the left mouse button and drag through "2014" to select the years 2010-2014. Click the Update button.
- Use the Rename button at the bottom of the Groupings box to rename this grouping "2010-2014".
- Click OK.
- You will notice that a new category, "User-Defined", has been added to the dictionary if it was not already there from a previous exercise. Click Close to close the dictionary.
- Notice that when you changed the values associated with the "2000-2014" grouping, the grouping name did not change. You have to give a new name to the grouping. It is very important to give a meaningful name to your groupings as they are the labels used in your output matrix.
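The edits in this step can be sketched on a plain mapping from grouping labels to years. This is an analogy under stated assumptions, not SEER*Stat's internal behavior: the dictionaries `standard` and `user_defined` are hypothetical stand-ins for the standard variable and the new user-defined variable.

```python
# Stand-in for the standard variable: label -> list of years.
standard = {"2000-2014": list(range(2000, 2015))}
standard.update({str(y): [y] for y in range(2000, 2015)})

# Work on a copy so the standard variable is left untouched.
user_defined = dict(standard)

# Delete the single-year groupings 2000 through 2009.
for y in range(2000, 2010):
    del user_defined[str(y)]

# Update: change the values behind the combined grouping.
# The label does NOT change on its own at this point.
user_defined["2000-2014"] = list(range(2010, 2015))
stale_label_years = user_defined["2000-2014"]

# Rename: give the grouping a label that matches its values,
# keeping the original row order.
user_defined = {
    ("2010-2014" if label == "2000-2014" else label): years
    for label, years in user_defined.items()
}

print(list(user_defined))
```

Between the Update and Rename steps the grouping still carries the old "2000-2014" label while containing only 2010 through 2014, which is exactly the mismatch the rename fixes.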
Step 8: Replace the Row Variable
- In the Display Variables box on the Table Tab remove "Year of diagnosis" from the row. To do this, select the variable and use either your Delete key or the Remove button on the right side of the screen.
- Use the "+" to expand the "User-Defined" category.
- Select the new year of diagnosis variable that you created in the last step and click the Row button.
- At this time, your newly created year of diagnosis variable should be listed as a row variable in the Display Variables box at the top of the window.
Step 9: Execute SEER*Stat and View the Results
- Use the Execute button or select Execute from the Session menu to execute the session.
- Notice that the years 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, and 2009 have been removed from the output. Also, the row for the combined years is now correctly labeled "2010-2014".
- Use the Save As command on the File menu to save the matrix for use in Rate Exercise 2. Enter "Rate Exercise 1b" as the filename. SEER*Stat will assign the "sim" extension to indicate that this is a "SEER*Stat Rate Matrix" file.
- Compare your results to this SEER*Stat matrix file: Exercise 1b Results Matrix.