How can I create a database that I can use to calculate age-adjusted rates?

The steps for creating a database are described in How to use SEER*Prep.

In the derived variable Cause of Death Recode, how are the site groupings defined?

SEER*Prep uses the 1969+ definitions presented with the SEER Cause of Death Recode.

Can I use ICD causes of death coded at the 3-digit level when creating a mortality database?

ICD causes of death coded at the 3-digit level can be used for both ICD-9 and ICD-10 codes. SEER*Prep uses these to create a recode (SEER Cause of Death Recode [3-digit]).

Is there a way to move databases created by SEER*Prep?

You can move databases created by SEER*Prep. However, it is more efficient and less error-prone to create the databases on a shared network location from the start. SEER*Stat users can then create a local Data Location that points to the shared path.

To move a database, you must first check to see where you created the database. SEER*Prep creates databases in the user-defined data directory. Open the File menu and select Preferences to determine that path.

Exit SEER*Prep. Use Windows Explorer to copy the entire user-defined data directory to another location. You must move the entire directory and all of its contents, but you will be able to remove unwanted databases from the new location later.
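If you prefer a script to Windows Explorer, the copy step can be sketched in Python as follows. This is only an illustration: the function name and the example paths are hypothetical, and the sketch assumes SEER*Prep has already been closed and that the destination directory does not exist yet.

```python
import shutil
from pathlib import Path


def copy_data_directory(src, dst):
    """Copy the entire user-defined data directory tree from src to dst."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    # copytree raises FileExistsError if dst already exists, which guards
    # against merging into a directory that already holds other databases.
    shutil.copytree(src, dst)
    return dst


# Hypothetical usage; take the real source path from File > Preferences:
# copy_data_directory(r"C:\SEERPrepData", r"\\server\share\SEERPrepData")
```

Copying the whole tree, rather than selected files, matches the requirement above that the directory and all of its contents be moved together.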

Each person wishing to use the database will need to have its location set up as a local Data Location in SEER*Stat. Instruct them to select Preferences from SEER*Stat's File menu, then click Add Local to create a new Data Location with that path. The databases will appear when a new session is created or opened in SEER*Stat.

Use the Database Management feature to delete any databases that you do not want in the new location. Proceed as follows to make sure you delete from the new location, not the old. First, start SEER*Prep. Next, set the Directory to Save User-Defined Databases to the new location. Finally, use the Database Management command on the File menu to delete the unwanted databases.

What is a .txd.gz file?

A .txd.gz file is a compressed text file created by the Gzip compression utility.

In SEER*Prep, input data must be stored in either text or compressed text files. If the input file is a text file then it must be named with a .txd extension. Use a compressed format if you wish to reduce the disk space required to store your input data. Gzip, a free utility, creates files using the only compression format supported by SEER*Prep. SEER*Prep requires gzipped data files to have a .txd.gz extension.
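If the gzip command-line utility is not installed, Python's standard gzip module writes the same compression format. The sketch below is one possible way to produce a .txd.gz file from a .txd file; the function name and the example file name are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gzip_txd(src):
    """Compress a .txd file to a sibling .txd.gz file and return its path."""
    src = Path(src)
    if src.suffix.lower() != ".txd":
        raise ValueError(f"expected a .txd file, got {src.name}")
    # Append .gz to the full name so the result ends in .txd.gz.
    dst = src.with_name(src.name + ".gz")
    # Copy bytes unchanged so record lengths and line terminators survive.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


# Hypothetical usage:
# gzip_txd("cases.txd")  # writes cases.txd.gz next to the original
```

The file is read and written in binary mode so that the records are stored exactly as they appear in the original text file.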

What does this SEER*Prep error message mean?
"ERROR: Line termination character not found in the first record of <filename> with record length <value>."

Most likely, your data file does not contain fixed-length records. Input File Format provides a list of the record lengths required for the supported file formats.

If your data files are not fixed-length or the records are not the correct length, you must modify the files before using SEER*Prep. See if your data management software has an option to output fixed-length records or use one of the "fix length" programs listed in SEER*Prep Utilities. All records in an input file must be the same length.
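Before reaching for a "fix length" program, it can help to confirm whether the records really differ in length. The Python sketch below is not one of the SEER*Prep utilities; the function names are hypothetical. It counts how many records have each length, and optionally pads short records with trailing spaces. It assumes records end in a line feed, optionally preceded by a carriage return, and that padding with spaces is acceptable for your data. The CRLF terminator written by default is also an assumption.

```python
from collections import Counter


def record_lengths(path):
    """Return a Counter mapping record length (without terminator) to count."""
    counts = Counter()
    with open(path, "rb") as f:
        for line in f:
            counts[len(line.rstrip(b"\r\n"))] += 1
    return counts


def pad_records(src, dst, length, terminator=b"\r\n"):
    """Write src to dst with every record space-padded to `length` bytes."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for number, line in enumerate(fin, start=1):
            record = line.rstrip(b"\r\n")
            if len(record) > length:
                # Truncating could silently drop data, so stop instead.
                raise ValueError(
                    f"record {number} is {len(record)} bytes, longer than {length}"
                )
            fout.write(record.ljust(length, b" ") + terminator)


# Hypothetical usage:
# print(record_lengths("cases.txd"))  # more than one key means mixed lengths
```

If record_lengths reports a single length that matches the value listed in Input File Format, the file is already fixed-length and the error has another cause.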

I have regions defined as groups of counties. How can I use SEER*Prep to create a geographic variable using groups of counties rather than individual counties?

No changes are needed when running SEER*Prep; this can be handled in SEER*Stat. Use SEER*Prep to create the database using county as a population variable. In SEER*Stat, create a user-defined variable based on county to define your geographic regions. Use the new user-defined variable in your analyses.

My population data is by ZIP code rather than county. How can I use my ZIP code variable as a population variable?

Future versions of SEER*Prep (starting with Version 3) will allow you complete flexibility in defining your own file formats, including which variables are linked with the populations. Version 3 is currently in development, but a release schedule is not available. Until then, SEER*Prep needs to be "tricked" a bit to use the ZIP code variable.

Recode your ZIP codes to 3-digit codes and place them in the county columns of both your case and population data. Use the codes 000, 001, 002, and so on. Because only the codes 000 through 999 are available, this approach works only if your state or registry has at most 1000 ZIP codes. You cannot rename the variable in SEER*Prep; it will still be known as county.

Use SEER*Prep to create the database. In SEER*Stat, create a user-defined variable called "ZIP code" based on county. When creating the ZIP code user-defined variable, create appropriate labels. For example, create a grouping with a value of "000" and rename it "20904". Use the new ZIP code variable in future SEER*Stat sessions.
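The recoding step can be sketched in Python as shown below. This is only an illustration with hypothetical function names: it assigns the codes 000, 001, 002, and so on to the distinct ZIP codes in sorted order, and builds the reverse lookup you would need when labeling the groupings in SEER*Stat. Writing the codes into the county columns of your case and population files is not shown, because the column positions depend on your file format.

```python
def build_zip_recode(zip_codes):
    """Map each distinct ZIP code to a 3-digit code: 000, 001, 002, ..."""
    unique = sorted(set(zip_codes))
    if len(unique) > 1000:
        raise ValueError("more than 1000 ZIP codes; 3-digit codes run out at 999")
    return {zip_code: f"{index:03d}" for index, zip_code in enumerate(unique)}


def labels_for_seerstat(recode):
    """Invert the recode so each 3-digit code maps back to its ZIP code."""
    return {code: zip_code for zip_code, code in recode.items()}


recode = build_zip_recode(["20904", "20850", "20904"])
# recode == {"20850": "000", "20904": "001"}
labels = labels_for_seerstat(recode)
# labels == {"000": "20850", "001": "20904"}
```

Use the same mapping for both the case data and the population data so that each 3-digit code refers to the same ZIP code in both files.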

Why do I get invalid values in the ICCC site recode variables?

Unlike the site recode variables, no single ICCC site recode is created from both ICD-O-2 and ICD-O-3. All tumors in your database must be coded in one scheme or the other in order to get a recode for all tumors. Use the ICD Conversion programs to convert your data from ICD-O-3 to ICD-O-2 or from ICD-O-2 to ICD-O-3.

SEER started using ICCC definitions based on ICD-O-3 with the November 2005 data submission.

My data is in a format no longer supported by SEER*Prep. What can I do?

With the release of SEER*Prep 2.4.0, we are no longer distributing NAACCR version 9 or 10 file formats, and support for SEER 250 has been eliminated.

  • If you are currently using NAACCR 10.1, the switch to NAACCR 11.3 should be relatively easy. The new format simply has more fields for derived variables, but is otherwise the same as 10.1.
  • If you are currently using the NAACCR 9 or SEER 250 format, you have no reason to upgrade to a newer version of SEER*Prep. Version 2.3.5 is the last version that supported those formats.

What is the difference between the NAACCR format and the Global format?

These are both Database Description files for incidence data. The NAACCR format has a record length of 4048 bytes and includes all variables currently collected by most population-based cancer registries operating in the US. The Global format is much shorter (a record length of 234 bytes) and includes only the core variables necessary to analyze incidence and survival data, plus 50 user-specified variables of varying length. Registries operating abroad might find the Global format more convenient.

What is the difference between the two variables "Behavior Recode for Analysis" and "Behavior Recode for Analysis Derived"?

The Behavior Recode for Analysis variable is used for analysis of SEER data to account for tumors that are either no longer reportable to SEER, or newly reportable, based on the change from the ICD-O-2 to ICD-O-3 coding scheme on January 1, 2001. SEER*Prep does not create this variable, but has a column location for it in the incidence file format. To take advantage of this, you must create the variable on your own. See SEER Behavior Recode for Analysis for more information.

The Behavior Recode for Analysis Derived variable was made available with the release of SEER*Prep 2.4.0 as an option for people who are using their own non-SEER data. It has one value fewer than the original behavior recode: cases coded as "5=No longer reportable in ICD-O-3" would be "1=Borderline malignancy" in the derived variable. SEER*Prep will create this variable from primary site and ICD-O-3 behavior and histology. See Behavior Recode for Analysis Derived for more information.