FAQs - SEER*Prep Software

How can I create a database that I can use to calculate age-adjusted rates?
In the derived variable Cause of Death Recode, how are the site groupings defined?
Can I use ICD causes of death coded at the 3-digit level when creating a mortality database?
Is there a way to move databases created by SEER*Prep?
How do I create a database if my data is in NAACCR XML format?
What is a .txd.gz file?
What does this SEER*Prep error message mean?
"ERROR: Line termination character not found in the first record of <filename> with record length <value>"
I have regions defined as groups of counties. How can I use SEER*Prep to create a geographic variable using groups of counties rather than individual counties?
My data is in a format no longer supported by SEER*Prep. What can I do?
What is the difference between the NAACCR format and the Global format?
What is the difference between the two variables "Behavior Recode for Analysis" and "Behavior Recode for Analysis Derived"?

1. How can I create a database that I can use to calculate age-adjusted rates?

The steps for creating a database are described in How to use SEER*Prep.

2. In the derived variable Cause of Death Recode, how are the site groupings defined?

SEER*Prep uses the 1969+ definitions presented with the SEER Cause of Death Recode.

3. Can I use ICD causes of death coded at the 3-digit level when creating a mortality database?

ICD causes of death coded at the 3-digit level can be used for both ICD-9 and ICD-10 codes. SEER*Prep uses these to create a recode (SEER Cause of Death Recode [3-digit]).

4. Is there a way to move databases created by SEER*Prep?

You can move databases created by SEER*Prep. However, it is more efficient and less error-prone to create the databases on a local area network in the first place. SEER*Stat users can then create a local Data Location that points to the shared path.

To move a database, you must first check to see where you created the database. SEER*Prep creates databases in the user-defined data directory. Open the File menu and select Preferences to determine that path.

Exit SEER*Prep. Use Windows Explorer to copy the entire user-defined data directory to another location. You must copy the entire directory and all of its contents; you can remove unwanted databases from the new location later.

Each person wishing to use the database will need to have its location set up as a local Data Location in SEER*Stat. Instruct them to select Preferences from SEER*Stat's File menu, then click Add Local to create a new Data Location with that path. The databases will appear when a new session is created or opened in SEER*Stat.

Use the Database Management feature to delete any databases that you do not want in the new location. Proceed as follows to make sure you delete from the new location, not the old. First, start SEER*Prep. Next, set the Directory to Save User-Defined Databases to the new location. Finally, use the Database Management command on the File menu to delete the unwanted databases.
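If you prefer to script the copy step rather than use Windows Explorer, the following is a minimal sketch of one way to do it. The function name and any paths you pass to it are hypothetical; substitute the directory shown in SEER*Prep's Preferences dialog and your own destination. It assumes SEER*Prep is closed while the copy runs.

```python
import shutil
from pathlib import Path


def copy_data_directory(src: Path, dst: Path) -> Path:
    """Copy the whole user-defined data directory, leaving the original in place."""
    if not src.is_dir():
        raise NotADirectoryError(src)
    # copytree refuses to overwrite an existing destination, which guards
    # against merging into a directory that already holds other databases.
    shutil.copytree(src, dst)
    return dst
```

Because the original directory is left untouched, you can confirm that the databases open correctly from the new location before deleting anything.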

5. How do I create a database if my data is in NAACCR XML format?

XML files will be supported by a future version of SEER*Prep, but are not supported by version 3.x. In the meantime, there are several options for converting NAACCR XML to CSV:

Refer to Example 1 in How to Use SEER*Prep for instructions using SAS.
Use File*Pro software, which can also convert NAACCR XML to CSV.

6. What is a .txd.gz file?

A .txd.gz file is a compressed text file created by the Gzip compression utility.

In SEER*Prep, all input data other than case files for NAACCR 22 format must be stored in either fixed-width text files or compressed fixed-width text files. If the input file is a fixed-width text file, it must be named with a .txd extension. Use a compressed format if you wish to reduce the disk space required to store your input data. Gzip, a free utility, creates files in the only compression format supported by SEER*Prep. SEER*Prep requires gzipped data files to have a .txd.gz extension.
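As an alternative to the Gzip command-line utility, Python's standard gzip module writes the same compression format. The sketch below is illustrative only: the function name gzip_txd is hypothetical, and it assumes the input is already a valid fixed-width file named with a .txd extension.

```python
import gzip
import shutil
from pathlib import Path


def gzip_txd(src: Path) -> Path:
    """Compress a fixed-width .txd file into a sibling .txd.gz file."""
    if src.suffix != ".txd":
        raise ValueError(f"expected a .txd file, got {src.name}")
    dst = src.with_name(src.name + ".gz")  # cases.txd -> cases.txd.gz
    # Stream the bytes so large input files are not loaded into memory.
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

The original .txd file is left in place, so you can delete it once you have confirmed that the compressed copy loads.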

7. What does this SEER*Prep error message mean?
"ERROR: Line termination character not found in the first record of <filename> with record length <value>."

This error will occur with fixed-width input formats if your data file does not contain fixed-length records. Input File Format provides a list of the record lengths required for the supported file formats.

If your data files are not fixed-length or the records are not the correct length, you must modify the files before using SEER*Prep. See if your data management software has an option to output fixed-length records or use one of the "fix length" programs listed in SEER*Prep Utilities. All records in an input file must be the same length.
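Before running SEER*Prep, it can help to confirm that every record in a file really is the same length. The following is a rough sketch, not one of the "fix length" programs from SEER*Prep Utilities: the function names are hypothetical, record lengths are counted in bytes excluding the line terminator, and the padding step assumes that short records may be filled with trailing spaces and terminated with a carriage return and line feed. Check those assumptions against Input File Format before relying on the output.

```python
from collections import Counter
from pathlib import Path


def record_lengths(path: Path) -> Counter:
    """Count how many records of each length a text file contains."""
    counts: Counter = Counter()
    with open(path, "rb") as f:  # read bytes so lengths are byte counts
        for line in f:
            counts[len(line.rstrip(b"\r\n"))] += 1
    return counts


def pad_records(src: Path, dst: Path, length: int) -> None:
    """Write a copy of src with every record space-padded to `length` bytes."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for number, line in enumerate(fin, start=1):
            record = line.rstrip(b"\r\n")
            if len(record) > length:
                raise ValueError(
                    f"record {number} is {len(record)} bytes, longer than {length}"
                )
            # Assumption: trailing blanks and a CR+LF terminator are acceptable.
            fout.write(record.ljust(length) + b"\r\n")
```

If record_lengths returns more than one length, the file is not fixed-length. Records that are too long are reported rather than truncated, because cutting them would silently discard data.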

8. I have regions defined as groups of counties. How can I use SEER*Prep to create a geographic variable using groups of counties rather than individual counties?

No changes are needed when running SEER*Prep; this can be handled in SEER*Stat. Use SEER*Prep to create the database using county as a population variable. In SEER*Stat, create a user-defined variable based on county to define your geographic regions. Use the new user-defined variable in your analyses.

9. My data is in a format no longer supported by SEER*Prep. What can I do?

With the release of SEER*Prep 3.0.0, we are no longer distributing NAACCR version 9-17 file formats. Version 3.0.0 still supports the older dd files, but they are not distributed with the software. Version 18 fixed-width NAACCR dd files are still included; however, working with CSV files offers more flexibility, and CSV is the recommended format for NAACCR input data. The fixed-width NAACCR 18 dd files will not be updated for future NAACCR formats.

10. What is the difference between the NAACCR format and the Global format?

These are both Database Description files for incidence data. The NAACCR format is a CSV file format that includes all variables currently collected by most population-based cancer registries operating in the U.S. The Global format is a much shorter file (334-byte record length) that includes only the core variables necessary to analyze incidence and survival data, plus 50 user-specified variables of varying length. Registries operating outside the U.S. might find the Global format more convenient.

11. What is the difference between the two variables "Behavior Recode for Analysis" and "Behavior Recode for Analysis Derived"?

The Behavior Recode for Analysis variable is used for analysis of SEER data to account for tumors that are either no longer reportable to SEER, or newly reportable, based on the change from the ICD-O-2 to ICD-O-3 coding scheme on January 1, 2001. SEER*Prep does not create this variable, but has a column location for it in the incidence file format. To take advantage of this, you must create the variable on your own. See SEER Behavior Recode for Analysis for more information.

The Behavior Recode for Analysis Derived variable was made available with the release of SEER*Prep 2.4.0 as an option for people who are using their own non-SEER data. It has one value fewer than the original behavior recode: cases coded as "5=No longer reportable in ICD-O-3" would be "1=Borderline malignancy" in the derived variable. SEER*Prep will create this variable from primary site and ICD-O-3 behavior and histology. See Behavior Recode for Analysis Derived for more information.
