SEER 1973-2008 Research Data DVD

This DVD includes the SEER*Stat software, the binary format of the data for use with SEER*Stat, and the ASCII data files that can be analyzed with your own software.

The DVD contains the binary (SEER*Stat) version of the data in uncompressed format in the directory:

   \SEER_1973_2008_SEERSTAT - This directory contains the SEER*Stat software and the binary format of the data. See the Readme.txt file in that directory for more details. This directory and its contents total 2.1 gigabytes.

The SEER*Stat software performs much more efficiently when reading data from a local hard drive than from the DVD. If space permits, it is recommended that you copy the data to a local hard drive. You can copy the entire directory, or use the self-extracting zip file contained on the DVD:

   \SEER_1973_2008_SEERSTAT.d10282011.exe - 800 megabytes

The ASCII text data files are available only in compressed format. Once uncompressed, the data occupies 2 gigabytes and includes documentation files. The data is available both as a Windows self-extracting zip file and as a zip file that does not require Windows:

   \SEER_1973_2008_TEXTDATA.d10282011.exe - 210 megabytes
   \SEER_1973_2008_TEXTDATA.d10282011.zip - 210 megabytes

By default, the self-extracting zip files extract the data to c:\SEER_1973_2008_SEERSTAT and c:\SEER_1973_2008_TEXTDATA. The directories can also be placed on a network drive; however, only those who have signed a 1973-2008 SEER Research Data Agreement may access the data.

----------------
REVISION HISTORY
----------------

10/28/2011
1. The SEER*Stat databases and the ASCII text data files were updated to correct a problem with the fields NHIA Derived Hispanic Origin and Origin recode NHIA (Hispanic, Non-Hisp). This problem was limited to the following four registries: San Francisco-Oakland SMSA, San Jose-Monterey, Los Angeles, and California excluding SF/SJM/LA. Any analyses using either of these fields should be regenerated.
2. The SEER*Stat databases and the ASCII text data files were updated to remove information pertaining to scope of regional lymph node surgery. The change was for breast cancer cases only, defined strictly by primary site (C50.0-C50.9; represented as 500-509 in the files). For 2003-2008 breast cancer cases, RX Summ--Scope Reg LN Sur (2003+) has been changed to 9 (unknown/not applicable). For 1998-2002 cases, the same change was made to Scope of reg lymph nd surg (1998-2002). For additional information on this change, please refer to:
   http://seer.cancer.gov/seerstat/variables/seer/regional_ln

10/05/2011
1. New version of SEER*Stat software (7.0.5). See \SEER_1973_2008_SEERSTAT\Readme.txt for details.
2. The SEER*Stat databases and the ASCII text data files were updated to correct a problem with the fields Derived SS2000 (2004+) and Summary stage 2000 (1998+). This problem was limited to 2004-2008 cases from the following six registries: San Francisco-Oakland SMSA, Atlanta (Metropolitan), San Jose-Monterey, Los Angeles, Rural Georgia, and California excluding SF/SJM/LA. Any analyses using either of these fields should be regenerated.
3. The SEER*Stat databases and the ASCII text data files were updated to include information in CS site-specific factor 1 (2004+) for Ovary and Intracranial Gland and CS site-specific factor 2 (2004+) for Prostate.
4. The SEER*Stat databases were updated to show all valid values for the variable CS Schema v0202. This was a change to the format only, not to the data. It documents the valid values for the field regardless of which values actually occur in the database.

6/03/2011 - Data change for binary (SEER*Stat) version
Fixed a problem with SEER 17 databases in rate and prevalence sessions. The problem occurred only if the "Cases in Research Database" checkbox was unchecked (it is checked by default). In that case, if the results were stratified by SEER registry or State-county, the session would generate an error with no results. If the results were not stratified by SEER registry or State-county, the populations and rates/prevalence proportions would be incorrect.

5/12/2011
1. The SEER*Stat databases and the ASCII text data files were updated to include Gleason score information in CS site-specific factor 5 (2004+) and CS site-specific factor 6 (2004+) for prostate cancer.
2. The ASCII text files had two additional changes that affect the column positions of many variables. The fields CS Extension and CS Lymph Node were corrected to be 3 digits as per Collaborative Stage v2. The sample SAS input statement and the SEERDic.pdf file were updated accordingly.

5/03/2011 - The ASCII text files were corrected to include the field SEER cause-specific death classification (column 270).

4/27/2011 - Initial release.