This database is available only to U.S.-based researchers affiliated with U.S. institutions.
SEER registries in Georgia (GA) and California (CA) linked all incident tumor cases diagnosed from 2013 to 2019 to genetic test results performed from 2012 to 2021. Results were provided by the three genetic testing laboratories (Ambry Genetics, Invitae/Labcorp Genetics, and Myriad Genetics) that conduct most of the genetic testing in the two states. This pilot project was conducted under an Institutional Review Board (IRB)-approved protocol at each registry. Of 1,584,923 cancer patients in the registry cohort, 9.1% were linked to genetic test results. Approximate counts of linked tumors by primary site are provided in Tumor Counts Linked to Genetic Test Data (XLSX).
Access Requirements
- You must already have access to the latest SEER Research Plus Data before a specialized data request can be submitted. Refer to How to Request Data Access for more information.
- Provide the purpose and analytical plan in your request.
- Each request will first be reviewed by NCI SEER staff for provisional approval.
- A provisionally approved request must then be reviewed by the National Cancer Institute's central Institutional Review Board (cIRB). The requestor will receive detailed instructions on how to apply to the cIRB in the email notification of provisional approval.
- Depending on which version of the genetic data file is requested, approval from the California Cancer Registry (CCR) IRB may also be required.
Accessing the Data through VCDAS
After final approval, the data will be distributed and accessed through the Virtual Cancer Data Access System (VCDAS). VCDAS is a data enclave environment that provides secure remote access to perform statistical analyses without the need or ability to download the data locally. Read more about VCDAS and what to expect after your data request is approved.
Database Details
There are two Genetic Testing Linkage Database versions available to request:
- The first version has only the year of the sample accession date and report date. The release of this version requires only NCI cIRB review.
- The second version has the month and year of the sample accession date and report date. This version requires both California Cancer Registry (CCR) IRB review and NCI cIRB review.
For both versions of the database, the available data consist of two files. Both files contain a field genelinkID, which is a masked patient ID that can be used to link across the two files.
- GA-CA SEER Research Plus (2000-2021 diagnosis years)
  - Includes the same fields as the SEER Research Plus databases.
  - These data do not contain any geographic information (no county, registry, or state).
  - The file contains 4 additional data fields associated with census tract attributes. These fields are the same as the fields in the Incidence Data with Census Tract Attributes Database.
- Genetic test results data file which has the following fields:
  - Gene name (approximately 100 genes)
  - Gene status (reported categorically):
    - Normal
    - Pathogenic variant
    - Variant of unknown significance
  - Accession date (month and year or only year)
  - Report date (month and year or only year)

Genes are reported by name only if they were tested by at least two of the three laboratories. Genes tested by a single laboratory were collapsed into an "other" gene category. Download the data dictionary for the genetic data [XLSX].
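As a rough illustration of how the two files described above might be joined on genelinkID, here is a minimal Python sketch. Only the genelinkID field name comes from this page. The other column names (primary_site, year_dx, gene, gene_status), the sample records, and the function name link_by_genelink_id are hypothetical placeholders, so check the data dictionary for the actual field names. The sketch assumes the rows have already been read into dictionaries, for example with csv.DictReader.

```python
from collections import defaultdict


def link_by_genelink_id(tumor_rows, genetic_rows, key="genelinkID"):
    """Pair each tumor record with every genetic test result that
    shares the same masked patient ID (an inner join on `key`)."""
    # Index genetic results by patient ID so each tumor lookup is cheap.
    results_by_patient = defaultdict(list)
    for result in genetic_rows:
        results_by_patient[result[key]].append(result)

    linked = []
    for tumor in tumor_rows:
        # Patients with no genetic results contribute no rows.
        for result in results_by_patient.get(tumor[key], []):
            linked.append({**tumor, **result})
    return linked


# Made-up records; every column name except genelinkID is hypothetical.
tumors = [
    {"genelinkID": "A1", "primary_site": "Breast", "year_dx": 2015},
    {"genelinkID": "A1", "primary_site": "Ovary", "year_dx": 2018},
    {"genelinkID": "B2", "primary_site": "Colon", "year_dx": 2016},
]
results = [
    {"genelinkID": "A1", "gene": "BRCA1", "gene_status": "Pathogenic variant"},
    {"genelinkID": "A1", "gene": "ATM", "gene_status": "Normal"},
]

linked = link_by_genelink_id(tumors, results)
print(len(linked))  # 4: two tumors x two gene results for patient A1
```

Because a patient can have more than one tumor and more than one tested gene, the join is many-to-many, and the linked table can contain more rows than either input. Analyses that count patients rather than tumor-gene pairs would need to deduplicate on genelinkID first.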
Database Limitations
For some tumor sites, the sample size may not be sufficient to support research questions. Please review Tumor Counts Linked to Genetic Test Data (XLSX), referenced above, before requesting data.
 An official website of the United States government