Derived HER2 Recode (2010+)

The HER2 summary result was not a required data item for all breast cancer cases diagnosed in 2010, even though most SEER registries collected, coded, and submitted such data to the NCI SEER program. As a result, approximately 7% of the breast cancer cases in the 18 SEER registries did not have any HER2 summary information, while the remaining cases had directly coded summary HER2 information provided by the registries. For the cases missing the summary information, however, we do have information about the type of HER2 test each patient received and the result associated with each test. For example:

  • Site-Specific Factor 9 (SSF9) records whether the patient tested positive, negative, or borderline on an immunohistochemistry (IHC) test;
  • SSF11 records the result of a fluorescence in situ hybridization (FISH) test; and
  • SSF13 records the result of a chromogenic in situ hybridization (CISH) test.

We used an algorithm to derive HER2 information for these cases from the test interpretation results. Before applying the algorithm to the cases with missing summary information, we applied it to cases for which directly coded HER2 summary information was available, to assess how well the derived results agreed with the coded results. Agreement between the two approaches was more than 97%.
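The validation step can be illustrated with a minimal sketch. The source does not say how agreement was measured, so simple percent agreement (the share of cases where the derived value equals the coded value) is an assumption here, and the function name and the toy values are hypothetical, not SEER data.

```python
from typing import Sequence


def percent_agreement(coded: Sequence[str], derived: Sequence[str]) -> float:
    """Share of cases (in percent) where derived HER2 equals coded HER2.

    Assumption: both sequences are aligned case by case and use the same
    labels ("positive", "negative", "borderline", "unknown").
    """
    if len(coded) != len(derived):
        raise ValueError("coded and derived must have the same length")
    if not coded:
        raise ValueError("at least one case is required")
    matches = sum(c == d for c, d in zip(coded, derived))
    return 100.0 * matches / len(coded)


# Toy illustration only: four made-up cases, one of which disagrees.
toy_coded = ["positive", "negative", "negative", "borderline"]
toy_derived = ["positive", "negative", "positive", "borderline"]
toy_rate = percent_agreement(toy_coded, toy_derived)
```

Other agreement measures, such as a chance-corrected statistic, could be substituted without changing the rest of the workflow.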

Algorithm for Derived HER2 Summary Variable

  1. First, we assessed the test interpretation variables, such as SSF9 (IHC), SSF11 (FISH), SSF13 (CISH), and SSF14 (Other), for use in deriving HER2 summary information.
  2. We then created a composite (recoded) variable, FCO_comb, from the FISH, CISH, and Other test interpretation results (SSF11, SSF13, and SSF14), as follows:
    • if any of these tests had a positive result, FCO_comb was coded as positive;
    • among the remaining cases, if any of these tests had a negative result, FCO_comb was coded as negative;
    • among the cases still remaining, if any of these tests had a borderline result, FCO_comb was coded as borderline; and
    • finally, any remaining cases with no positive, negative, or borderline result were coded as unknown in FCO_comb.
  3. As a final step, we combined FCO_comb with SSF9 (IHC) to code the derived HER2 summary variable. The rules below are applied in the following priority order:
    • Rule 1
      If FCO_comb = positive then derived HER2 = positive (regardless of the IHC test result);
      If FCO_comb = negative then derived HER2 = negative (regardless of the IHC test result);
    • Rule 2
      If FCO_comb = borderline and IHC = positive then derived HER2 = positive;
      If FCO_comb = borderline and IHC = negative then derived HER2 = negative;
      If FCO_comb = borderline and IHC = borderline then derived HER2 = borderline;
      If FCO_comb = borderline and IHC = unknown then derived HER2 = borderline;
    • Rule 3
      If FCO_comb = unknown and IHC = positive then derived HER2 = positive;
      If FCO_comb = unknown and IHC = negative then derived HER2 = negative;
      If FCO_comb = unknown and IHC = borderline then derived HER2 = borderline;
    • Rule 4
      If FCO_comb = unknown and IHC = unknown then derived HER2 = unknown.
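
Steps 2 and 3 above can be expressed compactly in code. The following is a minimal sketch, not the SEER implementation: it assumes each test interpretation has already been mapped from its raw SSF code to one of four labels (that mapping is not given here), and the function names combine_fco and derive_her2 are hypothetical.

```python
POSITIVE = "positive"
NEGATIVE = "negative"
BORDERLINE = "borderline"
UNKNOWN = "unknown"


def combine_fco(fish: str, cish: str, other: str) -> str:
    """Step 2: build FCO_comb from the FISH, CISH, and Other results.

    Assumption: inputs are already one of the four labels above; any
    other value is treated as carrying no information.
    """
    results = {fish, cish, other}
    # Positive outranks negative, which outranks borderline.
    for level in (POSITIVE, NEGATIVE, BORDERLINE):
        if level in results:
            return level
    return UNKNOWN


def derive_her2(fco_comb: str, ihc: str) -> str:
    """Step 3: apply Rules 1 to 4 in priority order."""
    # Rule 1: a definite FCO_comb result wins regardless of IHC.
    if fco_comb in (POSITIVE, NEGATIVE):
        return fco_comb
    # Rules 2 and 3: FCO_comb is borderline or unknown, so an
    # informative IHC result decides the outcome.
    if ihc in (POSITIVE, NEGATIVE, BORDERLINE):
        return ihc
    # IHC is unknown: borderline FCO_comb stays borderline (Rule 2),
    # unknown FCO_comb stays unknown (Rule 4).
    return fco_comb


# Example: FISH borderline, no CISH or Other result, IHC positive.
example = derive_her2(combine_fco(BORDERLINE, UNKNOWN, UNKNOWN), POSITIVE)
```

Collapsing Rules 2 and 3 into one branch works because, in both, an informative IHC result is returned unchanged; only the IHC = unknown case depends on whether FCO_comb is borderline or unknown.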