Measuring Health Disparities By Socio-Economic Status using Obesity Data in Individual Records from National Health And Nutrition Examination Survey (NHANES)
In this exercise, we will calculate several measures of health disparity by socio-economic status using obesity data from the National Health and Nutrition Examination Survey (NHANES). For this exercise you will not be using SEER*Stat.
This example calculates socio-economic disparities in the prevalence of obesity among children and adolescents (ages 2 to 19 years) using the 2015-2016 NHANES sample data. The tutorials from the NHANES program provide generic instructions for downloading and merging data using SAS or Stata; see the sections Download Data Files and Merge NHANES Data at https://wwwn.cdc.gov/nchs/nhanes/tutorials/datasets.aspx. Step A provides specific instructions for extracting and creating the NHANES obesity survey sample, which is then imported into HD*Calc to calculate summary measures of health disparity.
Step A: Create an export file from NHANES data
- Download the demographic data from https://wwwn.cdc.gov/nchs/nhanes/search/datapage.aspx?Component=Demographics&CycleBeginYear=2015. Read the SAS Transport File and create a SAS dataset that contains the demographic variables and survey design variables listed below. The questionnaire and codebook of each variable involved can be found by clicking the links below.
SEQN - Respondent sequence number
WTMEC2YR - Full sample 2 year MEC exam weight
SDMVPSU - Masked variance pseudo-PSU
SDMVSTRA - Masked variance pseudo-stratum
INDFMPIR - Ratio of family income to poverty
RIDAGEYR - Age in years at screening
***************************************************************
* Read DEMOGRAPHICS SAS Transport File and Create SAS Dataset**
**************************************************************;
libname NH "C:\NHANES\DATA";
libname XP xport "C:\NHANES\TEMP\DEMO_I.XPT";
/*please remember to change your library to the directory location you downloaded your files*/
/* PROC COPY has no KEEP statement, so a DATA step is used to select the variables */
data NH.DEMO_I;
set XP.DEMO_I (keep=SEQN WTMEC2YR SDMVPSU SDMVSTRA RIDAGEYR INDFMPIR);
run;
- Download the Examination-BMI data from https://wwwn.cdc.gov/nchs/nhanes/search/datapage.aspx?Component=Examination&CycleBeginYear=2015, read the SAS Transport File, and create a SAS dataset that contains the respondent sequence number and the BMI category variable. The questionnaire and codebook of each variable involved can be found by clicking the links below.
SEQN - Respondent sequence number
BMDBMIC - BMI Category - Children/Youth
***************************************************************
* Read EXAMINATION SAS Transport File and Create SAS Dataset**
**************************************************************;
libname NH "C:\NHANES\DATA";
libname XP xport "C:\NHANES\TEMP\BMX_I.XPT";
/*please remember to change your library to the directory location you downloaded your files*/
/* PROC COPY has no KEEP statement, so a DATA step is used to select the variables */
data NH.BMX_I;
set XP.BMX_I (keep=SEQN BMDBMIC);
run;
- Merge the demographic data and the examination data created in the two steps above to create the analytic dataset.
***************************************************************
* MERGE DEMOGRAPHICS and EXAMINATION SAS Datasets **
* SELECT PARTICIPANTS AGED 2 YEARS TO 19 YEARS **
* RECODE OBESITY AND SES VARIABLES **
**************************************************************;
libname NH "C:\NHANES\DATA";
data NH.NHANES_OBESITY;
merge NH.DEMO_I NH.BMX_I;
by SEQN;
run;
data NH.NHANES_OBESITY;
set NH.NHANES_OBESITY;
if RIDAGEYR>=2 and RIDAGEYR<=19;
/* Create OBESITY variable. */
if BMDBMIC=4 then OBESITY=1;
else if BMDBMIC in (1,2,3) then OBESITY=0;
else OBESITY=9;
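/* Create SES variable. SAS treats a missing value as smaller than any number, */
/* so a missing INDFMPIR is handled first and left missing, not placed in SES group 1. */
if missing(INDFMPIR) then SES = .;
else if INDFMPIR <= 1 then SES = 1;
else if INDFMPIR > 1 and INDFMPIR <= 2 then SES = 2;
else if INDFMPIR > 2 and INDFMPIR <= 4 then SES = 3;
else if INDFMPIR > 4 then SES = 4;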
run;
proc export data=NH.NHANES_OBESITY
outfile="C:\NHANES\DATA\NHANES_OBESITY.csv" dbms=CSV replace;
run;
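Before importing the file into HD*Calc, it can help to confirm that the recodes behaved as intended. The following is a minimal sketch, assuming the NH library and the NH.NHANES_OBESITY dataset created above; the use of PROC FREQ and PROC MEANS here is an illustration and not part of the original instructions.

```sas
/* Optional check (not part of the original steps): confirm the recodes. */
libname NH "C:\NHANES\DATA";

/* Each BMDBMIC value should map to the expected OBESITY code.          */
/* The MISSING option also lists records where BMDBMIC or SES is missing. */
proc freq data=NH.NHANES_OBESITY;
  tables BMDBMIC*OBESITY / list missing;
  tables SES / missing;
run;

/* Range of INDFMPIR within each SES group, including records with missing SES. */
proc means data=NH.NHANES_OBESITY n nmiss min max;
  class SES / missing;
  var INDFMPIR;
run;
```

Records with a missing INDFMPIR deserve particular attention, because SAS treats a missing numeric value as smaller than any number in comparisons.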
The resulting input data should look like this sample file: NHANES_OBESITY.csv (CSV, 3 KB). The following fields are included in each record:
- SES
- OBESITY
- WTMEC2YR
- SDMVSTRA
- SDMVPSU
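The export in Step A writes every variable in NH.NHANES_OBESITY, so the CSV also carries fields such as SEQN and RIDAGEYR, which are set to Ignored in Step B. As an optional variation, assuming you prefer a file that holds only the five fields listed above, the KEEP= dataset option can restrict the export. This is a sketch, not part of the original instructions.

```sas
/* Optional variation: export only the five fields that HD*Calc uses. */
/* Note: REPLACE overwrites any existing file at this path.           */
libname NH "C:\NHANES\DATA";

proc export data=NH.NHANES_OBESITY (keep=SES OBESITY WTMEC2YR SDMVSTRA SDMVPSU)
  outfile="C:\NHANES\DATA\NHANES_OBESITY.csv" dbms=CSV replace;
run;
```

With this version of the file, no fields need to be marked as Ignored during import.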
Step B: Import the Data into the HD*Calc Program
- When you start the HD*Calc application you will get a message that reminds users to open a data file in order to view the disparity measures.
- Select Open... from the File Menu to open your data file. In the Select Health Disparities Input File dialog, change the File Type at the bottom to Import Text File (*.csv). Find the same file that you created in Step A above.
- When your file is opened, you will be taken to the HD*Calc Data Import dialog where you will provide all the information needed to identify the fields in your input file. In the edit box at the top please provide a Title for your input data. This title will be displayed with the resulting disparity measures.
- Use the Dictionary edit box to select a file for storing your data input specifications.
- The checkbox indicating that your Data File Contains Column Headers should be checked.
- The Fields Are Delimited radio button should be selected, and Comma should be selected as the Delimiter.
- For the Data Type check the radio button for Individual Level Data records. This lets HD*Calc know that you are importing survey data and not pre-calculated statistics from SEER*Stat.
- You must now specify a Field Type for each field in the List of Fields. To do this, select each Field individually (one field at a time) and press the Change button to the right. You will be taken to the Edit Format dialog:
  Select a Field Type from the dropdown at the top:
  - SES is Disparity Variable. For SES, you must specify the format values and enter them into the text box:
    - 1=SES Group 1
    - 2=SES Group 2
    - 3=SES Group 3
    - 4=SES Group 4
  - SDMVPSU is the Primary Sampling Unit statistic
  - SDMVSTRA is the Stratum statistic
  - WTMEC2YR is the Sampling Weight statistic
  - OBESITY is the Outcome Variable. For OBESITY, Analyze As must be set to Binary:
    - 1=Positive Outcome
    - 9 should be added to the excluded list
    - (all other values are interpreted as negative)
  - For all the others, select Ignored as your field type
- When all your file fields have been defined, press OK. You will then be taken to the main Disparity Measures dialog where your results will be displayed (Step C).
Step C: View Disparity Measures In HD*Calc
- When the Disparity Measures (results) dialog opens you will be asked to specify whether the disparity groups in your data are ranked (e.g. by income or education). There are some disparity measures that will only be presented if the groups are ranked. Since this example uses Socio-Economic Status as the basis for the disparity groups, there is an inherent ranking, so press Yes as your response.
- On the Disparity Groups tab, in the Ranking Disparity Groups box, confirm that the groups in your file are ranked and that the least advantaged group is first. The checkbox below it, which indicates that a higher percentage obese means Less Healthy, should be checked.
- On the Disparity Table tab you will see all the measures that have been calculated for your data. If you click on the title of a disparity measure, the help system will display a description of that measure.
- The Data Table tab shows some additional statistics calculated from your data for use in the computation of disparity measures.
- The Pair Comparison tabs allow you to select disparity groups to be compared. For each pair of groups, the Rate Difference and Rate Ratio are calculated and presented on the table.
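To cross-check the prevalence estimates that HD*Calc reports, the weighted percentage obese in each SES group can also be estimated directly in SAS. This is a sketch under stated assumptions, not HD*Calc's own algorithm: it assumes the NH.NHANES_OBESITY dataset from Step A and uses PROC SURVEYFREQ with the NHANES design variables. Standard errors may differ from those in HD*Calc, in part because the WHERE statement subsets the data instead of performing a domain analysis.

```sas
/* Optional cross-check (an assumption, not HD*Calc's method):        */
/* weighted prevalence of obesity by SES group using the NHANES design. */
libname NH "C:\NHANES\DATA";

proc surveyfreq data=NH.NHANES_OBESITY;
  strata SDMVSTRA;            /* masked variance pseudo-stratum */
  cluster SDMVPSU;            /* masked variance pseudo-PSU */
  weight WTMEC2YR;            /* 2-year MEC exam weight */
  where OBESITY in (0, 1);    /* drop records coded 9 (excluded) */
  tables SES*OBESITY / row;   /* row percent = prevalence within each SES group */
run;
```

The row percentages for OBESITY=1 are the group prevalences. Under the usual definitions, the rate difference for a pair of groups is one group's prevalence minus the other's, and the rate ratio is the quotient of the two.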