
Theil Index (T)


The Theil Index (T), developed by economist Henri Theil (15), measures general disproportionality.  The measure T can be estimated as: 

$$\widehat{T} = {\sum}_{j=1}^{J}p_{j}Z_{j}\ln(Z_j),$$

where \(p_{j}\) is the population share of the jth socioeconomic group; \(Z_{j}=\frac{y_{j}}{\mu}\) is the ratio of the health status of the jth socioeconomic group, \(y_{j}\), to the average health status of the population, \(\mu={\sum}_{j=1}^{J}p_{j}y_{j}.\)

The measure \(T\) is population-weighted. Because it uses the logarithm, it is more sensitive to health differences further from the average rate. It may be used for both ordered socioeconomic groups (e.g., education) and unordered social groups (e.g., gender, race).
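As an illustration of the estimator above, the following is a minimal sketch in Python using only the standard library. The function name `theil_index` and the three-group data are hypothetical and are not taken from HD*Calc; the sketch assumes all rates are strictly positive (so the logarithm is defined) and that the population shares sum to 1.

```python
import math

def theil_index(shares, rates):
    """Estimate T = sum_j p_j * Z_j * ln(Z_j), where Z_j = y_j / mu.

    shares: population shares p_j (assumed to sum to 1)
    rates:  group health status values y_j (assumed strictly positive)
    """
    if len(shares) != len(rates):
        raise ValueError("shares and rates must have the same length")
    # Population-average health status: mu = sum_j p_j * y_j
    mu = sum(p * y for p, y in zip(shares, rates))
    total = 0.0
    for p, y in zip(shares, rates):
        z = y / mu  # ratio of the group's rate to the population average
        total += p * z * math.log(z)
    return total

# Hypothetical example: three groups with made-up shares and rates.
shares = [0.2, 0.5, 0.3]
rates = [80.0, 100.0, 120.0]
t_hat = theil_index(shares, rates)
print(f"Estimated Theil index: {t_hat:.5f}")
```

When every group has the same rate, each \(Z_{j}\) equals 1 and the sketch returns 0, which matches the interpretation of \(T\) as a measure of disproportionality.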

Variance of \(\widehat{T}\)

The variance of \(\widehat{T}\) based on the Taylor Series Linearization method is estimated as:

$$\mathrm{var}_{TL}(\widehat{T})= \frac{1}{\mu^{2}}{\sum}_{j=1}^{J}\hat{\sigma}_{j}^{2}p_{j}^{2}[1+\ln(Z_{j})-S]^{2},$$

where \(\hat{\sigma}_{j}^{2}\) is the estimated variance of \(y_{j}\) and \(S={\sum}_{j=1}^{J}p_{j}Z_{j}[1+\ln(Z_{j})]\).  The standard error of \(\widehat{T}\) based on the Taylor Series Linearization method is:  \(\sqrt{\mathrm{var}_{TL}(\widehat{T})}\).  See Ahn et al. 2018 (4) for details on how \(\mathrm{var} _{TL}(\widehat{T})\) was derived.
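The Taylor Series Linearization formula can be sketched in Python as follows. This is an illustrative transcription of the expression above, not the HD*Calc implementation; the function name `theil_taylor_variance` and the input values are hypothetical, and the sketch assumes that \(\hat{\sigma}_{j}^{2}\) is the estimated variance of the rate \(y_{j}\).

```python
import math

def theil_taylor_variance(shares, rates, rate_variances):
    """Taylor-linearization variance of the estimated Theil index.

    Implements (1 / mu^2) * sum_j sigma_j^2 * p_j^2 * [1 + ln(Z_j) - S]^2,
    with S = sum_j p_j * Z_j * [1 + ln(Z_j)].
    rate_variances holds sigma_j^2, assumed to be the variance of y_j.
    """
    mu = sum(p * y for p, y in zip(shares, rates))
    z = [y / mu for y in rates]
    s = sum(p * zj * (1.0 + math.log(zj)) for p, zj in zip(shares, z))
    total = sum(
        v * p ** 2 * (1.0 + math.log(zj) - s) ** 2
        for v, p, zj in zip(rate_variances, shares, z)
    )
    return total / mu ** 2

# Hypothetical inputs: shares, rates, and variances of the rates.
shares = [0.2, 0.5, 0.3]
rates = [80.0, 100.0, 120.0]
rate_variances = [16.0, 9.0, 25.0]

var_tl = theil_taylor_variance(shares, rates, rate_variances)
se_tl = math.sqrt(var_tl)  # standard error is the square root of the variance
print(f"var_TL = {var_tl:.3e}, SE = {se_tl:.3e}")
```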

Variance of \(\widehat{T}\) based on Monte Carlo Simulation Method

Randomly generate M age-adjusted rates \(y_{j}^{(m)},m=1,...,M,\) from the distribution \(y_j^{(m)} \sim Gamma(mean=y_j,var=\hat{\sigma}^2_j)\) for each socioeconomic group.  Calculate \(\widehat{T}^{(m)}\) using \(y_{j}^{(m)}\); the variance of \(\widehat{T}\) is then:

$$\mathrm{var}_{MCS}(\widehat{T})=(M-1)^{-1}{\sum}_{m=1}^{M}\left(\widehat{T}^{(m)}-\overline{\widehat{T}}\right)^{2},$$

where \(\overline{\widehat{T}}=M^{-1}{\sum}_{m=1}^{M}\widehat{T}^{(m)}\).  \(M=1,000\) is used in HD*Calc.  A gamma distribution, rather than a truncated normal distribution, was recommended for simulating age-adjusted rates (1).  The standard error of \(\widehat{T}\) based on the MCS method is:  \(\sqrt{\mathrm{var}_{MCS}(\widehat{T})}\).
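The simulation procedure might be sketched in Python as below. This is a minimal illustration under stated assumptions, not the HD*Calc code: the function names and data are hypothetical, the gamma distribution is parameterized by shape \(y_{j}^{2}/\hat{\sigma}_{j}^{2}\) and scale \(\hat{\sigma}_{j}^{2}/y_{j}\) so that its mean is \(y_{j}\) and its variance is \(\hat{\sigma}_{j}^{2}\), and the variance of the simulated values is computed with the divisor \(M-1\), which is an assumption of this sketch.

```python
import math
import random

def theil_index(shares, rates):
    """T = sum_j p_j * Z_j * ln(Z_j), with Z_j = y_j / mu."""
    mu = sum(p * y for p, y in zip(shares, rates))
    return sum(p * (y / mu) * math.log(y / mu) for p, y in zip(shares, rates))

def theil_mcs_variance(shares, rates, rate_variances, draws=1000, seed=None):
    """Monte Carlo variance of the estimated Theil index.

    Each rate is drawn from a gamma distribution with mean y_j and
    variance sigma_j^2 (shape = y^2 / var, scale = var / y).
    Returns the variance of the simulated indices and the indices themselves.
    """
    rng = random.Random(seed)
    simulated = []
    for _ in range(draws):
        sim_rates = [
            rng.gammavariate(y * y / v, v / y)  # (shape, scale)
            for y, v in zip(rates, rate_variances)
        ]
        simulated.append(theil_index(shares, sim_rates))
    mean_t = sum(simulated) / draws
    # Sample variance with divisor M - 1 (an assumption of this sketch).
    variance = sum((t - mean_t) ** 2 for t in simulated) / (draws - 1)
    return variance, simulated

# Hypothetical inputs; M = 1,000 draws as described in the text.
shares = [0.2, 0.5, 0.3]
rates = [80.0, 100.0, 120.0]
rate_variances = [16.0, 9.0, 25.0]

var_mcs, sims = theil_mcs_variance(shares, rates, rate_variances, draws=1000, seed=1)
print(f"var_MCS = {var_mcs:.3e}, SE = {math.sqrt(var_mcs):.3e}")
```

Fixing the seed makes the simulated values reproducible between runs; without a seed, the estimated variance changes slightly from run to run.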

95% Confidence Interval for \(\widehat{T}\)

The 95% confidence interval of T based on the Taylor Series Linearization method is: 

$$\widehat{T} \pm 1.96 \times \sqrt{\mathrm{var}_{TL}(\widehat{T})}.$$

The lower and upper bounds of the 95% confidence interval of \(\widehat{T}\) based on the MCS approach are the 2.5th percentile and the 97.5th percentile of the 1,000 \(\widehat{T}^{(m)}\) values.
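Both interval constructions can be illustrated with a short Python sketch. The function names are hypothetical, and the percentile rule used here (linear interpolation between order statistics) is an assumption; the source does not state which percentile definition HD*Calc applies, and other definitions give slightly different bounds.

```python
import math

def wald_interval(estimate, variance, z=1.96):
    """Normal-approximation interval: estimate +/- z * sqrt(variance)."""
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width

def percentile(sorted_values, q):
    """Percentile by linear interpolation between order statistics.

    q is a fraction in [0, 1]; sorted_values must be sorted ascending.
    """
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac

def percentile_interval(simulated, level=0.95):
    """Interval from the (1-level)/2 and 1-(1-level)/2 percentiles."""
    ordered = sorted(simulated)
    tail = (1.0 - level) / 2.0
    return percentile(ordered, tail), percentile(ordered, 1.0 - tail)

# Hypothetical point estimate and Taylor-linearization variance.
tl_low, tl_high = wald_interval(0.01, 4.0e-6)

# Hypothetical stand-in for simulated index values.
demo_values = [float(i) for i in range(101)]
mcs_low, mcs_high = percentile_interval(demo_values)
```

Unlike the symmetric Taylor-based interval, the percentile interval does not have to be centered on \(\widehat{T}\), so it can reflect skewness in the simulated values.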