How To Create An Index In Stata

Stata J. Author manuscript; available in PMC 2016 Apr 4.

Published in final edited form as:

Stata J. 2016 1st Quarter; 16(1): 112–138.

PMCID: PMC4819995

NIHMSID: NIHMS756115

conindex: Estimation of concentration indices

Owen O'Donnell

Erasmus School of Economics Erasmus University Rotterdam, the Netherlands Tinbergen Institute, the Netherlands and University of Macedonia, Greece

Stephen O'Neill

Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, UK stephen.oneill@lshtm.ac.uk

Tom Van Ourti

Erasmus School of Economics Erasmus University Rotterdam, the Netherlands and Tinbergen Institute, the Netherlands

Brendan Walsh

Division of Health Services Research and Management School of Health Sciences and City Health Economics Centre City University London, UK

Abstract

Concentration indices are frequently used to measure inequality in one variable over the distribution of another. Most commonly, they are applied to the measurement of socioeconomic-related inequality in health. We introduce a user-written Stata command conindex which provides point estimates and standard errors of a range of concentration indices. The command also graphs concentration curves (and Lorenz curves) and performs statistical inference for the comparison of inequality between groups. The article offers an accessible introduction to the various concentration indices that have been proposed to suit different measurement scales and ethical responses to inequality. The command's capabilities and syntax are demonstrated through analysis of wealth-related inequality in health and healthcare in Cambodia.

Keywords: inequality, rank-dependent indices, concentration index, health, healthcare

1 Introduction

Concentration indices measure inequality in one variable over the distribution of another (Kakwani, 1977). They are a particularly popular choice for the measurement of socioeconomic-related health inequality (Wagstaff et al., 1991; O'Donnell et al., 2008), as is evident from the 9,220 entries in Google Scholar with the keywords 'concentration index' and 'health'. In that case, the concentration index captures the extent to which health differs across individuals ranked by some indicator of socioeconomic status. A variety of concentration indices have been proposed to suit the measurement properties of the variable in which inequality is to be assessed and the assessor's ethical response to inequality (Wagstaff et al., 1991; Wagstaff, 2002; Wagstaff, 2005; Erreygers, 2009a; Erreygers and Van Ourti, 2011; Erreygers et al., 2012).

This article introduces the Stata command conindex which provides a simple unified means by which to estimate various concentration indices and their standard errors. It can be used to graph the concentration curves that underlie some of the indices and to test for differences in inequality across groups. It is designed to measure cross-sectional inequality in a cardinal variable over observations ranked by another variable that is at least ordinally measured. With repeated cross-section or panel data, one can use the command to compare inequality across periods. The command can also be used to estimate rank-dependent indices of univariate inequality, such as the Gini and generalized Gini.

Other user-written Stata commands are available to calculate some rank-dependent inequality indices. For example, concindc (Chen, 2007) computes the most standard version of the concentration index for both individual and grouped data. The Lorenz curve and associated indices of univariate inequality can be computed and decomposed with a range of commands (glcurve, ineqerr, ineqdeco, descogini) (Jenkins and van Kerm, 2007; Jolliffe and Krushelnytskyy, 1999; Jenkins, 2010; Lopez-Feldman, 2008). The comparative advantage of conindex is that it estimates a battery of concentration indices, allowing the analyst to select an index that is appropriate given the measurement properties of the variable of interest and is consistent with their normative principles concerning inequality. The indices are estimated by making use of the correspondence of each to a transformation of the covariance between the variable in which inequality is measured and rank in the distribution over which inequality is assessed. This so-called 'convenient covariance' approach (Kakwani, 1980; Jenkins, 1988; Kakwani et al., 1997) can be implemented with both individual and grouped data while taking account of the sample design.

Before explaining the command, we define the various inequality indices it can compute and offer some guidance regarding the context in which each index is suitable. The features of the command are illustrated through the analysis of wealth-related inequality in health and health care in Cambodia.

2 Standard and generalized concentration indices

The concentration curve is the bivariate analogue of the Lorenz curve. It plots the cumulative proportion of one variable against the cumulative proportion of the population ranked by another variable. To facilitate more concise exposition, we will mostly refer to the variable of interest as health and the ranking variable as income. Income-related health inequality can be assessed by plotting the cumulative proportion of health across individuals ranked from poorest to richest. Unlike the Lorenz curve, the concentration curve may lie above the 45° line if health, or more likely a measure of ill-health, is more heavily concentrated amongst those with lower incomes (as in the hypothetical example in Figure 1). The concentration index is twice the area between the concentration curve and the 45° line indicating no relationship between the two variables.^¹ It is defined as:

$C (h ∣ y) = \frac{2 cov (h_{i}, R_{i})}{\bar{h}} = \frac{1}{n} \sum_{i = 1}^{n} [\frac{h_{i}}{\bar{h}} (2 R_{i} - 1)]$

(1)

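where h_i is the variable in which inequality is measured (e.g. health) for individual i, $\bar{h}$ is its mean, R_i is the fractional rank of individual i in the distribution of the ranking variable y (e.g. income), and n is the number of individuals.^²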

Figure 1: Hypothetical concentration curve

C ranges from $\frac{1 - n}{n}$ (maximal pro-poor inequality i.e. all health is concentrated on the poorest individual) to $\frac{n - 1}{n}$ (maximal pro-rich inequality).^³
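Equation (1) and the bounds just stated can be checked numerically. The following is a minimal sketch in Python (not Stata, and not the conindex implementation); it assumes equal weights, no ties in the ranking variable, and a fractional rank of (position − 0.5)/n. The data and the function names are hypothetical.

```python
def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def concentration_index(h, y):
    """Weighted-sum form of equation (1)."""
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / mean_h * (2 * ri - 1) for hi, ri in zip(h, r)) / n


def concentration_index_cov(h, y):
    """Covariance form of equation (1), using the population covariance (divisor n)."""
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    mean_r = sum(r) / n
    cov = sum((hi - mean_h) * (ri - mean_r) for hi, ri in zip(h, r)) / n
    return 2 * cov / mean_h


income = [10, 20, 30, 40, 50]  # hypothetical ranking variable
health = [2, 4, 4, 6, 9]       # hypothetical variable of interest

c_sum = concentration_index(health, income)
c_cov = concentration_index_cov(health, income)

# Extreme cases: all health held by the poorest or by the richest individual.
c_pro_poor = concentration_index([1, 0, 0, 0, 0], income)
c_pro_rich = concentration_index([0, 0, 0, 0, 1], income)
print(c_sum, c_cov, c_pro_poor, c_pro_rich)
```

With n = 5 the two extreme cases reproduce the bounds (1 − n)/n = −0.8 and (n − 1)/n = 0.8, and the two forms of equation (1) agree because the mean of the fractional rank is 0.5.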

Inspection of equation (1) reveals that the concentration index can be interpreted as a weighted mean of (health) shares with the weights depending on the fractional (income) rank (2R_i – 1).^⁴ The Gini coefficient measure of univariate inequality arises as a special case of the concentration index when inequality is measured in the same variable that is used for ranking. This is true for all indices discussed in the remainder of this text, implying that the command introduced in this article can be used to estimate univariate inequality.

The concentration index measures relative inequality. It is invariant to equi-proportionate changes in the variable of interest (health). This relative invariance is a polar case of the many normative positions one might take in measuring inequality (Kolm, 1976). At the other extreme, absolute invariance corresponds to an inequality measure that is invariant to equal additions to health. Such a measure can be obtained through multiplication of the standard concentration index by the mean health leading to the generalized concentration index (Wagstaff et al., 1991).^⁵ Multiplication by the mean gives this parameter an important role in the assessment of absolute inequality. When two distributions display the same level of relative inequality, the one with the higher mean will correspond to greater absolute inequality. The generalized concentration index GC can be expressed as:

$GC (h ∣ y) = \frac{1}{n} \sum_{i = 1}^{n} [h_{i} (2 R_{i} - 1)]$

(2)

and ranges between $\bar{h} (\frac{1 - n}{n})$ (maximal pro-poor) and $\bar{h} (\frac{n - 1}{n})$ (maximal pro-rich).
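The contrast between relative and absolute invariance can be illustrated with a small Python sketch of equation (2). It assumes equal weights and no ties; the data and helper names are hypothetical and the code is not taken from conindex.

```python
def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def generalized_ci(h, y):
    """Generalized concentration index, equation (2)."""
    n = len(h)
    r = fractional_rank(y)
    return sum(hi * (2 * ri - 1) for hi, ri in zip(h, r)) / n


def standard_ci(h, y):
    """Standard concentration index: GC divided by the mean of h."""
    return generalized_ci(h, y) / (sum(h) / len(h))


income = [10, 20, 30, 40, 50]  # hypothetical ranking variable
health = [2, 4, 4, 6, 9]       # hypothetical variable of interest

gc = generalized_ci(health, income)
c = standard_ci(health, income)

# Equal addition of 10 units to everyone: GC is unchanged, C is not.
gc_shifted = generalized_ci([x + 10 for x in health], income)
c_shifted = standard_ci([x + 10 for x in health], income)

# Doubling everyone's health: C is unchanged, GC doubles.
gc_scaled = generalized_ci([2 * x for x in health], income)
c_scaled = standard_ci([2 * x for x in health], income)
print(gc, gc_shifted, gc_scaled, c, c_shifted, c_scaled)
```

The equal addition leaves GC unchanged because the rank weights (2R_i − 1) sum to zero.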

3 Taking account of measurement scale

The standard and generalized concentration indices are not necessarily invariant, or equivariant, under transformations of the variable of interest that are permissible for the level of measurement (i.e. nominal, ordinal, cardinal, ratio or fixed scale) (Erreygers and Van Ourti, 2011).^⁶ A number of variants of the standard and generalized concentration indices have been proposed for use with variables possessing different measurement properties. We differentiate between measurement levels at which permitted transformations affect the value of an index, and levels at which transformations to different scales affect inequality orderings. Both have received attention (e.g. Lambert and Zheng, 2011), but most applications focus on the former. The latter issue is, in our opinion, more important as it deals with whether one bivariate distribution is evaluated to display greater inequality than another, irrespective of an arbitrary scaling.

3.1 Measurement level

In bivariate inequality measurement, an ordinal scale is sufficient for the variable that is used for the ranking of individuals. Rank-dependent indices can then be deployed to quantify inequality in variables measured at three levels: ^⁷

Fixed: the measurement scale is unique with zero corresponding to a situation of complete absence e.g. number of visits to a hospital within a given period.
Ratio: the measurement scale is unique up to a proportional scaling factor with the zero point corresponding to a situation of complete absence e.g. life expectancy that could be measured in years, months etc.
Cardinal: the scale is such that differences between values are meaningful but ratios are not and the zero point is fixed arbitrarily e.g. temperature in Celsius or Fahrenheit, a (health) utility index.

For variables on a fixed scale, the standard and generalized concentration indices quantify inequality in the attribute of fundamental interest. Both are appropriate, with the choice between them depending on whether one is concerned about relative or absolute inequality. Changing the proportionality factor of a ratio-scaled variable will affect the value of the generalized concentration index, but not that of the standard concentration index.^⁸ The generalized concentration index should therefore be used with ratio-scaled data only when the variables compared in an inequality ordering are subject to the same scaling factor.^⁹ Only in this case can one be sure that the inequality ordering given by the index applied to the variable is informative of the ranking of populations by inequality in the attribute of essential interest. Alternatively, since the generalized concentration index is equivariant under a proportional transformation of the variable, if the differential scaling factors are known, then they can be used ex-post to make the indices comparable across populations.

When the variable of interest is cardinal, the standard concentration index is not necessarily invariant to arbitrary retransformations of the variable.^¹⁰ This can be addressed by using the modified concentration index (Erreygers and Van Ourti, 2011).

$MC (h ∣ y) = \frac{1}{n} \sum_{i = 1}^{n} [\frac{h_{i}}{\bar{h} - h^{\min}} (2 R_{i} - 1)]$

(3)

where h^min is the lower limit of h_i and the index ranges between $\frac{1 - n}{n}$ and $\frac{n - 1}{n}$ . Under ratio or fixed measurement scales (h^min = 0), equation (3) simplifies to the standard concentration index in equation (1).
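The invariance that motivates equation (3) can be checked with a short Python sketch. It assumes equal weights, no ties, and a hypothetical cardinal score with theoretical lower limit 1 that is re-expressed on another scale through a positive affine transformation (the lower limit is transformed in the same way). The names and data are illustrative only.

```python
def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def modified_ci(h, y, h_min):
    """Modified concentration index, equation (3); h_min is the theoretical lower limit."""
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / (mean_h - h_min) * (2 * ri - 1) for hi, ri in zip(h, r)) / n


income = [10, 20, 30, 40, 50]  # hypothetical ranking variable
score = [2, 4, 4, 6, 9]        # hypothetical cardinal variable
score_min = 1                  # hypothetical theoretical lower limit


def rescale(x):
    """A permissible transformation of a cardinal variable (positive affine)."""
    return 32 + 1.8 * x


mc = modified_ci(score, income, score_min)
mc_rescaled = modified_ci([rescale(x) for x in score], income, rescale(score_min))

# Setting h_min = 0 gives the standard index, which changes with the scale.
c = modified_ci(score, income, 0)
c_rescaled = modified_ci([rescale(x) for x in score], income, 0)
print(mc, mc_rescaled, c, c_rescaled)
```

The modified index takes the same value on both scales, whereas the standard index does not.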

The modified concentration index should be used when comparing inequality in an attribute using a variable that is inconsistently cardinalised for different populations, or when comparing inequality in different cardinally scaled variables in the same population. If the cardinalisation is constant, then inequality orderings made using the standard concentration index will be robust to the chosen cardinalisation (although the index values will depend on the specific cardinalisation chosen). Nevertheless, we advise using the modified index in this case too, as it allows for an easier interpretation: the range is always $[\frac{1 - n}{n}, \frac{n - 1}{n}]$ .

There is no easy modification to ensure the invariance of the generalized concentration index to retransformations of cardinal variables. However, provided the cardinalisation adopted across populations or variables is the same, then the inequality ordering will be robust to the chosen cardinalisation.

3.2 Bounded variables

Variables with a finite upper limit, such as years in school, a (health) utility index or any binary indicator, complicate the measurement of inequality.^¹¹ For instance, bounded variables can be represented either as attainments a_i ∈ [a^min , a^max ] or as shortfalls from the upper limit s_i = a^max − a_i . Erreygers (2009a) introduced the 'mirror' property that requires that the magnitude of measured inequality represented by the absolute value of an index should not depend on whether the index is computed over attainments or shortfalls, i.e. I (a) = −I(s).^¹²

The standard concentration index does not satisfy this condition: $C (s) = - \frac{\bar{a}}{\bar{s}} C (a)$ and hence inequality in attainments does not mirror inequality in shortfalls except when $\bar{a} = \bar{s}$ (Erreygers, 2009a).^¹³ Moreover, inequality orderings based on the standard concentration index might depend on whether one uses shortfalls or attainments. More generally, the mirror condition is incompatible with the measurement of relative inequality (Erreygers and Van Ourti, 2011; Lambert and Zheng, 2011). One must choose between satisfaction of the mirror condition and satisfaction of relative inequality invariance.^¹⁴

The generalized concentration index satisfies the mirror condition, GC (s) = −GC (a). However, as noted in section 3.1, the value of this index is not invariant to permissible transformations of ratio-scaled and cardinal variables. Erreygers (2009a) proposed a modification of the generalized concentration index that corrects this deficiency:

$E (a ∣ y) = \frac{1}{n} \sum_{i = 1}^{n} [\frac{4 a_{i}}{(a^{\max} - a^{\min})} (2 R_{i} - 1)] = - E (s ∣ y)$

(4)

This index ranges between −1 and +1.

Wagstaff (2005) noted that the range of the standard concentration index depends on the mean of the bounded variable and suggested rescaling the standard concentration index to ensure that it always lies in the range [−1, 1]:^¹⁵

$W (a ∣ y) = \frac{1}{n} \sum_{i = 1}^{n} [\frac{(a^{\max} - a^{\min}) a_{i}}{(a^{\max} - \bar{a}) (\bar{a} - a^{\min})} (2 R_{i} - 1)] = - W (s ∣ y)$

(5)

This index satisfies the mirror condition and so cannot be in line with the relative invariance criterion. Neither does it satisfy an absolute invariance criterion. In fact, the index is consistent with an inequality invariance condition consisting of a mixture of invariance with respect to proportionate changes in (i) attainments a_i and (ii) shortfalls s_i . This may be considered to have paradoxical implications (Erreygers and Van Ourti, 2011) although Kjellsson and Gerdtham (2013a) argue that this invariance criterion is actually intuitive when one realizes that W can be written as the difference between the standard concentration indices for attainments and shortfalls.

Unlike for unbounded variables, the precise scaling of bounded variables does not affect the value of any rank-dependent inequality index provided that the bounding is taken into account. This is most easily understood from the realization that any bounded variable can be retransformed into an indicator of the proportional deviation from the minimum value, $b_{i} = \frac{a_{i} - a^{\min}}{a^{\max} - a^{\min}}$ . This lies on the range [0, 1] and records only 'real' changes in the underlying attribute, not 'nominal' ones due to the choice of measurement scale. Under this transformation, the Erreygers and Wagstaff indices simplify respectively to $E (b ∣ y) = \frac{1}{n} \sum_{i = 1}^{n} [4 b_{i} (2 R_{i} - 1)]$ and $W (b ∣ y) = \frac{1}{n} \sum_{i = 1}^{n} [\frac{b_{i}}{(1 - \bar{b}) \bar{b}} (2 R_{i} - 1)]$ .
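The mirror property of equations (4) and (5), and its failure for the standard index, can be verified on a small example. The Python sketch below assumes equal weights, no ties, and a hypothetical attainment variable bounded on [0, 10]; shortfalls are s_i = a^max − a_i and share the same bounds. The function names are illustrative and not part of conindex.

```python
def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def rank_weighted_mean(x, y):
    """(1/n) * sum of x_i * (2 R_i - 1), the building block of every index here."""
    r = fractional_rank(y)
    return sum(xi * (2 * ri - 1) for xi, ri in zip(x, r)) / len(x)


def standard_ci(a, y):
    """Standard concentration index, equation (1)."""
    return rank_weighted_mean(a, y) / (sum(a) / len(a))


def erreygers(a, y, a_min, a_max):
    """Erreygers index, equation (4)."""
    return 4 * rank_weighted_mean(a, y) / (a_max - a_min)


def wagstaff(a, y, a_min, a_max):
    """Wagstaff index, equation (5)."""
    mean_a = sum(a) / len(a)
    scale = (a_max - a_min) / ((a_max - mean_a) * (mean_a - a_min))
    return scale * rank_weighted_mean(a, y)


income = [10, 20, 30, 40, 50]      # hypothetical ranking variable
attain = [1, 3, 4, 6, 8]           # hypothetical attainments on [0, 10]
A_MIN, A_MAX = 0, 10
short = [A_MAX - a for a in attain]  # shortfalls, also bounded on [0, 10]

e_a, e_s = erreygers(attain, income, A_MIN, A_MAX), erreygers(short, income, A_MIN, A_MAX)
w_a, w_s = wagstaff(attain, income, A_MIN, A_MAX), wagstaff(short, income, A_MIN, A_MAX)
c_a, c_s = standard_ci(attain, income), standard_ci(short, income)
print(e_a, e_s, w_a, w_s, c_a, c_s)
```

In this example E and W change sign but not magnitude when attainments are replaced by shortfalls, while C(s) equals −(ā / s̄) C(a) and so differs in magnitude from C(a).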

3.3 Summing up

The main message of this section is not that the most appropriate inequality index depends on the measurement properties of the variable of interest but that those properties partly determine the ethical choices one faces when quantifying inequality, which is intrinsically a normative exercise. When the variable of interest has an infinite upper bound on a fixed scale, the main normative choice is between absolute and relative invariance. Matters are more complicated when the measurement scale is not unique. Applying the generalized concentration index to a ratio or cardinal variable requires one to accept that the inequality ordering may depend on the scaling adopted. This can be avoided for the relative inequality invariance criterion if one replaces the standard concentration index with the modified one. When the variable has a finite upper bound, one should first choose between relative inequality invariance and the mirror condition. If one prioritises the relative invariance criterion (in attainments or shortfalls), then the standard concentration index or its modified version can be used. When priority is given to the mirror condition, one faces a choice between the Erreygers index, which focuses on absolute differences, and the Wagstaff index, which mixes concern for relative inequalities in attainments and relative inequalities in shortfalls.

If one considers no index to be normatively superior to all others, then one can check whether the inequality orderings are consistent across indices. If they are, all well and good. However, such robustness does not hold in general.

4 Incorporating alternative attitudes to inequality

As noted in section 2, the standard concentration index can be interpreted as a weighted mean of the variable of interest with each individual's weight depending on its fractional rank, i.e. (2R_i – 1). This weight equals zero for individuals with the median value of the ranking variable^¹⁶, and is negative (positive) for individuals below (above) the median. Presuming the ranking variable is income, the weight increases linearly from $\frac{1 - n}{n}$ for the poorest individual to $\frac{n - 1}{n}$ for the richest. This linearity is consistent with a particular attitude towards inequality that need not command widespread support. Two extensions based on non-linear weighting schemes can represent a variety of alternative ethical positions. The first approach makes it possible to vary the weight put on those at the top relative to those at the bottom of the distribution of the ranking variable. We refer to it as 'sensitivity to poverty' as it allows more (or less) weight to be placed on the poorest individuals when income is used as the ranking variable. The second approach allows more (or less) weight to be placed on the extremes of the ranking distribution (e.g. the very rich and very poor) vis-a-vis those in the middle. We term this approach 'sensitivity to extremity'.

4.1 Extended concentration index: 'sensitivity to poverty'

Kakwani (1980) and Yitzhaki (1983) proposed a flexible extension of the univariate Gini index that incorporates a distributional sensitivity parameter v specifying the attitude towards inequality within the weight defined by $1 - v {(1 - R_{i})}^{v - 1}$ . Pereira (1998) and Wagstaff (2002) suggested using the same weighting function in the context of the measurement of income-related health inequality. This results in an extended concentration index that is identical to the standard concentration index except for the weighting function:^¹⁷

$EC (h ∣ y; v) = \frac{1}{n} \sum_{i = 1}^{n} [\frac{h_{i}}{\bar{h}} (1 - v {(1 - R_{i})}^{v - 1})]$

(6)

The distributional sensitivity parameter must take a value greater than or equal to 1. Larger values place more weight on the poorest individuals (when income is the ranking variable). The weighting function equals zero for v = 1 and in that case gives an index of 0 regardless of the distribution of h, while v = 2 yields the standard concentration index. The extended concentration index ranges between 1 – v and 1^¹⁸ which suggests an intuitive interpretation of v as the distance between the weight given to the poorest and the richest individual. The weight given to the richest individual is always +1, while the weight given to the poorest individual becomes more negative for higher values of v. Hence, the weighting function in equation (6) is asymmetric around the individual with median income, unless v = 2. For v ≠ 2, the individual with median income does not have a weight of zero. The individual given a weight of zero will have a lower income than the one with the median income when v > 2.
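The special cases v = 1 and v = 2 of equation (6) can be confirmed with a short Python sketch. It assumes equal weights and no ties; the data and function names are hypothetical.

```python
def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def extended_weight(r, v):
    """Weight 1 - v * (1 - R)^(v - 1) used in equation (6)."""
    return 1 - v * (1 - r) ** (v - 1)


def extended_ci(h, y, v):
    """Extended concentration index, equation (6); requires v >= 1."""
    if v < 1:
        raise ValueError("v must be at least 1")
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / mean_h * extended_weight(ri, v) for hi, ri in zip(h, r)) / n


def standard_ci(h, y):
    """Standard concentration index, equation (1)."""
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / mean_h * (2 * ri - 1) for hi, ri in zip(h, r)) / n


income = [10, 20, 30, 40, 50]  # hypothetical ranking variable
health = [2, 4, 4, 6, 9]       # hypothetical variable of interest

ec_1 = extended_ci(health, income, 1)  # weights are all zero
ec_2 = extended_ci(health, income, 2)  # weights reduce to 2R - 1
ec_4 = extended_ci(health, income, 4)  # more weight on the poorest
c = standard_ci(health, income)

# Weights at the bottom and top of the ranking for v = 4.
w_poorest = extended_weight(0.0, 4)
w_richest = extended_weight(1.0, 4)
print(ec_1, ec_2, ec_4, c, w_poorest, w_richest)
```

The limiting weights 1 − v at R = 0 and +1 at R = 1 correspond to the range of the index stated in the text.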

4.2 Symmetric concentration index: 'sensitivity to extremity'

Erreygers et al. (2012) suggest extending the linear weighting function of the concentration index in such a way that two conditions are satisfied: (i) the individual with median income should play a pivotal role, obtaining a weight of zero, and (ii) the weights for the other individuals should be inversely symmetric around median income. The poorest and richest individual should have the same weight, but with an opposite sign. The second poorest and second richest should have the same weight with opposite signs, and so on. Under these conditions, varying attitudes towards inequality express one's 'sensitivity to extremity', i.e. whether one is merely concerned with differences in the variable of interest at the middle of the income distribution, or rather with differences between the extremes of the income distribution.

An index that satisfies both conditions and allows for varying degrees of 'sensitivity to extremity' depending on the value of β > 1, which is analogous to the parameter v in the extended index, is:^¹⁹

$S (h ∣ y; β) = \frac{1}{n} \sum_{i = 1}^{n} (\frac{h_{i}}{\bar{h}}) [β 2^{β - 2} {({(R_{i} - \frac{1}{2})}^{2})}^{\frac{β - 2}{2}} (R_{i} - \frac{1}{2})]$

(7)

If 1 < β < 2, more weight is placed on the middle of the income distribution, while for β > 2 the extremes are weighted more at the expense of the middle. If β=2, the symmetric index equals the standard concentration index. When β becomes very large, the symmetric index will be very similar to the range index which is only sensitive to the difference between the upper and lower end of the income distribution. This corresponds to one of the earliest measures of health inequality (e.g. Townsend and Davidson, 1982). The range of the symmetric index is $[- \frac{β}{2}; + \frac{β}{2}]$ , which provides an intuitive interpretation of the β parameter as the absolute deviation between the weight given to the poorest and richest individual.^²⁰
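The properties of the weighting function in equation (7) can be illustrated in Python. The sketch assumes equal weights and no ties, and writes the weight in the algebraically equivalent form β 2^(β−2) |R − 1/2|^(β−1) sign(R − 1/2), which avoids raising zero to a negative power at the median when β < 2. The data and function names are hypothetical.

```python
import math


def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def symmetric_weight(r, beta):
    """Weight in equation (7), rewritten as beta * 2^(beta-2) * |R - 1/2|^(beta-1) * sign."""
    d = r - 0.5
    return beta * 2 ** (beta - 2) * abs(d) ** (beta - 1) * math.copysign(1.0, d)


def symmetric_ci(h, y, beta):
    """Symmetric concentration index, equation (7); requires beta > 1."""
    if beta <= 1:
        raise ValueError("beta must exceed 1")
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / mean_h * symmetric_weight(ri, beta) for hi, ri in zip(h, r)) / n


def standard_ci(h, y):
    """Standard concentration index, equation (1)."""
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / mean_h * (2 * ri - 1) for hi, ri in zip(h, r)) / n


income = [10, 20, 30, 40, 50]  # hypothetical ranking variable
health = [2, 4, 4, 6, 9]       # hypothetical variable of interest

s_2 = symmetric_ci(health, income, 2)      # equals the standard index
s_low = symmetric_ci(health, income, 1.5)  # more weight on the middle
s_high = symmetric_ci(health, income, 3.5) # more weight on the extremes
c = standard_ci(health, income)

weights = [symmetric_weight(r, 3.5) for r in fractional_rank(income)]
print(s_2, s_low, s_high, c, weights)
```

The printed weights are inversely symmetric around the median rank and bounded in absolute value by β/2, consistent with the stated range of the index.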

The choice between the symmetric and extended indices is normative. The symmetric index gives equal weight (but with an opposite sign) to individuals that are equally far apart from the pivotal individual with median rank, while the extended index prioritizes the lower regions of the ranking (income) distribution. Applied to income-related health inequality, the symmetric index is increasingly sensitive to a change that raises the health of a richer individual and reduces that of a poorer individual by an equal magnitude the further those individuals are from the pivotal individual. In contrast, the extended concentration index will be increasingly sensitive the closer is the location of such a 'health transfer' to the bottom of the income distribution. Erreygers et al. (2012) argue that the symmetric index is more concerned about the association between income and health, while the extended concentration index puts priority on the income distribution, and only then analyzes health differences within the prioritized region of the income distribution.^²¹

4.3 Generalizing the extended and symmetric indices

Erreygers et al. (2012) consider counterparts of the extended and symmetric indices that satisfy the mirror condition. They refer to the resulting measures as generalized indices because they satisfy an absolute inequality invariance criterion and define these on the transformed bounded variable $b_{i} = \frac{a_{i} - a^{\min}}{a^{\max} - a^{\min}}$ :

$GEC (b ∣ y; v) = \frac{1}{n} \sum_{i = 1}^{n} (\frac{v^{\frac{v}{v - 1}}}{v - 1} b_{i}) [1 - v {(1 - R_{i})}^{v - 1}]$

(8)

$GS (b ∣ y; β) = \frac{1}{n} \sum_{i = 1}^{n} 4 b_{i} [β 2^{β - 2} {({(R_{i} - \frac{1}{2})}^{2})}^{\frac{β - 2}{2}} (R_{i} - \frac{1}{2})]$

(9)

with v ≥ 1 and β > 1. For v= β = 2, both indices simplify to the Erreygers index (4). Erreygers et al. (2012) show that both indices always range between −1 and +1.^²²
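The statement that equations (8) and (9) collapse to the Erreygers index when v = β = 2 can be checked numerically. The Python sketch below assumes equal weights, no ties, and a hypothetical variable bounded on [0, 10]; it restricts v to values above 1 because the scaling factor in equation (8) divides by v − 1. Function names are illustrative only.

```python
import math


def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def standardize(a, a_min, a_max):
    """b_i = (a_i - a_min) / (a_max - a_min), which lies in [0, 1]."""
    return [(ai - a_min) / (a_max - a_min) for ai in a]


def generalized_extended(b, y, v):
    """Generalized extended index, equation (8); this sketch requires v > 1."""
    if v <= 1:
        raise ValueError("this sketch requires v > 1")
    scale = v ** (v / (v - 1)) / (v - 1)
    r = fractional_rank(y)
    return sum(scale * bi * (1 - v * (1 - ri) ** (v - 1)) for bi, ri in zip(b, r)) / len(b)


def generalized_symmetric(b, y, beta):
    """Generalized symmetric index, equation (9); requires beta > 1."""
    if beta <= 1:
        raise ValueError("beta must exceed 1")
    r = fractional_rank(y)
    total = 0.0
    for bi, ri in zip(b, r):
        d = ri - 0.5
        # Equivalent to beta * 2^(beta-2) * ((d^2)^((beta-2)/2)) * d in equation (9).
        weight = beta * 2 ** (beta - 2) * abs(d) ** (beta - 1) * math.copysign(1.0, d)
        total += 4 * bi * weight
    return total / len(b)


def erreygers_b(b, y):
    """Erreygers index on the standardized variable: (1/n) * sum of 4 b_i (2 R_i - 1)."""
    r = fractional_rank(y)
    return sum(4 * bi * (2 * ri - 1) for bi, ri in zip(b, r)) / len(b)


income = [10, 20, 30, 40, 50]             # hypothetical ranking variable
b = standardize([1, 3, 4, 6, 8], 0, 10)   # hypothetical bounded variable on [0, 10]

gec_2 = generalized_extended(b, income, 2)
gs_2 = generalized_symmetric(b, income, 2)
e = erreygers_b(b, income)
gec_3 = generalized_extended(b, income, 3)
gs_3 = generalized_symmetric(b, income, 3)
print(gec_2, gs_2, e, gec_3, gs_3)
```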

5 Estimation and inference

Each of the rank-dependent inequality indices discussed above can be expressed as a transformation of the covariance between the variable of interest (h_i ) and the fractional rank (R_i ) of the ordering variable. For example, the standard concentration index is twice the covariance divided by the mean of the variable of interest (see equation (1)). Since the slope coefficient of a simple least squares regression is the covariance divided by the variance of the regressor, each inequality index can be obtained from a regression of a transformation of the variable of interest on the rank. For example, the standard concentration index is the least squares estimate of α₁ in the model:

$\frac{2 σ_{R}^{2}}{\bar{h}} h_{i} = α_{0} + α_{1} R_{i} + ϵ_{i}$

(10)

where $σ_{R}^{2}$ is the variance of R and ϵ_i is an error term. The standard error of the least squares estimate of α₁ serves as a standard error of the estimate of the concentration index.^²³

An advantage of this approach is that Stata readily allows for sampling weights, as well as robust and clustered standard errors. Appropriate rescalings of the dependent variable lead to the other indices considered in section 3.^²⁴ Erreygers et al. (2012) do not provide standard errors for the extended and symmetric indices and therefore standard errors for these indices are not reported.
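The regression approach can be sketched in Python by regressing (2σ_R² / h̄) h_i on R_i by ordinary least squares and comparing the slope with the direct calculation from equation (1). The sketch assumes equal weights, no ties, and hypothetical data; the standard error shown is the conventional homoskedastic OLS formula and is not claimed to match what conindex reports under its robust or cluster options.

```python
import math


def fractional_rank(y):
    """Fractional rank (position - 0.5) / n; assumes no ties and equal weights."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    rank = [0.0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = (position - 0.5) / n
    return rank


def convenient_regression(h, y):
    """OLS of (2 * var(R) / mean(h)) * h on R; returns (slope, conventional s.e.)."""
    n = len(h)
    r = fractional_rank(y)
    mean_h = sum(h) / n
    mean_r = sum(r) / n
    sxx = sum((ri - mean_r) ** 2 for ri in r)
    var_r = sxx / n  # population variance of the fractional rank
    z = [2 * var_r * hi / mean_h for hi in h]
    mean_z = sum(z) / n
    sxz = sum((ri - mean_r) * (zi - mean_z) for ri, zi in zip(r, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_r
    rss = sum((zi - intercept - slope * ri) ** 2 for ri, zi in zip(r, z))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def standard_ci(h, y):
    """Standard concentration index, weighted-sum form of equation (1)."""
    n = len(h)
    mean_h = sum(h) / n
    r = fractional_rank(y)
    return sum(hi / mean_h * (2 * ri - 1) for hi, ri in zip(h, r)) / n


income = [10, 20, 30, 40, 50]  # hypothetical ranking variable
health = [2, 4, 4, 6, 9]       # hypothetical variable of interest

slope, se = convenient_regression(health, income)
c = standard_ci(health, income)
print(slope, se, c)
```

The slope reproduces the concentration index exactly because the same divisor is used for the covariance and for the variance of the rank.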

A final note concerns ties in the ranking variable, which arise when different observations have the same value for the ranking variable. The conindex command accounts for this by calculating the fractional rank from the proportion of individuals with a given value of the ranking variable (y), such that $R_{i} = {[\sum_{k = 1}^{n} {sw}_{k}]}^{- 1} \{q (y_{i} - 1) + 0.5 [q (y_{i}) - q (y_{i} - 1)]\}$ where sw_i denotes the sampling weight of individual i and $q (y_{i}) = \sum_{k = 1}^{n} 1 (y_{k} \leq y_{i}) {sw}_{k}$ is the weighted number of individuals whose value of the ranking variable is at most y_i (Van Ourti, 2004). While conindex automatically adjusts for ties in computing the point estimate, it purposefully does not do so in generating the standard error. This is because two individuals with the same value of the ranking variable may or may not be entirely independent observations. If they are independent, one should not correct the standard errors, but if the observations are dependent, because they belong to the same household for example, then the cluster option should be used.^²⁵ The occurrence of ties in the ranking variable is similar to the case of grouped data estimation of the standard concentration index. With grouped data, one row in the data matrix will include group mean values of the variable of interest and the ranking variable, as well as the sample weight indicating the relative size of the group. One can apply conindex directly to such grouped data.
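The tie-adjusted fractional rank can be sketched in Python. The sketch interprets q(y_i − 1) as the weighted number of observations with a ranking value strictly below y_i; this reading is an assumption, as are the function name and the data, and the code is not the conindex implementation.

```python
def fractional_rank_ties(y, sw):
    """Tie-adjusted fractional rank with sampling weights.

    For each observation, the rank is the weighted share of observations
    strictly below its value plus half the weighted share tied at its value.
    """
    total = sum(sw)
    ranks = []
    for yi in y:
        below = sum(w for yk, w in zip(y, sw) if yk < yi)
        at_or_below = sum(w for yk, w in zip(y, sw) if yk <= yi)
        ranks.append((below + 0.5 * (at_or_below - below)) / total)
    return ranks


wealth = [1, 2, 2, 3]  # hypothetical ranking variable with a tie

# Equal weights: the two tied observations share the same rank.
r_equal = fractional_rank_ties(wealth, [1, 1, 1, 1])

# Unequal sampling weights (hypothetical).
weights = [2, 1, 1, 4]
r_weighted = fractional_rank_ties(wealth, weights)

# The weighted mean of the fractional rank is 0.5 in both cases.
mean_equal = sum(r_equal) / len(r_equal)
mean_weighted = sum(w * r for w, r in zip(weights, r_weighted)) / sum(weights)
print(r_equal, r_weighted, mean_equal, mean_weighted)
```

Without ties and with equal weights this rank reduces to (position − 0.5)/n. Grouped data can be handled in the same way by treating each row as a group and its relative size as the weight.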

6 The conindex command

6.1 Syntax

The syntax for conindex is:

conindex varname [if] [in] [fweight aweight pweight] [, rankvar(varname) robust cluster(varname) truezero generalized bounded limits(numlist) wagstaff erreygers v(#) beta(#) graph loud compare(varname) keeprank(string) ytitle(string) xtitle(string)]

by varlist: is allowed and can be used to calculate indices for groups defined by multiple variables.

6.2 Description

conindex computes a range of rank-dependent inequality indices, including the Gini coefficient, the concentration index, the generalized (Gini) concentration index, the modified concentration index, the Wagstaff and Erreygers normalised concentration indices for bounded variables, and the distributionally sensitive extended and symmetric concentration indices (and their generalized versions). There is no default index. Options define the index to be computed. (Generalized) Lorenz and (generalized) concentration curves can be obtained using the graph option. The default axis labels can be replaced by using the xtitle(string) and ytitle(string) options.

For unbounded variables (i.e. those with at least one infinite bound), the option truezero should be specified if the variable of interest is ratio-scale (or fixed) and has a zero lower limit, in which case the standard concentration index is calculated. If instead the variable of interest is cardinal (with the zero point fixed arbitrarily), then the theoretical lower limit must be specified using the limits(#) option where # is the minimum value. Note that one should not use the lowest value observed in the sample if this does not correspond to the theoretical lower bound. Specification of this option results in calculation of the modified concentration index.

The generalized concentration index derives from specifying the option generalized in conjunction with the option truezero.

For bounded variables (i.e. those with both a finite lower and upper bound) the option bounded can be specified in conjunction with limits(#1 #2) where #1 and #2 denote the theoretical minimum and maximum values of the variable of interest. The inequality indices are then calculated based on the standardized version of the variable of interest, h* = ((h − #1) / (#2 − #1)), and hence will be scale invariant.

The normalised concentration indices proposed by Wagstaff (2005) and Erreygers (2009a) may be obtained by specifying the wagstaff and erreygers option respectively in conjunction with the bounded and limits(#1 #2) options.

When a ranking variable is not provided using the rankvar option, conindex defaults to use varname to rank observations, leading to the calculation of uni-dimensional inequality indices (e.g. the Gini coefficient).

The extended concentration index is computed with the addition of the options truezero and v(#), where # is the distributional sensitivity parameter. With v(2) the extended concentration index is equivalent to the standard concentration index.

The symmetric concentration index is obtained with the options truezero and beta(#). With beta(2) the symmetric concentration index is equivalent to the standard concentration index.

The generalized versions of the extended and symmetric concentration indices are obtained by combining the option v(#) or beta(#), respectively, with the options truezero and generalized.

Robust and cluster-corrected standard errors can be obtained with the usual options.

The value of an index can be compared across groups defined by a single variable (e.g. urban), and the null of homogeneity tested, using the compare option. The prefix bys varlist: can be used to calculate the indices for groups defined by multiple variables (e.g. urban and hhsize).

The fractional rank may be preserved using the option keeprank(string) where string is the name given to the rank variable created.

6.3 Options

rankvar(varname) specifies the variable by which individuals are ranked. It must be at least an ordinal variable. When a ranking variable is not provided using the rankvar option, conindex defaults to using varname to rank observations, leading to the calculation of uni-dimensional inequality indices (e.g. the Gini coefficient).

cluster(varname) requests standard errors that allow for intragroup correlation.

robust requests Huber/White/sandwich standard errors.

truezero declares that the variable of interest is ratio-scaled (or fixed), leading to computation of the standard concentration index.

generalized requests the generalized concentration (Gini) index measuring absolute inequality. This option can only be used in conjunction with truezero.

v(#) requests the extended concentration index be computed. This option can only be used in conjunction with truezero. With v(2) the standard concentration index is computed. If the options v(#), truezero and generalized are specified, one obtains the generalized extended concentration index. In the latter case, with v(2), the generalized extended concentration index simplifies to the Erreygers index.

beta(#) requests the symmetric concentration index be computed. This option can only be used in conjunction with truezero. With beta(2) the standard concentration index is computed. If the options beta(#), truezero and generalized are specified, one obtains the generalized symmetric concentration index. In the latter case, with beta(2), the generalized symmetric concentration index simplifies to the Erreygers index.

bounded specifies that the dependent variable is bounded. This option must be used in conjunction with the limits option.

limits(#1 #2) must be used to specify the theoretical minimum (#1) and maximum (#2) for bounded variables. If the options bounded and truezero are not specified then limits(#1) should be used to specify the minimum value to obtain the modified concentration index.

wagstaff in conjunction with bounded and limits(#1 #2) requests the Wagstaff index.

erreygers in conjunction with bounded and limits(#1 #2) requests the Erreygers index.

graph requests that a concentration curve be displayed. If no ranking variable is specified, a Lorenz curve is produced. In conjunction with generalized, one obtains the generalized Lorenz or concentration curve.

loud shows the output from the regression used to generate the inequality indices.

keeprank(string) creates a new variable which contains the fractional ranks, where string is the name of the variable to be created. When used in conjunction with the compare option, the variable string will contain the fractional rank for the full sample and the suffix k is added to string to indicate the fractional rank for group k.

compare(varname) computes indices specific to groups specified by varname. Two tests of the null hypothesis of equality of the index values across groups are produced: an F-test that is valid in small samples but requires an assumption of equal variances across groups (Chow, 1960) and a z-test that relaxes the assumption of equal variances but is valid only in large samples (Clogg et al., 1995). If varname is not binary, then only the F-test is given.

6.4 Saved results

Scalars

r(N)	Number of observations
r(Nunique)	Number of unique observations for rankvar
r(CI)	Concentration index
r(CIse)	Standard error of concentration index
r(SSE_unrestricted)	Unrestricted sum of squared errors (with compare option)
r(SSE_restricted)	Restricted sum of squared errors (with compare option)
r(F)	F-statistic for joint hypothesis that concentration index is same for all groups (with compare option)
r(CI0)	Concentration index for group 0 (with compare option if only two groups)
r(CI1)	Concentration index for group 1 (with compare option if only two groups)
r(CIse0)	Standard error of concentration index for group 0 (with compare option if only two groups)
r(CIse1)	Standard error of concentration index for group 1 (with compare option if only two groups)
r(Diff)	Difference in concentration index between groups (with compare option if only two groups)
r(Diffse)	Standard error of difference in concentration index between groups (with compare option if only two groups)
r(z)	z-statistic for hypothesis that concentration index is same for both groups (with compare option if only two groups)
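The saved results can be picked up directly after estimation. The following is a minimal sketch, assuming conindex is installed and the household dataset described in section 7 is in memory. The scalar names ci_est, ci_se, ci_lo and ci_hi are hypothetical, and the normal-approximation interval is an assumption made here for illustration rather than something conindex reports.

```stata
* Estimate the standard concentration index, then list what conindex left in r()
conindex healthexp [aweight=sampweight_hh], rankvar(wealthindex) truezero cluster(PSU)
return list

* Copy the results before another r-class command overwrites them
scalar ci_est = r(CI)
scalar ci_se  = r(CIse)

* Normal-approximation 95% interval (illustrative choice, not conindex output)
scalar ci_lo = scalar(ci_est) - invnormal(0.975)*scalar(ci_se)
scalar ci_hi = scalar(ci_est) + invnormal(0.975)*scalar(ci_se)
display "CI = " %6.4f scalar(ci_est) "   95% interval: [" %6.4f scalar(ci_lo) ", " %6.4f scalar(ci_hi) "]"
```

Wrapping the names in scalar() avoids any clash with variables of the same (or an abbreviated) name in the dataset.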

7 conindex: example applications

We illustrate the functionality of conindex through examples using data from the 2010 Demographic and Health Survey (DHS) of Cambodia, which can be obtained from http://www.dhsprogram.com/. The Cambodian DHS covers a representative sample of women aged between 15 and 49 years. It asks each participating woman about her pregnancies in the last ten years and also collects information at the household level. We construct a dataset of households to estimate inequality in the distribution of health care expenditures and a dataset of births to estimate inequality in infant mortality. Inequality in each variable is examined in relation to a wealth index (wealthindex) that is obtained from a principal components analysis of the households' possession of a battery of assets and durables, as well as housing materials (Filmer and Pritchett, 2001). This index has an ordinal interpretation and is used as the ranking variable.

In the household dataset, we construct a measure of health care expenditure per capita (healthexp) by summing out-of-pocket medical spending in the last month across individuals within the household and dividing by household size.^²⁶ This measure will serve as an example of an unbounded variable with ratio-scale. From the child dataset we construct a binary indicator of infant mortality (u1mr) that indicates whether each child born during the last 10 years did not survive to its first birthday.^²⁷ Results below indicate that around 6 percent of children die within a year of birth. Average per capita monthly health expenditure is about 12,000 riel (€2.40), but the median value of 0 riel and the high maximum of 14,500,000 riel show that the distribution of health expenditures is right skewed.

sum u1mr [aweight= sampweight]

Variable	Obs	Weight	Mean	Std. Dev.	Min	Max
u1mr	14598	14588.3669	.0606938	.238776	0	1

sum healthexp [aweight=sampweight_hh]

Variable	Obs	Weight	Mean	Std. Dev.	Min	Max
healthexp	15667	75391.2524	12010.62	116693	0	1.45e+07

Figure 2 shows that more than 8 percent of Cambodian children in the lowest wealth quintile group die before they reach their first birthday. This is more than three times greater than the rate of infant mortality in the richest wealth quintile group. The mortality rate declines, but not monotonically, in moving to higher wealth groups. Health expenditure rises from around 7,000 riel per capita in the lowest wealth quintile group to more than 22,500 riel in the top group.

Figure 2. Infant mortality rate (left panel) and mean health care expenditure per capita (right panel) over wealth quintiles in Cambodia DHS, 2010

xtile wealthquint = wealthindex [pweight=sampweight], n(5)
xtile wealthquint_hh = wealthindex [pweight=sampweight_hh], n(5)
graph bar (mean) u1mr [pweight= sampweight], over(wealthquint)
graph bar (mean) healthexp [pweight= sampweight_hh], over(wealthquint_hh)

Table 1 summarizes all of the indices discussed below. The concentration index of 0.248 confirms that medical spending is heavily concentrated among better-off sample households, identified by a higher position in the wealth index distribution. As well as the point estimate, conindex returns a cluster-adjusted standard error and a p-value for a test that the index equals zero. In this example, the null is strongly rejected (p < 0.001). conindex can be used to graph a concentration curve by adding the option graph.^²⁸

Table 1

Concentration indices estimated from the Cambodian Demographic and Health Survey, 2010

	Health expenditure (healthexp)		Infant mortality (u1mr)		Infant survival (u1sr)
Standard concentration index (C)	0.2479 (0.0725)		−0.1889 (0.0255)		0.0122 (0.0016)
Generalized concentration index (GC)	2,977 (870)		−0.0115 (0.0015)		0.0115 (0.0015)
Erreygers Index (E)			−0.0459 (0.0062)		0.0459 (0.0062)
Wagstaff Index (W)			−0.2011 (0.0271)		0.2011 (0.0271)
v, β	1.5	5	1.5	5	1.5	5
Extended concentration index (EC(v))	0.1696	0.3818	−0.1201	−0.3187	0.0078	0.0206
Symmetric concentration index (SC(β))	0.2057	0.3943	−0.1683	−0.2492	0.0109	0.0161
Generalized extended concentration index (GEC(v))			−0.0492	−0.0362	0.0492	0.0362
Generalized symmetric concentration index (GSC(β))			−0.0409	−0.0605	0.0409	0.0605

conindex healthexp [aweight=sampweight_hh], rankvar(wealthindex) truezero cluster(PSU) graph ytitle(cumulative share of healthexp) xtitle(rank of wealthindex)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
CI	15667	.24786719	.07246288	0.0007

Figure 3 shows that the direction of inequality in health expenditures is unambiguous. The concentration curve lies below the diagonal throughout, indicating greater spending by those ranked higher according to the wealth index.

Figure 3. Concentration curve for out-of-pocket health care expenditure per capita against wealth index rank, Cambodia (DHS, 2010)

As health care expenditures are unbounded and measured on a ratio-scale, this estimate is robust to the proportionality factor arising from the choice of currency and can be used to rank inequality in medical spending in Cambodia against inequalities in other ratio-scale variables (e.g. food expenditures) or health expenditure inequality in other countries. If one prefers that the measure of inequality in medical spending respect absolute invariance rather than relative invariance, then the generalized concentration index can be requested by:

conindex healthexp [aweight=sampweight_hh], rankvar(wealthindex) generalized truezero cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
generalized CI	15667	2977.0381	870.32394	0.0007

This gives an estimate of around 2,977 riel, which is obviously sensitive to the proportionality factor and cannot be used directly to compare inequality in medical spending across countries with different currencies.^²⁹
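As a quick consistency check, and assuming the usual relation $GC = \bar{h}\,C$ (which follows from the covariance expressions in footnote 8), the generalized index and its standard error can be recovered from the weighted mean of healthexp reported in the summary statistics and the standard index estimated earlier:

```latex
\[
GC = \bar{h}\,C \approx 12010.62 \times 0.24786719 \approx 2977.04,
\qquad
se(GC) \approx 12010.62 \times 0.07246288 \approx 870.32
\]
```

Both figures agree with the conindex output above up to rounding of the mean. The simple rescaling of the standard error reflects the fact that the mean is treated as fixed (see footnote 23).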

The standard concentration index of infant mortality is negative, indicating that infant deaths are concentrated among less wealthy households. The index for infant survival (u1sr) is correspondingly positive but differs greatly in absolute value from the index for mortality, confirming that the mirror property does not hold and reflecting imposition of relative invariance with respect to different variables (see also section 3.2). Given that the standard concentration index is insensitive to a proportional transformation of the variable of interest, the value used to indicate presence of a characteristic, e.g. death=1, is irrelevant provided the value used to indicate absence of that characteristic is fixed at zero.^³⁰

conindex u1mr [aweight= sampweight], rankvar(wealthindex) truezero cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
CI	14598	−.18890669	.02546028	0.0000

conindex u1sr [aweight= sampweight], rankvar(wealthindex) truezero cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
CI	14598	.01220632	.00164513	0.0000

The generalized concentration indices of mortality and survival are equal in absolute value (see Table 1), confirming that this index satisfies the mirror condition when it is applied to a binary variable. This is because the generalized concentration index for a binary variable equals one fourth of the Erreygers index, which possesses the mirror property. But the generalized concentration index does not satisfy this condition in general. The Erreygers index is computed by specifying the eponymous option, along with two further options that indicate the variable is bounded and how it is coded.

conindex u1mr [aweight= sampweight], rankvar(wealthindex) erreygers bounded limits(0 1) cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Erreygers normalised CI	14598	−.04586189	.00618113	0.0000

conindex u1sr [aweight= sampweight], rankvar(wealthindex) erreygers bounded limits(0 1) cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Erreygers normalised CI	14598	.04586189	.00618113	0.0000

The Wagstaff index of infant mortality, which as explained above has different normative underpinnings, is computed by simply specifying 'wagstaff' in place of 'erreygers'.

conindex u1mr [aweight= sampweight], rankvar(wealthindex) wagstaff bounded limits(0 1) cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Wagstaff normalised CI	14598	−.20111301	.02710541	0.0000

conindex u1sr [aweight= sampweight], rankvar(wealthindex) wagstaff bounded limits(0 1) cluster(PSU)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Wagstaff normalised CI	14598	.20111301	.02710541	0.0000

The value of the Wagstaff index is close to that of the standard concentration index because the prevalence of infant deaths, at 6.1 percent, is close to zero. The index therefore places greater weight on relative invariance with respect to presence of the characteristic (here, death) and comes closer to the normative principle imposed by the standard concentration index. If the prevalence were 50 percent, then the Wagstaff index would give equal weight to relative invariance in attainments and shortfalls, which coincides with absolute invariance. In that case, its value would equal that of the Erreygers index (see Kjellsson and Gerdtham (2013) for more discussion).
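The relationships between these indices can be checked by hand for the binary mortality indicator. The sketch below assumes the usual expressions E = 4μC/(max − min) and W = μ(max − min)C/((max − μ)(μ − min)), where μ is the weighted mean, and uses the fact that survival is one minus mortality. It needs no data, only the estimates printed above. All scalar names are hypothetical.

```stata
* Inputs copied from the output above: standard CI of u1mr and its weighted mean
scalar c_mort = -.18890669
scalar mu     = .0606938
scalar amin   = 0
scalar amax   = 1

* Erreygers and Wagstaff indices implied by the standard index
scalar e_idx = 4*scalar(mu)*scalar(c_mort)/(scalar(amax) - scalar(amin))
scalar w_idx = scalar(c_mort)*scalar(mu)*(scalar(amax) - scalar(amin)) / ///
               ((scalar(amax) - scalar(mu))*(scalar(mu) - scalar(amin)))

* Standard index of survival (1 - u1mr): same covariance with rank, opposite sign, mean 1 - mu
scalar c_surv = -scalar(c_mort)*scalar(mu)/(1 - scalar(mu))

display "Erreygers index (mortality):        " %10.7f scalar(e_idx)
display "Wagstaff index (mortality):         " %10.7f scalar(w_idx)
display "Standard index (survival):          " %10.7f scalar(c_surv)
```

Run as a do-file, this should reproduce the reported values of the Erreygers index, the Wagstaff index and the standard index for u1sr up to rounding of the mean. It also makes the failure of the mirror property visible: the survival index is scaled by μ/(1 − μ) relative to the mortality index.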

The bottom panel of Table 1 presents estimates of concentration indices with alternative attitudes to inequality to those underlying the standard concentration index. Setting the parameter v of the extended concentration index to 1.5 places relatively more weight on those residing in wealthier households, while setting the parameter to 5 gives more weight to the poorer observations. A value of 2 corresponds to the weighting implicit in the standard concentration index and so would result in an estimate equal to that of C. We use the same values for the β parameter of the symmetric index, where β = 1.5 corresponds to the case where more weight is placed on the middle of the wealth distribution, while β = 5 corresponds to a case where the extremes of the wealth distribution are more heavily weighted.^³¹

The indices are computed as follows:

conindex varlist [aweight= sampweight], rankvar(wealthindex) truezero v(#) cluster(PSU)
conindex varlist [aweight= sampweight], rankvar(wealthindex) truezero beta(#) cluster(PSU)

It is important to emphasise that little can be learnt from comparing extended indices computed for different values of v.^³² Rather, one might check whether an inequality-ordering across populations is robust to the choice of the value of v (β). If it is not, then a conclusion that a variable is more unequally distributed in one population than another needs to be made conditional on an explicit attitude towards inequality.
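A robustness check of this kind can be sketched with a simple loop. The example assumes conindex is installed, the births dataset is in memory, and that the compare option can be combined with v(#) and leaves the group-specific estimates in r(CI0) and r(CI1), as listed under saved results for the two-group case. Which group each scalar refers to should be confirmed against the printed output.

```stata
* Does the ordering of the two groups by inequality in u1mr depend on v?
foreach v of numlist 1.5 2 5 {
    quietly conindex u1mr [aweight=sampweight], rankvar(wealthindex) truezero v(`v') cluster(PSU) compare(urban)
    display "v = `v'   group 0: " %8.4f r(CI0) "   group 1: " %8.4f r(CI1)
}
```

If the sign of the difference between the two columns changes as v varies, the ranking of the groups depends on the attitude to inequality. The same loop with beta(`v') in place of v(`v') would cover the symmetric index.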

Generalized extended and symmetric indices are computed by simply adding the option generalized to the command lines immediately above. As is clear from the estimates in Table 1, these indices satisfy the mirror condition.^³³

conindex varlist [aweight= sampweight], rankvar(wealthindex) generalized truezero v(#) cluster(PSU)
conindex varlist [aweight= sampweight], rankvar(wealthindex) generalized truezero beta(#) cluster(PSU)

conindex allows estimates of all inequality indices to be compared across groups defined by a binary or categorical variable and it tests the null of equality across groups. This is done by including the compare option. For example, to compare wealth-related inequality in infant mortality across urban and rural locations, we can use:

conindex u1mr [aweight= sampweight], rankvar(wealthindex) erreygers bounded limits(0 1) cluster(PSU) compare(urban)

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Erreygers normalised CI	14598	−.04586189	.00618113	0.0000

For groups:

CI for group 1: urban = 0

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Erreygers normalised CI	10969	−.02985274	.00724954	0.0000

CI for group 2: urban = 1

Index:	No. of obs.	Index value	Robust Std. Error	p-value
Erreygers normalised CI	3629	−.02869979	.01006162	0.0048

Test for statistically significant differences with Ho: diff=0 (assuming equal variances)

F - stat = .66985873

p-value= 0.4131

Test for statistically significant differences with Ho: diff=0 (large sample assumed)

Difference = .00115295

Standard error = .01240129

z-stat= 0.09

p-value= 0.9259

The index estimated from the combined sample is displayed first. Then the group-specific estimates are given. There is a significant concentration of infant mortality among the least wealthy in both rural and urban locations. The point estimates suggest that the degree of inequality is greater in rural areas, but the difference with urban areas is small. Both tests fail to reject the null hypothesis that the index is the same in rural and urban locations.
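The large-sample test can be traced by hand from the group-specific output. The sketch below assumes the difference is taken as urban minus rural, that its standard error treats the two group estimates as independent, and that the p-value comes from the standard normal distribution. These assumptions are consistent with the figures printed above but are not taken from the conindex source code. The scalar names are hypothetical and no data are needed.

```stata
* Group estimates and standard errors copied from the compare(urban) output
scalar b_rural  = -.02985274
scalar se_rural = .00724954
scalar b_urban  = -.02869979
scalar se_urban = .01006162

* Difference, its standard error under independence, z-statistic and two-sided p-value
scalar d_hat  = scalar(b_urban) - scalar(b_rural)
scalar d_se   = sqrt(scalar(se_rural)^2 + scalar(se_urban)^2)
scalar z_stat = scalar(d_hat)/scalar(d_se)
scalar p_val  = 2*(1 - normal(abs(scalar(z_stat))))

display "Difference     = " %10.8f scalar(d_hat)
display "Standard error = " %10.8f scalar(d_se)
display "z-stat = " %5.2f scalar(z_stat) "   p-value = " %6.4f scalar(p_val)
```

The displayed values should agree with the Difference, Standard error, z-stat and p-value reported by conindex, up to rounding.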

8 Concluding remarks

This article introduces the user-written Stata command conindex, which calculates rank-dependent inequality indices while offering a great deal of flexibility in taking account of measurement scale and alternative attitudes to inequality. Estimation and inference are via a regression approach that can allow for sampling design, misspecification and grouped data, as well as testing for differences in inequality across populations.

Concentration indices are in frequent use, particularly for the measurement of inequality in health by socioeconomic status. The indices estimated for different regions, periods or groups could also be included in regression analyses as control variables. We hope that the greatly reduced computational cost offered by conindex will afford researchers the time to give greater consideration to their choice of index, ensuring that the one selected is appropriate for the scale of measurement and is consistent with the normative position they are prepared to defend.

Acknowledgments

Owen O'Donnell and Tom Van Ourti acknowledge support from the National Institute on Ageing, under grant R01AG037398. We thank Ellen Van de Poel for assistance with the DHS data. The usual caveats apply and all remaining errors are our responsibility.

11 About the authors

Owen O'Donnell is a Professor of Applied Economics in the Erasmus School of Economics at Erasmus University Rotterdam, a Research Fellow of the Tinbergen Institute and an Associate Professor at the University of Macedonia (Greece).

Stephen O'Neill (corresponding author) is a Research Fellow in Health Economics in the Department of Health Services Research and Policy at the London School of Hygiene and Tropical Medicine.

Tom Van Ourti is a Professor of Applied Health Economics in the Erasmus School of Economics at Erasmus University Rotterdam and a Research Fellow of the Tinbergen Institute.

Brendan Walsh is a Research Fellow in Health Economics in the School of Health Sciences, and the City Health Economics Centre at City University London.

Footnotes

¹A concentration index of zero can arise either because health does not vary with income rank or because the concentration curve crosses the 45° line and pro-poor inequality in one part of the income distribution is exactly offset by pro-rich inequality in another part of the distribution.

²The fractional rank varies between $\frac{1}{2 n}$ and $1 - \frac{1}{2 n}$ if there are no ties. In the case of ties, it equals the mean fractional rank of those individuals with the same value for y_i .

³If given a welfare interpretation, then C respects the principle of income-related health transfers - social welfare falls when the health of a lower income individual is reduced and the health of a higher income individual is raised by the same magnitude (Bleichrodt and van Doorslaer, 2006).

⁴ C respects the principle of income-related health transfers. This is analogous to the principle of transfers for G and requires that a health transfer from a low to a high income individual will lower social welfare (Bleichrodt and Van Doorslaer, 2006).

⁵The graphical representation of the generalized concentration index corresponding to Figure 1 is the generalized concentration curve. According to the generalized concentration dominance criterion (Shorrocks, 1983), a distribution that has a higher mean cannot be dominated by one with a lower mean. But the ordering of two distributions by their generalized concentration indices does not necessarily correspond to their ordering by their means.

⁶A function f is invariant under transformation g if f (g (x)) = f (x). A function is equivariant if f(g(x)) = ∣∂g/∂x∣f(x).

⁷Measurement of inequality in nominal and ordinal variables is not feasible using rank-dependent indices (Erreygers and Van Ourti, 2011).

⁸Assuming that h_i = βx_i , one obtains $C (h \mid y) = 2 cov (\frac{h_{i}}{\bar{h}}, R_{i}) = 2 cov (\frac{β x_{i}}{β \bar{x}}, R_{i}) = C (x \mid y)$ and $GC (h \mid y) = 2 cov (h_{i}, R_{i}) = 2 cov (β x_{i}, R_{i}) = β GC (x \mid y)$.

⁹We assume monotone transformations and, hence, the proportional scaling factor is positive.

¹⁰We restrict attention to positive linear transformations, i.e. β must be positive in h_i = α + βx_i .

¹¹For discussion of the issues, particularly in relation to the measurement of health inequality, see Clarke et al., 2002; Wagstaff, 2005; Erreygers, 2009a,b,c; Wagstaff, 2009; Erreygers and Van Ourti, 2011a-b; Wagstaff, 2011a-b; Kjellsson and Gerdtham, 2013a-b.

¹²Lambert and Zheng (2011) suggested a weaker condition requiring that the inequality ordering of populations by attainments is strictly the reverse of that by shortfalls.

¹³The same holds for the modified concentration index.

¹⁴Bosmans (2013) shows how one can overcome the impossibility of satisfying both conditions if one allows for different functional forms for the inequality index for attainments and the inequality index for shortfalls.

¹⁵Wagstaff (2005) focused on the binary case.

¹⁶ R_i for this individual is: $\frac{2 i - 1}{2 n} = \frac{2 (\frac{n + 1}{2}) - 1}{2 n} = \frac{1}{2}$ , hence the weight is 2(½) − 1 = 0

¹⁷When one uses a finite number of observations to calculate the extended concentration index and v ≠ 2, a small-sample bias arises. Erreygers et al. (2012) develop an alternative way to calculate the extended concentration index (and its generalized version) to address this small-sample bias. Their approach is applied in the command conindex. We refer the reader to the appendix in Erreygers et al. (2012) for more details.

¹⁸ EC ranges between 1 – v and 1 when n → +∞. For a finite value of n, the lower and upper limit of EC are $1 - v {(\frac{2 n - 1}{2 n})}^{v - 1}$ and $1 - v {(\frac{1}{2 n})}^{v - 1}$ .

¹⁹Similar to the extended concentration index, a small-sample bias arises when β ≠ 2 (see footnote 17). We have implemented the approach explained in the appendix of Erreygers et al. (2012) in the command conindex. Note that the approach is also used for the generalized version.

²⁰This range is only entirely correct when n → +∞. For a finite number of observations, the range equals $[β 2^{β - 2} {[{(\frac{1 - n}{2 n})}^{2}]}^{\frac{β - 2}{2}} (\frac{1 - n}{2 n}); β 2^{β - 2} {[{(\frac{n - 1}{2 n})}^{2}]}^{\frac{β - 2}{2}} (\frac{n - 1}{2 n})]$ .

²¹Equations (6) and (7) reveal that the symmetric and extended concentration indices consider health shares, and hence are sensitive to relative health differences. The 'absolute' counterparts of these indices have not explicitly been introduced by Erreygers et al. (2012) (or Pereira (1998) and Wagstaff (2002) in the case of the extended concentration index), but are trivially derived by replacing the health shares by the health levels. Similarly, the measurement scale of unbounded variables is important for the extended and symmetric indices, but the discussion essentially mimics that in section 3.1. Modifications such as in sections 3.1 and 3.2 can be derived. Since these modifications are not integrated in the conindex command, we do not discuss these indices explicitly.

²²Extensions of these indices that simultaneously satisfy the mirror condition and the inequality invariance criterion underlying the Wagstaff index are not discussed since these have not been introduced in the literature before. However, in principle, it would be feasible to derive such indices.

²³ conindex does not take account of the sampling variability of the estimate of the mean of the variable of interest used in constructing the dependent variable of the regression. Typically, this makes very little difference to the standard error of an estimated concentration index. The command calculates the population formula for $σ_{R}^{2}$ , and not the sampling formula, i.e. there is no degrees of freedom correction. The command does not implement the approach of Kakwani et al. (1997) to account for serial correlation because this approach has not been extended to also allow for sample design issues such as clustering. For more details on these issues, see chapter 8 of O'Donnell et al. (2008).

²⁴One should replace $\frac{2 σ_{R}^{2}}{\bar{h}}$ in equation (10) by $2 σ_{R}^{2}$ for GC, by $\frac{2 σ_{R}^{2}}{\bar{h} - h^{\min}}$ for MC, $\frac{8 σ_{R}^{2}}{a^{\max} - a^{\min}}$ for E, and $\frac{2 σ_{R}^{2} (a^{\max} - a^{\min})}{(a^{\max} - \bar{a}) (\bar{a} - a^{\min})}$ for W.

²⁵Where ties occur between observations in different clusters, clustered standard errors may be unstable since they are obtained from a regression at the group level with groups defined by the unique values of the ranking variable.

²⁶For each ill or injured household member, the respondent was asked to state the costs expended for transportation and treatment for each visit to a health care provider (for up to 3 visits and without differentiating between outpatient and inpatient care). These costs were reported only for living people who had been ill or injured during the last month and did not include costs incurred for people who had died in the 30 days preceding the interview.

²⁷For the summary statistics of health care expenditures, this implies that we consider the individual as the ultimate unit of observation, even though expenditures are measured at the household level. For the concentration indices of health expenditures, it also implies that household size will influence the fractional ranks. As shown by Ebert (1997), and illustrated by Decoster and Ooghe (2003) on income data, this has important normative implications in terms of the axioms of anonymity and principle of transfer (both of which would be violated if each household were weighted equally independently of its size). In practice, this means we report estimates based on 15,667 observations, but application of sampweight_hh ensures these are representative of 75,391 individuals.

²⁸When graphing, conindex defaults to using the variable label or, if unavailable, the variable name when labelling the axes. This can be overridden by specifying the xtitle( ) and ytitle( ) options with the desired axis labels inside the parentheses. When rankvar() is not specified, conindex draws the Lorenz curve. Note also that the generalized concentration (Lorenz) curve will be drawn when the generalized option is also specified.

²⁹The univariate Gini and generalized Gini index are obtained by omitting the rankvar() option in the conindex command.

³⁰Consult Erreygers and Van Ourti (2011a-b) and Wagstaff (2011a-b) for some discussion on this issue.

³¹There is no particular reason to choose the same values for v and β. Our reason for doing so is that both v and β can be interpreted as the distance between the weights given to the least and most wealthy individual. See also sections 4.1 and 4.2.

³²For example, while not occurring in the illustration in this paper, Erreygers et al. (2012) report some empirical examples where initially pro-poor inequality reverses into pro-rich inequality when v is increased. The same reversal might also happen for the symmetric index.

³³When v = β = 2, both indices are equal to the Erreygers index.


10 References

Bleichrodt H, Van Doorslaer E. A welfare economics foundation for health inequality measurement. Journal of Health Economics. 2006;25:945–957. [PubMed] [Google Scholar]
Bosmans K. Consistent comparisons of attainment and shortfall inequality: a critical examination. GSBE Maastricht University; 2013. 13/064. [PubMed] [Google Scholar]
Chen Z. Concindc: Stata module to calculate concentration index with both individual and grouped data. http://EconPapers.repec.Org/RePEc:boc:bocode:s456802.
Chow GC. Tests of Equality Between Sets of Coefficients in Two Linear Regressions. Econometrica. 1960;28:591–605. [Google Scholar]
Clarke PM, Gerdtham UG, Johannesson M, Bingefors K, Smith L. On the measurement of relative and absolute income-related health inequality. Social Science and Medicine. 2002;55:1923–1928. [PubMed] [Google Scholar]
Clogg CC, Potkova E, Haritou A. Statistical Methods for Comparing Regression Coefficients between Models. American Journal of Sociology. 1995;100:1261–93. [Google Scholar]
Decoster A, Ooghe E. Weighting with individuals, equivalent individuals or not weighting at all. Does it matter empirically? In: Bishop J, Amiel Y, editors. Inequality, welfare and poverty: theory and measurement. 2003. pp. 173–190. Research on Economic Inequality Series 6. [Google Scholar]
Ebert U. Social welfare when needs differ: an axiomatic approach. Economica. 1997;64:233–244. [Google Scholar]
Erreygers G. Correcting the Concentration Index. Journal of Health Economics. 2009a;28:504–515. [PubMed] [Google Scholar]
Erreygers G. Correcting the Concentration Index: a reply to Wagstaff. Journal of Health Economics. 2009b;28:521–524. [PubMed] [Google Scholar]
Erreygers G. Can a single indicator measure both attainment and shortfall inequality? Journal of Health Economics. 2009c;28:885–893. [PubMed] [Google Scholar]
Erreygers G, Clarke P, van Ourti T. Mirror, mirror on the wall, who in this land is fairest of all? Distributional sensitivity in the measurement of socioeconomic inequality in health. Journal of Health Economics. 2012;31:257–270. [PMC free article] [PubMed] [Google Scholar]
Erreygers G, Van Ourti T. Measuring socioeconomic inequality in health, health care and health financing by means of rank-dependent indices: A recipe for good practice. Journal of Health Economics. 2011a;30:685–694. [PMC free article] [PubMed] [Google Scholar]
Erreygers G, Van Ourti T. Putting the cart before the horse. A comment on Wagstaff on inequality measurement in the presence of binary variables. Health Economics. 2011b;20:1161–1165. [PMC free article] [PubMed] [Google Scholar]
Filmer D, Pritchett L. Estimating wealth effects without expenditure data - or tears: An application to educational enrollments in states of India. Demography. 2001;38:115–132. [PubMed] [Google Scholar]
Gwatkin DR, Rustein S, Johnson K, Pande R, Wagstaff A. Initial country level information about socio-economic differentials in health, nutrition and population. umes I and II. World Bank Health, Population and Nutrition; Washington, DC: 2003. [Google Scholar]
Jenkins SP. Calculating income distribution indices from micro-data. National Tax Journal. 1988:139–142. [Google Scholar]
Jenkins SP. INEQDECO: Stata module to calculate inequality indices with decomposition by subgroup. 2010 http://ideas.repec.org/c/boc/bocode/s366002.html.
Jenkins SP, Van Kerm P. Software update: Generalized Lorenz curves and related graphs. Stata Journal. 2007;7(2):280. [Google Scholar]
Jolliffe D, Krushelnytskyy B. Bootstrap standard errors for indices of inequality: INEQERR. Stata Technical Bulletin. 1999;51:2832. [Google Scholar]
Kakwani NC. Measurement of tax progressivity: an international comparison. Economic Journal. 1977;87:71–80. [Google Scholar]
Kakwani NC. Income inequality and poverty: methods of estimation and policy applications. Oxford University Press; New York: 1980. [Google Scholar]
Kakwani N, Wagstaff A, van Doorslaer E. Socioeconomic inequalities in health: Measurement, computation, and statistical inference. Journal of Econometrics. 1997;77:87–103. [Google Scholar]
Kjellsson G, Gerdtham U-G. On correcting the concentration index for binary variables. Journal of Health Economics. 2013a;32:659–670. [PubMed] [Google Scholar]
Kjellsson G, Gerdtham U-G. Lost in Translation: Rethinking the Inequality Equivalence criteria for bounded health variables. In: O'Donnell O, Rosa Dias P, editors. Research on Economic Inequality, Volume 21: Health and Inequality. Emerald; 2013b. pp. 3–32. [Google Scholar]
Kolm S. Unequal inequalities I. Journal of Economic Theory. 1976;12:416–442. [Google Scholar]
Lambert P, Zheng B. On the consistent measurement of attainment and shortfall inequality. Journal of Health Economics. 2011;30:214–219. [PubMed] [Google Scholar]
Lerman R, Yitzhaki S. Improving the accuracy of estimates of Gini coefficients. Journal of Econometrics. 1989;42:43–47. [Google Scholar]
Lopez-Feldman A. DESCOGINI: Stata module to perform Gini decomposition by income source. 2008 http://ideas.repec.org/c/boc/bocode/s456001.html.
O'Donnell O, van Doorslaer E, Wagstaff A, Lindelow M. Analyzing Health Equity Using Household Survey Data: A Guide to Techniques and Their Implementation. World Bank Institute; Washington, D.C: 2008. [Google Scholar]
Pereira JA. Inequality in infant mortality in Portugal, 1971-1991. In: Zweifel P, editor. Health, the medical profession, and regulation. Developments in health economics and public policy. Volume 6. Boston/Dordrecht/London; Kluwer: 1998. pp. 75–93. [PubMed] [Google Scholar]
Shorrocks A. Ranking income distributions. Economica. 1983;50:3–17. [Google Scholar]
Townsend P, Davidson N. Inequalities in Health: The Black Report. Penguin; Harmondsworth: 1982. [Google Scholar]
Van Doorslaer E, Masseria C, Koolman X, the OECD Health Equity Research Group. Inequalities in access to medical care by income in developed countries. Canadian Medical Association Journal. 2006;174:177–183. [PMC free article] [PubMed] [Google Scholar]
Van Doorslaer E, Wagstaff A, Bleichrodt H, et al. Income-related inequalities in health: some international comparisons. Journal of Health Economics. 1997;16:93–112. [PubMed] [Google Scholar]
Van Ourti T. Measuring horizontal inequity in Belgian health care using a Gaussian random effects two part count data model. Health Economics. 2004;13:705–724. [PubMed] [Google Scholar]
Wagstaff A. Socioeconomic inequalities in child mortality: comparisons across nine developing countries. Bulletin of the World Health Organization. 2000;78:19–29. [PMC free article] [PubMed] [Google Scholar]
Wagstaff A. Inequality aversion, health inequalities, and health achievement. Journal of Health Economics. 2002;21:627–641. [PubMed] [Google Scholar]
Wagstaff A. The bounds of the concentration index when the variable of interest is binary, with an application to immunization inequality. Health Economics. 2005;14:429–432. [PubMed] [Google Scholar]
Wagstaff A. Correcting the Concentration Index: A comment. Journal of Health Economics. 2009;28:516–520. [PubMed] [Google Scholar]
Wagstaff A. The concentration index of a binary outcome revisited. Health Economics. 2011a;20:1155–1160. [PubMed] [Google Scholar]
Wagstaff A. Reply to Guido Erreygers and Tom Van Ourti's comment on the concentration index of a binary outcome revisited. Health Economics. 2011b;20:1166–1168. [PubMed] [Google Scholar]
Wagstaff A, Paci P, van Doorslaer E. On the measurement of inequalities in health. Social Science and Medicine. 1991;33:545–557. [PubMed] [Google Scholar]
Wagstaff A, van Doorslaer E, Paci P. Equity in the Finance and Delivery of Health Care: Some Tentative Cross-Country Comparisons. Oxford Review of Economic Policy. 1989;5:89–112. [Google Scholar]
Wagstaff A, Watanabe N. What difference does the choice of SES make in health inequality measurement? Health Economics. 2003;12:885–890. [PubMed] [Google Scholar]
Yitzhaki S. On an extension of the Gini inequality index. International Economic Review. 1983;24:617–628. [Google Scholar]