Inequality in health versus inequality in lifestyles

Repeated Norwegian cross-sectional data for the period 2005 to 2011 are used to compare sources of inequality in health, as represented by self-assessed health and obesity, with sources of inequality in lifestyles that are central to the production of health, as represented by physical activity, cigarette smoking and dietary behavior. Sources of overall inequality and socioeconomic inequality in these lifestyle and health indicators are compared by estimating probit models, and by decomposing the explained part of the associated Gini and concentration indices with respect to education and income. As potential sources of inequality, we consider education, income, occupation, age, gender, marital status, psychological traits and childhood circumstances. Our results suggest that sources of inequality in health are not necessarily representative of sources of inequality in underlying lifestyles. While education is generally an important source of overall inequality in both lifestyles and health, income is unimportant in all lifestyle indicators except physical activity. In several cases, education and income are clearly outranked by other factors in terms of explaining overall inequality, such as gender in eating fruits and vegetables and age in fish consumption. These results suggest that it is important to decompose both overall inequality and socioeconomic inequality in different lifestyle and health indicators. In indicators where factors other than education and income are clearly most important, policy makers should consider targeting these factors to efficiently improve overall population health.

JEL classification: D39, I12, I14

factors of health, including lifestyles, and not final health itself. For such policies of 'preventive medicine' to be efficient, however, there is a need for more insight into patterns of inequality across several important health-affecting lifestyles, including the extent to which these patterns are similar to those of final health. If patterns are homogeneous, then policies that are formulated with the intention of improving health-affecting behaviours on the basis of knowledge about inequality in final health are relevant due to their 'trickle down' properties. If, on the other hand, patterns of inequality differ significantly across lifestyles and final health, this no longer holds, which may have important implications for policy.
The main goal of this paper is therefore to directly compare patterns of inequality in health with patterns of inequality in lifestyles believed to be important in affecting health. Our intention is to add to the existing literature, which has mainly focused on patterns of inequality in final health, and in which the role of lifestyles, where considered, has been limited to their relative contribution to total inequality or socioeconomic inequality in health, as one out of many possible sources of inequality. Since the empirical evidence generally suggests that lifestyles are important in affecting health, and furthermore that similar factors, such as socioeconomic status, tend to be important in predicting both health-related lifestyles and different health outcomes, we hypothesize that, in one single sample, patterns of total inequality and socioeconomic inequality are similar across various production factors of health, here represented by lifestyle variables, and final health itself.
Our data are drawn from the Norwegian Monitor Survey 2005-2009, which is a repeated cross-section survey. Health status, or final health, is proxied by self-assessed health (SAH) and obesity, while lifestyles are represented by physical activity, smoking, and two indicators of diet quality: the frequencies of eating fish and of eating fruits and vegetables. 2 In our analysis, we first study correlates and determinants of lifestyles, obesity and SAH using a multivariate probit model, and next we study sources and patterns of inequality in these variables by decomposing Gini indices and income- and education-related concentration indices (CIs). As explanatory factors, and sources of inequality, we consider (i) a standard set of socio-demographic variables, including current income and education, (ii) a set of variables which represent childhood circumstances, including parental education, and (iii) a set of variables which are meant to serve as proxies for time preferences, risk aversion, and self control. Evidence on the importance of psychological traits (e.g., Heckman, 2007) and childhood circumstances (e.g., Case et al., 2005) in affecting adult lifestyles and health is accumulating. Recent evidence from the US, Britain and France on some of these issues is provided in Cutler and Lleras-Muney (2010), Rosa Dias (2009, 2010) and Trannoy et al. (2010).

Data and methods
This study uses data from the Norwegian Monitor Survey, a nationally representative and repeated cross-section survey of Norwegian adults which has been conducted biennially since 1985. Some of the key variables analyzed in this paper are based on survey questions that were first introduced in the 2005 survey. 3 Thus, in this paper, only data from the 2005, 2007 and 2009 survey rounds are used. Individuals aged 15-95 years are recruited to participate in the survey. We restrict our sample to respondents aged 25-74 years, as we want to study individuals who can be expected to have completed most of their education and started earning incomes, and since few respondents in the age range 75-95 years are included in the survey. After deleting observations with missing information on any relevant variables (1995 observations), our final sample consists of 7738 observations.
Each respondent answers an extensive list of questions, of which those related to the selected lifestyle variables and SAH are based on various types of categorical scales. For example, the respondents are asked to indicate their frequency of eating (i) fruits and berries and (ii) vegetables on a ten-point scale ranging from 'Never/Less than once per month' to '4 times per day'. Similarly, physical activity has an eight-point frequency scale ranging from 'Never' to 'Once or more per day'. Yet other frequency scales are used for smoking and fish consumption. SAH is based on a five-point scale ranging from 'Very bad' to 'Very good'. The use of different categorical scales complicates our intention to compare these variables, as well as body mass, with respect to their determinants and their patterns of inequality. We therefore choose to dichotomize each of these variables, although some information is lost in doing so. Their definitions, along with summary statistics for other relevant variables of this study, are presented in Table 1. The usual disclaimer applies with respect to strengths and limitations of using SAH and obesity as indicators of health, including, amongst others, the possibility of respondents under-reporting their weight and over-reporting their height (Connor Gorber et al., 2007).
Our methodological approach to comparing sources and patterns of inequality in lifestyle and health variables is inspired by related empirical literature, in particular Balia and Jones (2008). The determinants and correlates of lifestyles, obesity and SAH will be studied using a six-equation multivariate probit model, while measures of total inequality and income- and education-related inequality in these variables will be studied using Gini and CI decomposition techniques.
The multivariate probit model of this study includes all the variables listed in Table 1, in addition to controls for survey years. Thus, the dependent variables in this model are our six lifestyle and health variables, while the groups of control variables include basic demographic variables, income, education, occupational status, psychological traits, childhood conditions, and survey years. 4 We use identical regressors in all six lifestyle and health equations. Thus, we do not estimate a recursive system in which lifestyles are assumed to affect health, as in for example Balia and Jones (2008). There are two main reasons for this approach. First, our main interest lies in directly comparing important lifestyle and health variables with respect to their determinants and patterns of inequality, rather than in assessing the actual impact that different lifestyles have on health. Second, unlike in Balia and Jones (2008), our data are not longitudinal, which means that we are only able to assess the impact of current lifestyles on current health. That approach is mainly relevant for the (unknown) subset of respondents for whom current lifestyles are good proxies of past lifestyles, as the impact of lifestyles on health is not immediate but rather the result of a long-lasting, cumulative process.
As our multivariate probit model for lifestyles and health is not recursive, its main advantage over single-equation probit models is its ability to estimate correlation coefficients between error terms of the different equations in the system. Thus, with the multivariate probit model, we can learn about the extent to which unexplained residuals of variation, or unobserved individual characteristics, are systematically related across the different lifestyle and health equations (Balia and Jones, 2008). Technically, this is accomplished by utilizing properties of the multivariate normal distribution. The vector of error terms ε in the multivariate probit model is distributed multivariate standard normal, ε ~ MVN(0, Ω), with our 6×6 correlation matrix Ω having values of 1 on its leading diagonal, and symmetric correlation coefficients ρ_jk between equations j and k as its off-diagonal elements.
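The error structure described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the common correlation `rho = 0.2`, the sample size and the zero linear indices `xb` are hypothetical placeholders, not estimates from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

n_eq = 6                                 # six lifestyle and health equations
n_pairs = n_eq * (n_eq - 1) // 2         # distinct cross-equation correlations

# Hypothetical correlation matrix: ones on the diagonal and a common
# off-diagonal correlation (an assumption made for illustration only).
rho = 0.2
omega = np.full((n_eq, n_eq), rho)
np.fill_diagonal(omega, 1.0)

# Draw latent errors eps ~ MVN(0, omega) for a hypothetical sample.
n_obs = 5000
eps = rng.multivariate_normal(np.zeros(n_eq), omega, size=n_obs)

# Binary outcomes follow from the latent-index rule y_j = 1[x'b_j + eps_j > 0];
# the linear indices are set to zero here purely as placeholders.
xb = np.zeros(n_eq)
y = (xb + eps > 0).astype(int)

# Empirical correlations of the simulated errors approximate omega.
emp_corr = np.corrcoef(eps, rowvar=False)
```

With six equations there are 6 × 5 / 2 = 15 distinct off-diagonal correlations, which is the number of cross-equation coefficients a model of this size has to estimate.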
Our multivariate probit model is estimated using simulated maximum likelihood, with the 7738×6 matrix of lifestyle and health probabilities being simulated using the Geweke-Hajivassiliou-Keane (GHK) simulator. 5 More details on the properties and technicalities of the multivariate probit model, including advantages of using the GHK simulator, can be found in Cappellari and Jenkins (2003), Contoyannis and Jones (2004), and Balia and Jones (2008).
The Gini index is a widely used measure of total inequality in a specific variable: it measures the extent to which, for example, SAH is unequally distributed within a population. The Gini has range [0, 1]. The CI is a closely related measure of socioeconomic-status-related inequality in, say, the same variable; that is, it measures the extent to which the distribution of SAH is related to a specific measure of socioeconomic status, for example education. The standard version of the CI (Gini) may be expressed as follows:

$$CI = \frac{2}{\mu}\,\mathrm{cov}(y, r) \qquad (1)$$

where $r$, in the case of the CI, indicates the fractional rank of the chosen socioeconomic indicator, for example education, $y$ is the other variable from which to calculate the CI, for example SAH, and $\mu$ is the mean of $y$. 6 The CI has range [-1, 1], where 1 (-1) indicates extreme cases in which all 'good health' is found among those with the absolute highest (lowest) socioeconomic status. The Gini index of total inequality in health is obtained simply by replacing $r$ of socioeconomic status in Eq. (1) by $r$ of health, i.e., by the fractional rank of health (van Doorslaer and Jones, 2003).
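As a minimal sketch of how these indices can be computed, the code below assumes the standard covariance form CI = (2/µ) cov(y, r), with ties in the ranking variable given the mean of their ranks. The function names are hypothetical and the code is not taken from the paper.

```python
import numpy as np

def fractional_rank(x):
    """Fractional rank in (0, 1); tied values share the mean of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return (ranks - 0.5) / len(x)

def concentration_index(y, ses):
    """Standard CI: twice the covariance of y with the rank of ses, over mean(y)."""
    y = np.asarray(y, dtype=float)
    r = fractional_rank(ses)
    cov = np.mean((y - y.mean()) * (r - r.mean()))
    return 2.0 * cov / y.mean()

def gini_index(y):
    """Gini: the same formula, with y ranked by itself."""
    return concentration_index(y, y)

# Toy data (hypothetical): good health concentrated among the highly educated.
health = [0, 0, 1, 1]
education = [1, 2, 3, 4]
ci_pro_rich = concentration_index(health, education)
ci_pro_poor = concentration_index(health, education[::-1])
```

Reversing the socioeconomic ranking flips the sign of the index, which is the sense in which positive (negative) values indicate that good health is concentrated among those with high (low) socioeconomic status.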
Because covariances are central to both the CI and the Gini (Eq. 1), both indices may be obtained, or calculated, using linear regression. An extension of this property is that they may also be decomposed into their contributing factors (Wagstaff et al., 2003). Thus, one might estimate the percentage contribution of e.g. age, gender and education to total inequality in SAH. In this paper, in Section 4, we will decompose both the Gini index of total inequality and the CIs of income- and education-related inequality in our six lifestyle and health variables. The decomposition formula for the CI (Gini) is

$$CI = \sum_k \left(\frac{\beta_k \mu_k}{\mu_y}\right) CI_k + \frac{CC_\varepsilon}{\mu_y} \qquad (2)$$

where $(\beta_k \mu_k / \mu_y)$ is the elasticity of $y$, e.g. SAH, with respect to variable $k$, e.g. gender, with $\mu_k$ and $\mu_y$ being the means of $k$ and $y$, and with $\beta_k$ being the coefficient for regressor $k$ in a linear regression on $y$. $CI_k$ is the CI of variable $k$ with respect to the chosen socioeconomic status indicator, for example education. $CC_\varepsilon / \mu_y$ is the generalized CI for the error term, which can be computed as a residual (Balia and Jones, 2008). Thus, as an example, in order for gender to make a substantial contribution to the explained part of the education-related CI in SAH, we must have that (i) the elasticity of SAH with respect to gender, controlling for other factors, is large, and (ii) gender and education are strongly correlated, i.e. $CI_{gender}$ is large. The Gini is also decomposed using Eq. (2), but now with $CI_k$ representing the CI of variable $k$ with respect to lifestyle or health variable $y$.
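The decomposition can be sketched as follows. The example uses ordinary least squares on simulated data and is meant only to illustrate the mechanics of the formula. The variable names (`educ`, `male`), the data-generating process and the helper functions are hypothetical assumptions, and this is not the paper's estimation procedure, which works from multivariate probit predictions.

```python
import numpy as np

def frac_rank(x):
    """Fractional rank in (0, 1); tied values share the mean of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x))
    ranks[order] = np.arange(1, len(x) + 1)
    for v in np.unique(x):
        ranks[x == v] = ranks[x == v].mean()
    return (ranks - 0.5) / len(x)

def conc_index(v, r):
    """2 * cov(v, r) / mean(v) for a given fractional rank r."""
    v = np.asarray(v, dtype=float)
    return 2.0 * np.mean((v - v.mean()) * (r - r.mean())) / v.mean()

def decompose_ci(y, X, rank_var):
    """Return per-regressor contributions, the residual term and the total CI."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    r = frac_rank(rank_var)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)   # linear regression on y
    resid = y - design @ beta
    elasticities = beta[1:] * X.mean(axis=0) / y.mean()  # beta_k * mu_k / mu_y
    ci_k = np.array([conc_index(X[:, k], r) for k in range(X.shape[1])])
    contributions = elasticities * ci_k
    residual = 2.0 * np.mean((resid - resid.mean()) * (r - r.mean())) / y.mean()
    return contributions, residual, conc_index(y, r)

# Simulated toy data (all values are assumptions for illustration).
rng = np.random.default_rng(0)
n = 500
educ = rng.integers(1, 5, size=n).astype(float)   # education level 1..4
male = rng.integers(0, 2, size=n).astype(float)   # gender dummy
latent = 0.1 * educ - 0.15 * male + rng.normal(0.0, 0.3, size=n)
y = (latent > 0.2).astype(float)                  # binary 'healthy' indicator

X = np.column_stack([educ, male])
contributions, residual, total = decompose_ci(y, X, rank_var=educ)
shares = 100.0 * contributions / contributions.sum()  # % of the explained part
```

By construction, the regressor contributions plus the residual term add up to the index itself, so `shares` describes only the explained part of inequality.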
Depending on the nature of the two variables from which it is calculated, the standard $CI_s$ ($Gini_s$) in Eq. (1) may possess a few undesirable properties. As these limitations, which are discussed in Erreygers (2009a, 2009b), are relevant in this study, we choose to instead use the following, recently developed version of the CI (Gini):

$$CI = \frac{4\mu}{b_y - a_y}\, CI_s \qquad (3)$$

where $b_y$ and $a_y$ are the upper and lower limits of $y$, and $CI_s$ ($Gini_s$) is the standard version of the CI (Gini) in Eq. (1) (Erreygers, 2009a, 2009b). 7 The decomposition formula for this version of the CI (Gini) is obtained by scaling Eq. (2) similarly, i.e. by $4\mu/(b_y - a_y)$. While the actual index estimates are sensitive to whether one uses the CI (Gini) version in Eq. (1) or Eq. (3), the corresponding decomposition analysis is invariant to which version is used, i.e. the percentage contributions of different variables to indices of inequality are identical in the two versions.
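The invariance claim follows because the correction multiplies every term of the decomposition by the same factor, 4µ/(b_y - a_y). A minimal numerical sketch, in which the contribution values and the mean are hypothetical and the bounds 0 and 1 reflect a binary outcome:

```python
import numpy as np

def erreygers(ci_standard, mean_y, lower=0.0, upper=1.0):
    """Scale a standard index (or its components) by 4 * mean / (upper - lower)."""
    return 4.0 * mean_y / (upper - lower) * ci_standard

# Hypothetical absolute contributions of three regressors to a standard CI.
contrib_standard = np.array([0.06, 0.03, 0.01])
mean_y = 0.4                                     # hypothetical mean of a binary y

contrib_corrected = erreygers(contrib_standard, mean_y)

share_standard = 100.0 * contrib_standard / contrib_standard.sum()
share_corrected = 100.0 * contrib_corrected / contrib_corrected.sum()
```

The index level changes from 0.10 to 0.16 in this example, while the percentage shares (60, 30 and 10) are the same under both versions.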

Correlates of lifestyles and health: multivariate probit model results
We start comparing patterns of inequality in lifestyles and health by looking at results of the multivariate probit model, which are reported in Table 2. 8 The key results are as follows.
First, controlling for a large set of potentially confounding factors, clear education gradients exist in all four lifestyle variables, while income is significantly associated only with physical activity (PA) and, in part, fruits and vegetables (FV). For our health variables, Non-obese and SAH, the opposite seems to be true; clear income gradients are present, but the effects of education are less clear, with only the association between College degree and SAH being statistically significant. In general, and not surprisingly, the association between education and health, and between income and lifestyles, becomes stronger as we remove control variables from the probit models in Table 2. 9 What remains, however, is that, in relative terms, education seems to be more important than income in predicting healthy lifestyles, while the opposite seems to be true for health.
Second, while Table 2 shows that there are several significant effects of occupational status on different lifestyle and health variables, one association clearly stands out: individuals on social security are 41.6 percentage points less likely than others to report being in good health. As noted in Footnote 4, in this study we are generally not able to establish causal effects, as our data are cross-sectional. This limitation seems to be particularly relevant in this example, as we suspect that the strong negative correlation that exists between social security status and SAH is mainly due to the effect of poor SAH on social security status, and not vice versa (as modeled in Table 2). What is probably more interesting in this context is the extent to which social security status is responsible for the strong relationship that exists between income and SAH. That is, income and SAH might be strongly associated partly because poor SAH makes individuals exit the labor force prematurely, which in turn affects their incomes negatively due to a shift from earning wages to being on social security. Evidence on such mechanisms of 'reverse causality' from health to income, in particular in late midlife, is provided in for example Case and Deaton (2005).
Third, our proxies for time preferences and self control are in several cases significantly related to lifestyles and health. However, no clear systematic patterns stand out with respect to these variables. For example, the results in Table 2 do not indicate that psychological traits are more important in predicting lifestyle choices than health outcomes, or vice versa.
Fourth, a similar result holds for the role of childhood circumstances; that is, we are not able to identify any systematic differences between lifestyle and health variables with respect to, for example, the impact of parental education. Among the included variables for childhood circumstances, being raised by a mother who had completed some form of university-level education seems to be most important, as this variable is statistically significant in two out of four lifestyle equations and in both health equations, with the range of marginal probabilities being 0.042-0.058. In fact, for the two health variables, and in particular non-obesity, the education of the respondent's mother is more important than his or her own education in predicting health.
Finally, the residual error terms of the different equations in Table 2 are in several cases strongly correlated; 11 out of 15 cross-equation correlation coefficients are statistically significant at the 99 % level, and 7 of these are in excess of 0.150. Thus, in general, controlling for all the regressors in Table 2, there tend to exist other, unobserved characteristics which make individuals systematically choose healthy lifestyles and have good health, or vice versa. The most notable exception to this pattern in Table 2 is found between non-smoking and non-obesity, where the correlation coefficient is negative, at -0.113. However, this particular result is not surprising, as there is evidence to suggest that smoking is associated with lower body weight through its effects on one's appetite and metabolic rate (Chiolero et al., 2008). The correlation matrix in Table 2 suggests that the strongest correlation of unobserved individual characteristics exists between non-obesity and SAH (0.296), i.e., between our two health variables, while physical activity is the lifestyle variable which is most closely associated with these two health measures, again in terms of unobserved characteristics. While the correlation structure for unobserved characteristics between the different lifestyle variables and non-obesity is mixed, healthy lifestyles and good SAH are always positively related, and significantly so.

Decomposing total inequality and socioeconomic inequality in lifestyles and health
We turn next to inequality in lifestyles and health as measured by Gini and concentration indices. The results of interest are presented in Table 3. The row "Gini (G)/CI educ /CI inc " reports the actual index estimates, calculated according to Eq. (3), for respectively total inequality, education-related inequality and income-related inequality in lifestyles and health. The remaining rows in Table 3 report results from the corresponding decomposition analyses (Eq. 2), that is, these rows indicate the percentage contribution of each regressor, or group of regressors (in bold), to the Gini, CI educ and CI inc for each lifestyle and health variable. 10

10 We follow the procedure of Balia and Jones (2008); since all our outcome variables are binary, we base our Gini calculations on predicted probabilities rather than observed outcomes, with individual predictions being based on results of the multivariate probit model in Table 2. This procedure ensures that we get sufficient variability in the outcome variables for which to calculate the Ginis, but this comes at a cost; predicted probabilities are additive in the regressors, and thus only the deterministic part of the decomposition equation (Eq. 2) can be calculated (Balia and Jones, 2008). Thus, although the percentage contributions per column in Table 3 sum to one hundred (summing groups of regressors, in bold), this only reflects the explained part of total inequality. Similar to econometric models for individual behavior and health, the unexplained residuals of variation, or here, the unexplained residuals of inequality, are typically large, and this must be taken into account when reviewing these results. For consistency, we also calculated the education- and income-related CIs in Table 3 using predicted probabilities, although in principle, these could be calculated using observed outcomes.

The results in Table 3 suggest that decomposition-of-inequality analyses are very sensitive to which 'type of inequality' is being studied. Perhaps not surprisingly, education itself is generally a very dominant contributor to education-related CIs in lifestyles and health (mean contribution: 67.9 %), while income is similarly a dominant contributor to income-related CIs (mean contribution: 49.6 %). In contrast, if we instead focus on sources of total inequality, education and income become less important, as they explain on average 18.4 % and 10.0 % of the Ginis in lifestyles and health.
Another example of the sensitivity issue is the role of gender in fruits and vegetables, which explains as much as 47.8 % of total inequality, but only 1.5 % of CI educ , which is mainly due to the CI of gender with respect to education being close to zero (the second component of Eq. 2). Thus, at least in our sample, a study focusing on socioeconomic inequality in fruit and vegetable eating would probably miss that the key target group for eating more fruits and vegetables is actually males, and not low income or education groups (although these groups are also important). As we believe that factors other than socioeconomic status are also important elements of inequality in lifestyles and health (Fleurbaey and Schokkaert, 2009), we will in the following focus mainly on sources of total inequality in these variables, i.e., on decompositions of the Gini indices in Table 3.
The key contributors to total inequality are not the same across our different lifestyle and health variables. As indicated, the key contributor in fruits and vegetables is gender (47.8 %), while in fish eating, age is clearly most important (64.8 %). Education is the key factor in explaining population differences in physical activity (27.1 %) and in particular non-smoking (41.2 %), while social security status explains as much as 31.9 % of the Gini in SAH, which reflects the strong association that was found between these variables in the multivariate probit model in Table 2. Thus, social security status is the most important factor in explaining inequality in SAH, despite its relatively low mean (0.082), which, ceteris paribus, should reduce its impact on total inequality (see Eq. 2). Interestingly, childhood conditions seem to be particularly important in explaining population differences in body mass, with maternal education being the single most important contributor to the Gini in non-obesity (20.9 %).
The finding in the multivariate probit model in Table 2 of education being relatively more important than income in predicting lifestyles, and of income being relatively more important than education in predicting health, is partly reflected in the contribution of these two indicators of socioeconomic status to total inequality; on average, education and income explain respectively 22.8 % and 6.8 % of the Ginis in lifestyles, while the corresponding figures for SAH are 7.7 % and 21.1 %. Education and income make identical contributions to the Gini in non-obesity (11.6 % each), which is different from results of the multivariate probit model in Table 2, where income was found to be more important than education. Of course, the finding of education having a greater impact on total inequality in lifestyles than in SAH does not imply that there are greater educational differences in lifestyles than in SAH. What it does imply is that, (i) while both lifestyles and SAH are strongly correlated with education, and (ii) education is strongly correlated with many of the other control variables in Table 3, (iii) these other control variables are more directly associated with SAH than with the different lifestyles, which means that the direct contribution of education itself to indices of total inequality and education-related inequality is more 'attenuated' in SAH than in lifestyles.

Conclusions
The main purpose of this paper has been to compare sources and patterns of inequality in important health-affecting lifestyle choices, on the one hand, with those in final health, on the other. The motivation for this assessment, which uses Norwegian data, has been that health inequalities should preferably be prevented rather than treated, and thus, policies must mainly target production factors of health, including lifestyles, and not final health itself. However, the existing literature has mainly focused on sources and patterns of inequality in final health. For policy purposes, knowledge about these patterns in final health is mainly relevant to the extent that they are representative of patterns of inequality in important, underlying production factors of health.
As is standard, we find that there are clear income and education gradients in most of our lifestyle and health variables, which are (i) physical activity, non-smoking, and eating fish and fruits and vegetables, and (ii) obesity status and self-assessed health. However, in a multivariate probit model that controls for basic demographics, occupational status, psychological traits, and childhood circumstances, there is considerable variation across the different lifestyle and health variables with respect to the steepness and the statistical significance of these socioeconomic gradients.
Using decomposition techniques for the Gini index of total inequality in lifestyles and health, we find that there is considerable variation across these variables with respect to sources and patterns of inequality. While education is generally an important source of inequality, and in some variables (primarily the health variables) also income, there are several cases in which other factors are much more important in explaining inequality, such as gender in fruit and vegetable eating, age in fish eating, and maternal education in obesity.
Our main conclusion is therefore that patterns of inequality in different lifestyle and health variables are heterogeneous, and thus, patterns of inequality in health variables are not necessarily representative of, or relevant for, patterns of inequality in their underlying production factors, including lifestyles. While one could argue that population differences in lifestyles and health by socioeconomic status are particularly problematic and unjust, and thus rightfully receive almost all attention in the literature on health inequalities, we agree with Fleurbaey and Schokkaert (2009) that more attention should also be given to other, perhaps more important sources of inequality, such as age and gender in the examples above, rather than simply labeling these as 'acceptable' or 'unavoidable' sources of inequality. For example, it is clearly possible to avoid having gender differences in fruit and vegetable eating, and in fact, achieving it is a highly workable policy goal.
Notes: This multivariate probit model was estimated using the Stata module mvprobit (with 90 draws). Sample weights were applied. Marginal probabilities in bold, bold italics and italics are statistically significant at the 99 %, 95 % and 90 % levels, respectively. The marginal probabilities represent average partial effects (see Footnote 8 for details). See Table 1 for definitions of relevant reference categories.

Table 3. Decomposing Gini indices of total inequality and CIs of socioeconomic inequality in lifestyles and health: percentage contributions