Electronic copy available at: http://ssrn.com/abstract=1397695

Double coverage and demand for health care: Evidence from quantile regression

Sara Moreira
Banco de Portugal, ISEG-TULisbon

Pedro Pita Barros
Universidade Nova de Lisboa

May 1, 2009

Abstract

An individual experiences double coverage when he benefits from more than one health insurance plan at the same time. This paper examines the impact of such supplementary insurance on the demand for health care services. Its novelty is that, within the context of count data modelling and without imposing restrictive parametric assumptions, the analysis is carried out for different points of the conditional distribution, not only for its mean location. Results indicate that moral hazard is present across the whole outcome distribution for both public and private second layers of health insurance coverage, but with greater magnitude in the latter group. By looking at different points we unveil that double coverage effects are smaller for high levels of usage. We use data for Portugal, taking advantage of particular features of the public and private protection schemes on top of the statutory National Health Service. By exploring the last Portuguese Health Survey, we were able to evaluate their impacts on the consumption of doctor visits.

Keywords: Demand for health services, Moral hazard, Count data, Quantile regression.
JEL codes: I11, I18, C21, C25

Corresponding author: Sara Moreira. E-mail: spmoreira@bportugal.pt. Address: Economic Research Department, Banco de Portugal, Av. Almirante Reis, 71, 6º, 1150-012 Lisboa, Portugal.

1 Introduction

The aim of this paper is to analyse the impact of health insurance double coverage (i.e. a situation in which an individual is covered by more than one health insurance plan)1 on the consumption of health care.
It is well known that if the demand for health care reacts to changes in budget constraints and preferences, then double coverage should also have important effects, because it modifies the actual price of services, the income of the insured, and the opportunity cost of time in the case of illness. The effect of supplementary health insurance is often associated with an aggravation of moral hazard, which creates incentives for people to go to the doctor more frequently and possibly for less severe illnesses.2

Organizational designs of health systems may generate layers of coverage. The most common situation is that in which an individual benefits from compulsory public insurance and has, in addition, purchased a private plan. Such supplementary private health insurance usually overlaps the range of health care services provided by the statutory health system. The main purpose of a second (or higher) layer of coverage is usually to increase the set of choices of health care provider (for example, private providers or private facilities in public institutions) as well as to decrease the level of co-payments made by the individual. By increasing the choice of provider, patients may also obtain faster access to health care.

Quantitatively, double coverage is not a negligible phenomenon. It can be found in all European countries, being common in Finland, Greece, Portugal, Spain, Sweden and the United Kingdom. Furthermore, in the United States, the Obama plan is expected to increase health coverage, including by allowing Americans to maintain their current insurance scheme while accessing new options. In such a scenario, double coverage situations are expected to increase significantly in the coming years. Research on this phenomenon can help to detect whether possible inefficiencies, causing unnecessary and costly utilization due to moral hazard, should be a concern.

Existing works addressing health insurance double coverage focus on mean effects.
In contrast, by looking at other points of the conditional distribution we unveil that stronger effects are found for less frequent users. Our findings are the result of the application of an innovative technique for estimating quantile regression for counts. The estimates were computed with Portuguese data3, using as the source of double coverage the existing health insurance schemes beyond the National Health Service (NHS). Approximately a quarter of the Portuguese population has access to a second (or further) layer of health insurance coverage on top of the NHS, through mandatory (occupation-based) health subsystems for workers of some large companies and public employees, and through voluntary health schemes. We focus our attention on the double coverage resulting from the former type, regarding both health insurance plans provided to public employees and insurance plans of private companies. Results indicate that the effect of double coverage is especially high in the private subsystems (2.6 to 2.9 times higher than that found for public employees). An interesting finding, which could only be observed through the use of quantile analysis, is that these effects are lower in the upper tail of the outcome distribution.

1 The terms "duplicate coverage", "supplementary health insurance" and "additional health insurance" are used interchangeably in the literature.
2 Moral hazard in this context is defined as the "change in health behaviour and health care consumption caused by insurance" (Zweifel and Manning 2000). Some authors criticize the direct association of double coverage with moral hazard, arguing that other important effects exist. For instance, Vera-Hernández (1999) refers to the impact of insurance on the health status of the individual, which will decrease future consumption of health care. Also, Coulson et al. (1995) point to the existence of supply inducement by providers.
3 In particular, the Portuguese Health Survey of 2005/2006, a cross-sectional health dataset that provides a wide range of information at an individual level concerning socioeconomic conditions and health status indicators.

This shows that health insurance double coverage is relatively more relevant at the first levels of usage, since for more frequent users consumption behaviour depends less on the health insurance plan.

We measure health care demand through the number of doctor visits during three months. As in most research on health care, the dependent variable is a non-negative integer count characterized by a large proportion of zeros, positive skewness and, as a consequence, a long right-hand tail. As regards econometric tools, until recently the one-part, hurdle and finite mixture models dominated the empirical literature (Deb and Trivedi 2002). Estimators resulting from these frameworks rely on assumptions about the functional form of the regression equation and the distribution of the error term. As a result, in standard models the functional form entirely determines the distributional behaviour once the conditional mean response is known. An attractive alternative is the use of nonparametric and semiparametric estimators. Introduced for continuous data in Koenker and Bassett (1978), quantile regression offers a complete picture of the effect of the covariates on the location, scale and shape of the distribution of the dependent variable. As a semiparametric method, it assumes a parametric specification for the quantile of the conditional distribution but leaves the error term unspecified. It was first applied to continuous health data in Manning et al. (1995). As in Winkelmann (2006) and Liu (2007), we apply an approach suggested by Machado and Santos-Silva (2005) in which quantile regression is extended to count data through a "jittering" process that artificially imposes some degree of smoothness.
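The jittering idea can be illustrated on unconditional quantiles. The sketch below is only an illustration under stated assumptions, not the estimator used in the paper: the visit counts are hypothetical, the helper names are invented for this example, and no regression step is shown. It adds uniform noise on [0, 1) to each count to obtain a continuous variable z, takes a sample quantile of z, and maps it back to the count scale with the ceiling of the quantile of z minus one.

```python
import math
import random


def jitter(counts, rng):
    """Turn integer counts into a continuous variable: z = y + u, u ~ U[0, 1)."""
    return [y + rng.random() for y in counts]


def sample_quantile(values, tau):
    """Empirical tau-quantile: the ceil(n * tau)-th smallest value."""
    ordered = sorted(values)
    k = max(1, math.ceil(len(ordered) * tau))
    return ordered[k - 1]


def count_quantile_from_jittered(z_quantile):
    """Map a quantile of z back to the count scale: Q_y = ceil(Q_z - 1)."""
    return math.ceil(z_quantile - 1)


# Hypothetical visit counts with many zeros and a long right-hand tail.
rng = random.Random(0)
visits = [0] * 900 + [1] * 500 + [2] * 300 + [3] * 150 + [5] * 100 + [12] * 50
jittered = jitter(visits, rng)

for tau in (0.25, 0.5, 0.75, 0.9, 0.99):
    q_z = sample_quantile(jittered, tau)
    # The recovered count quantile should match the quantile of the raw counts.
    print(tau, round(q_z, 3), count_quantile_from_jittered(q_z), sample_quantile(visits, tau))
```

Because the noise never moves an observation past the next integer, the ordering of the counts is preserved, which is why the quantiles of the jittered variable identify those of the counts.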
This technique allows an analysis of the effect on the whole consumption distribution, which is an important step forward in the analysis of reforms and is very useful for policy making. In particular, it may help the policy maker to understand why people with similar health conditions differ in their use of medical care, since it makes it possible to determine whether the policy effect is larger among low users or among high users, and it may even signal the need for adjustments to the characteristics of the contracts provided by insurance companies. This kind of information is important for controlling health care expenditure as well as for assessing the equity of the system.

Many authors have investigated the impact of additional health coverage in order to estimate the moral hazard derived from health insurance plans characterized by different levels of coverage (for example Cameron et al. 1988, Coulson et al. 1995, Vera-Hernández 1999, Lourenço 2007 and Barros et al. 2008). The use of non-experimental data generally creates an endogeneity problem related to adverse selection, since the decision to buy extra health insurance usually depends on individual characteristics. In such cases, the insurance parameter does not disentangle moral hazard from adverse selection effects. The solution usually relies on finding reasonable instrumental variables. Our empirical application does not have this problem because membership of the public and private health subsystems was mandatory and based on professional category, meaning that it was unrelated to the expected value of future health care consumption. Note that we exclude voluntary health insurance plans from the analysis.

The paper is structured as follows. Section 2 describes the Portuguese health care system from a provision perspective. Section 3 describes the dataset and the relevant variables, and presents an exploratory analysis of the data.
In Section 4 we present the quantile regression for counts and discuss the treatment effect specification. In Section 5 we analyse the results and, finally, Section 6 presents the final remarks.

2 Portuguese health care system: an overview from the provision perspective

The Portuguese health system is a network of public and private health care providers and different funding schemes.4 It is possible to identify three overlapping layers: the National Health Service (NHS)5, mandatory public and private subsystems, and private voluntary health insurance. While the NHS is mainly financed by general taxation, the subsystems' resources come from compulsory contributions by employees and employers (including, in the public schemes, State funds to ensure their balance). According to Barros and Simões (2007), in 2004 public funding represented 71.2 per cent of total health expenditure (of which 57.6 per cent is related to the NHS and 7.0 per cent to subsidies to public subsystems). Private expenditure is composed of co-payments and direct payments made by patients and, to a lesser extent, of private insurance premiums.

Since 1979, with the creation of the NHS, legislation has established that all residents have the right to health protection regardless of economic or social status. Until then, the State had full responsibility only for the health care of public employees and for some specific types of services, such as maternity, child and mental care and the control of infectious diseases. One feature of the period preceding the NHS that persisted was the existence of public health subsystems, partly because the trade unions that managed some of those subsystems were not willing to give up their privileges and forcefully defended their maintenance on behalf of their members (Barros and Simões 2007).
The individuals covered solely by the NHS (the majority of the population) face some constraints in access to public providers, in particular because of services excluded from the public network and difficulties of access due to time costs (long waiting lists and queuing) or geographical barriers. Lourenço (2007), among others, argues that the NHS coverage restrictions convert its normative completeness into an incomplete health insurance contract. The NHS is conceived in such a way that beneficiaries should first seek health care through their general practitioner (family doctor) in health care centers and then, if necessary, get appropriate referrals to a public specialist consultation (generally out-patient consultations in public hospitals). This gatekeeping procedure is not strictly followed, since there are households that do not have access to a family doctor and, when they do, the time lag between the first step to obtain health care and its effective provision is frequently too long. Additionally, the requirements to obtain referrals are generally very demanding. For these reasons, some individuals have their first contact with health care in hospital emergency rooms even if their condition would not require it. Given these constraints, the consumption of private services by NHS beneficiaries6 is very common.

4 This section is mostly based on Barros and Simões (2007) and Lourenço (2007). An interesting comparison between the Portuguese health system and other European systems is available in Bago d'Uva and Jones (2008).
5 In the autonomous regions, public health care is ensured by regional health services (RHS of Azores and Madeira), which follow the same principles as the NHS but are implemented by the regional governments. It is not worth distinguishing between them here.
The NHS design contemplates a cost-sharing mechanism that in practice makes patients pay a small mandatory co-payment to the public provider (which varies with the type of service), usually on a fee-for-service basis. There are, however, exemptions for a large share of the population, defined on the basis of age and income. When using health services provided by the private sector, NHS beneficiaries, in the absence of private voluntary insurance, bear the full cost and receive no reimbursement.7 People who benefit from additional health care schemes, either mandatory or voluntary, do not see their taxation affected, and as a consequence they are still eligible to receive health care from the NHS.

Nowadays, a considerable share of the population (between 20 and 25 per cent) still benefits from occupation-based health insurance through several subsystems, either private or public. Among the double coverage schemes, the largest public subsystem is ADSE (Direcção-Geral de Protecção Social aos Funcionários e Agentes da Administração Pública), a Government department acting as a health insurance provider and covering public employees (about 15 per cent of the population). Exceptions enjoying specific schemes also exist, such as military personnel. Private subsystems were created for workers and pensioners (and their households) of private companies that have their own insurance schemes, such as SAMS (Serviços de Assistência Médico-Social) for banking employees. Each subsystem has a distinct array of medical care insurance arrangements to finance and provide health care. As a whole, we can say that they are organized differently from the NHS, in particular because of the lower proportion of services directly provided. They basically provide health care through contracts with public/NHS and private institutions, and reimburse patients' costs for services supplied by private entities without a contract. These features make these schemes more comprehensive health protection plans than the NHS, representing both complementary and supplementary types of insurance (Lourenço 2007). The supplementary protection results from the provision/financing of services that are also available in the context of the NHS. This particular feature creates the double coverage problem. The complementary character is relevant because subsystems cover services that are scarcely provided by the general system, in particular by reimbursing part of patients' costs with private providers (even those without contracts).

6 Throughout the paper, "NHS beneficiaries" refers to individuals covered solely by the NHS. Therefore, this definition excludes the population with double coverage.
7 The system allows, however, the recovery of some out-of-pocket outlays because both patient co-payments and costs of private services are tax-deductible.

3 Data

3.1 Dataset

Data were taken from the fourth Portuguese Health Survey (PHS), a cross-sectional health dataset designed to be representative of the Portuguese population living in households.8 It provides a wide range of information at the individual level, namely demographic and socioeconomic conditions, type of health insurance, health care utilization, health status indicators (such as chronic diseases and long-run and short-run disability), lifestyles (such as eating habits and sports activity) and health service costs. However, some of the questions were answered by only part of the sample. The survey was collected through interviews carried out between February 2005 and January 2006. The PHS sample reflects the geographical structure of the population according to the 2001 census, resulting from a two-stage cluster sampling that followed a complex design involving both stratification and systematic selection of clusters.9 A total of 19,950 household units were selected for the survey.
In each household all individuals were interviewed face to face. The sample used in this paper comprises 35,308 observations and was obtained after treating the data and imposing some constraints. Firstly, we excluded 158 observations of individuals who did not report the number of visits to a doctor, 10 observations with no answer regarding the subsystem they belong to, and 3,114 observations of persons with voluntary private health insurance. Secondly, we dropped 145 observations of pregnant women whose visits to the doctor were related to their condition. Finally, we deleted 1,047 observations with missing values for any other relevant variable (according to the set of regressors chosen).

8 The PHS is carried out by the Portuguese Ministry of Health in collaboration with the National Health Institute Ricardo Jorge and the National Statistics Institute. Until now, four questionnaires have been carried out (1987, 1995/1996, 1998/1999 and 2005/2006), using probabilistic samples of the continental population (1st, 2nd and 3rd PHS) and of the population of both the continent and the autonomous regions of Azores and Madeira (4th PHS). Here we use the last available questionnaire. Note that it is not a panel survey, since the sample changes between surveys.
9 In "Methodological Note of Portuguese Health Survey 2005-2006".

Three points should be made about the latter choices. Firstly, the simplest way of handling missing data is to delete them and analyse only the sample of "complete observations" (although deleting observations reduces the efficiency of the estimation). This procedure is known as listwise deletion. Its use is statistically appropriate only if the values are missing completely at random (Cameron and Trivedi 2005), which means that the probability of a value being missing depends neither on its own value nor on the values of other variables in the data set (the observed sample is a random subsample of the potential full sample).
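As a minimal sketch of listwise deletion, the snippet below keeps only records that have an observed value for every required field. The records, the field names and the helper `listwise_delete` are hypothetical and serve only to illustrate the procedure; they are not taken from the survey files.

```python
def listwise_delete(records, required):
    """Keep only records with an observed (non-None) value for every required field."""
    return [r for r in records if all(r.get(f) is not None for f in required)]


# Hypothetical records; None marks a missing answer.
records = [
    {"visits": 2, "lincome": 6.1, "age": 54},
    {"visits": 0, "lincome": None, "age": 31},    # income not reported
    {"visits": None, "lincome": 5.8, "age": 47},  # visits not reported
    {"visits": 1, "lincome": 6.4, "age": 68},
]

complete = listwise_delete(records, ["visits", "lincome", "age"])
print(len(records) - len(complete), "observations dropped")
```

The procedure itself is mechanical; whether the remaining sample is still a random subsample depends on the missingness mechanism, which the code cannot check.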
Among the relevant questions of our dataset that can create a sample selection problem, the one that generated the most missing observations concerns the income level. However, most of the missing values (around seventy per cent) result not from non-response but from individuals who declare that they do not know the household income; unless this is deliberate, it is unlikely that unobserved factors influenced both the decision to respond and the value of the dependent variable.

Secondly, the exclusion of individuals with voluntary health insurance can be seen as a shortcoming. The problem is that including such a variable may introduce endogeneity problems that are difficult to eliminate, since there are no suitable instrumental variables (Barros et al. 2008). In this context, and given the relatively small number of insured individuals (7.6 per cent), it seems better to exclude such observations and restrict the analysis to the population exclusively insured through mandatory schemes.

Finally, another important feature worth noting is that the database has weight variables (natural in a sample created to be representative of the population). It is possible to ignore them without affecting the parameter estimates (Wooldridge 2002, Cameron and Trivedi 2005). This is more likely when sampling weights are solely a function of independent variables, or when the model can be respecified (including new variables or interactions). Otherwise, the estimated parameters would be biased. The problem with the use of a weighted dataset is that it leads to artificially small standard errors for regression coefficients and therefore to incorrect inferences on the significance of the different effects. We chose to exclude the weights from the analysis and to include in our regression the variables underlying the sample design of the PHS.

3.2 The variables

To capture health care utilization we use the total number of visits to doctors in the three months prior to the interview.
The question in the survey was: "How many times did you visit a physician in the last three months?". The survey includes a question about the type of doctor (general practitioner or specialist) seen in the last visit, which does not make it possible to classify all the visits made in the three-month period. Therefore, one limitation of this measure of demand for health care is that it encompasses consultations with general practitioners and specialist doctors, as well as emergency episodes. Another gap in the information concerns the nature of the provider of the consultations, in particular because it is not possible to identify whether they are public or private (with or without contract).

Table 1 presents the final covariates used in our analysis, clustered into groups encompassing health insurance status, socioeconomic characteristics, and health status. In addition, two further groups were included to control for geographic and seasonal effects. We selected them from the raw data available in the database10 according to their influence on medical care consumption, taking into account Grossman's (1972) health capital model of demand for health as well as the results of similar empirical studies (Cameron et al. 1988, Pohlmeier and Ulrich 1995, Vera-Hernández 1999, Deb and Trivedi 2002 and Lourenço 2007). Grossman (1972) constructed a model in which health demand results from an investment in a durable capital stock in order to produce future healthy time. According to microeconomic theory, the main factors taken into account in the estimation of a demand curve should be the budget constraint and individual preferences. Although economists have difficulties in understanding consumer incentives for health care, it is possible to find several channels through which the selected variables affect the number of doctor visits. The problem is that the quantity of visits is only partially a result of consumer incentives, because doctors play an important role in medical choices.
Depending on the kind of patient, we can have extreme cases of complete delegation of decision making to the doctor. For this reason, moral hazard effects are also relevant on the doctors' side (demand inducement).

10 Some information was excluded from the analysis, particularly the questions answered by only part of the sample according to the week of the interview.

Table 1: Description of the variables

Variable                Type        Description

Health insurance status variables
pubsub                  dummy       =1 if the individual is covered by a public subsystem
privsub                 dummy       =1 if the individual is covered by a private subsystem

Health status variables
sick                    dummy       =1 if the individual is sick
limitdays               count       number of days with temporary (not long-run) incapacity
limited                 dummy       =1 if the individual is limited/handicapped
rheumatism              dummy       =1 if the individual has rheumatism
osteoporosis            dummy       =1 if the individual has osteoporosis
cancer                  dummy       =1 if the individual has cancer
kidneystones            dummy       =1 if the individual has kidney stones
renalfailure            dummy       =1 if the individual has renal failure
emphysema               dummy       =1 if the individual has emphysema
cerebralhaemorrhage     dummy       =1 if the individual had a cerebral haemorrhage
infarction              dummy       =1 if the individual had an infarction
depressivedisorder      dummy       =1 if the individual has a depressive disorder
otherchronicaldisease   dummy       =1 if the individual has another chronic disease
highbloodpressure       dummy       =1 if the individual has high blood pressure
chronicpain             dummy       =1 if the individual has chronic pain
diabetes                dummy       =1 if the individual has diabetes
asthma                  dummy       =1 if the individual has asthma
stress                  dummy       =1 if the individual has taken sleeping pills or anxiety pills in the last two weeks
smoker                  dummy       =1 if the individual smokes daily
meals                   dummy       =1 if the individual has at least three meals a day

Socioeconomic and demographic variables
householdsize           count       size of the individual's household
age                     continuous  age in years
female                  dummy       =1 if the individual is female
educmax                 count       number of years of schooling successfully completed by the most educated person living in the household
lincome                 continuous  logarithm of equivalent monthly income in euros
single                  dummy       =1 if the individual is single and does not cohabit
student                 dummy       =1 if the individual is a student, is in a first job, or has an unpaid job
retired                 dummy       =1 if the individual is retired

Geographic variables
Norte                   dummy       =1 if the individual lives in the region "Norte" (NUTS II)
Lisboa                  dummy       =1 if the individual lives in the region "Lisboa" (NUTS II)
Alentejo                dummy       =1 if the individual lives in the region "Alentejo" (NUTS II)
Algarve                 dummy       =1 if the individual lives in the region "Algarve" (NUTS II)
Açores                  dummy       =1 if the individual lives in the region "Açores" (NUTS II)
Madeira                 dummy       =1 if the individual lives in the region "Madeira" (NUTS II)

Seasonal variables
winter                  dummy       =1 if the interview took place in the winter quarter
spring                  dummy       =1 if the interview took place in the spring quarter
summer                  dummy       =1 if the interview took place in the summer quarter

The underlying health status and the socioeconomic characteristics play a major role in preference formation. Health status also influences the constraints limiting the pursuit of preferences, since illness events usually imply a loss of income (although sometimes partially offset by sickness benefits). In the PHS, health status is only indirectly captured through some questions that reflect details about current medical conditions (e.g. sickness episodes and limited days) and the presence of chronic diseases or pains (e.g. rheumatism, cancer and diabetes). Besides including such variables, the consumption of sleeping and anxiety pills is used as a proxy for the level of exposure to stress, along with some other regressors related to attitudes with a potential impact on health, such as the number of meals and a dummy variable identifying smokers.
Despite being crude measures, these last regressors make it possible to capture some remaining health aspects and some unobserved influences.11;12

The variables representing demographic and socioeconomic features of the interviewees can influence the decision to seek health care both directly and indirectly, through their impact on health status. This is particularly evident when analysing the covariate age. According to Grossman (1972), age captures the depreciation of health capital, which influences health status and is an important factor influencing individual preferences. It is expected that the rate of depreciation increases as the individual gets older, at least after some point in the life cycle, making healthy time decrease. As a consequence, the demand for health care is expected to increase over the life cycle. At the same time, age is an extra variable that can be considered a health status proxy, since older individuals are, on average, less healthy and less efficient in producing health. We chose to control for age through a nonlinear relationship and by including variables that allow an assessment of its effect by gender.

Amongst the socioeconomic covariates, a gender dummy was included because gender is believed to influence the rate at which the health stock depreciates and the efficiency in producing healthy time. It is expected that health depends on biological differences between men and women through innate features, lifestyles and different attitudes towards health risk. Along the same lines, we also control for marital status with the inclusion of the covariate single. Besides the arguments of different lifestyles and attitudes towards risk, it is our understanding that some decisions, when taken by more than one person, benefit from advice and more information, which should influence health status and efficiency in producing healthy time.13

11 Winkelmann (2004) and Winkelmann (2006) also include the individual's subjective self-assessment of health status. The PHS provides that information (with the question "How well do you perceive your own health at the present time?", with responses "very good", "good", "fair", "poor" and "very poor"), but we chose not to use it. These variables are likely to create an endogeneity problem: the self-assessment of health status influences the consumption of medical care but is also influenced by that consumption, since the assessment is made after visiting the doctor. As suggested by Windmeijer and Santos-Silva (1997), we control for this subjective health evaluation by including long-term determinants of health (smoking and eating habits).
12 Engagement in sports activities is an alternative proxy for good health, but it was only available for a small part of the sample, and using it would imply a substantial decrease in sample size.

To control for educational level, we defined a variable with the number of years of schooling of the most educated person living in the household. It is expected that more educated people are more productive in the market as well as in the household; therefore, even if they seek more health, they need less medical care. On the other hand, different educational levels are associated with different opportunity costs and attitudes towards risk. This particular indicator was chosen, as an alternative to the usual number of years of schooling of each individual, because we believe that the decision about the number of visits to a doctor is at least partially a household decision, which again benefits from a better level of information.

The variables student and retired capture occupational status, which may explain some differences in the depreciation rate.
It is expected that a person who does not work faces lower opportunity costs (in terms of both time and income) of visiting a doctor than an individual with a regular job. Further, since hours of market and non-market time can have different values and the stock of health determines the total amount of time available to produce earnings and commodities, it is expected that more active individuals invest more in health capital. These particular variables can capture some income and age effects (typically, students are the youngest in the database and the retired the oldest).

13 Most studies include a slightly different variable that takes the value one if the person is married instead of single. The design of the survey and some previous results influenced the choice of this particular variable.

Another variable included in the model is the monthly equivalent income. In the dataset, income is measured by an ordinal variable with ten thresholds that indicate the category of the disposable net household income in the month prior to the interview (including wages, pensions, and all sorts of social security benefits). A common way to control for income effects is to include in the model a set of dummy variables, one for each category. We chose to construct a monthly income variable following the adjustment proposed by Pereira (1995), by interpolating grouped data and, in a second stage, taking into account differences in household characteristics.14;15 According to Grossman (1972), there are reasons to believe that medical utilization increases with income: "The higher a person's wage rate, the greater the value to him of an increase in healthy time". The idea is that the cost of being ill is higher. A converse argument is that the opportunity cost of going to the doctor is higher for higher wages. In addition, income also represents the ability to pay, as a proxy for wealth.
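The construction of the income variable can be sketched as follows, using the choices reported in footnote 14: the bracket midpoint as household income, a value of 2500 for the open-ended top bracket, and the square root equivalence scale. The bracket bounds used in the example, the function names and the use of the natural logarithm are assumptions made for illustration; the survey's actual thresholds are not reproduced here.

```python
import math

# Value assigned to the last, open-ended income bracket (as stated in footnote 14).
TOP_BRACKET_VALUE = 2500.0


def household_income(bracket):
    """Midpoint of a (lower, upper) bracket; upper=None marks the open-ended bracket."""
    lower, upper = bracket
    if upper is None:
        return TOP_BRACKET_VALUE
    return (lower + upper) / 2.0


def log_equivalent_income(bracket, household_size):
    """Log of household income divided by the square root of household size.

    The natural logarithm is an assumption of this sketch.
    """
    equivalent = household_income(bracket) / math.sqrt(household_size)
    return math.log(equivalent)


# Hypothetical brackets in euros, for illustration only.
print(log_equivalent_income((500, 700), 4))    # midpoint 600, four-person household
print(log_equivalent_income((2000, None), 1))  # open-ended bracket, single person
```

Changing `TOP_BRACKET_VALUE` and recomputing is one simple way to repeat the robustness check mentioned in the footnote.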
The variables Norte, Lisboa, Alentejo, Algarve, Açores and Madeira represent the region of residence and were included to control for possible behavioural differences in the demand and supply of health care services.[16] The regions encompass wide areas; nevertheless, when we compare them in terms of wealth or educational indicators we find large differences, which could justify different behaviours in seeking health care services (not totally captured at the individual level). Apart from this argument, the main reason to include these variables is that they proxy different access to medical care supply, since some regional services are organized differently. Note that on the mainland the five regions correspond to the five regional health administrations, and in the autonomous regions there are two different regional health services.[17]

To control for the period of the year in which the interview took place, we included the regressors spring, summer and winter (autumn being omitted). This is important because there may be seasonal differences in individuals' health status.

Finally, we use the health insurance dummy variables to distinguish between control and treatment groups. We have a control group "NHS", composed of individuals covered only by the default health system, and two different treatments, "Public subsystems" and "Private subsystems".[18] We did this by dividing the observations into three mutually exclusive groups according to their health care coverage: only the NHS (the control group, with which the other two are compared), the NHS plus a public subsystem, or the NHS plus a private subsystem.[19]

[14] To perform the interpolation of grouped data, we assumed that the midpoint of the interval to which the family belongs is the income of the household. It was necessary to assume a value of 2500 for the last, open-ended income bracket (we test the robustness to this value by considering other values). To normalize for family characteristics we used the square root scale, dividing the household income by the square root of household size.
[15] Note that it is necessary neither to deflate this variable nor to make it comparable across countries.
[16] In accordance with the NUTS II classification (the official territorial nomenclature for statistical analysis), Portugal is divided into seven regions. The survey includes data for all of them. Therefore, we use six dummies.
[17] Lourenço (2007) used a dummy variable for rural versus urban location, which could not be included on the basis of the data from the fourth PHIS. The difference, however, is partially controlled for by the region variables, since they have different proportions of rural and urban areas (e.g. Lisboa and Alentejo).
[18] Notice that each individual has only one kind of treatment.
[19] Despite having common features, both public and private groups include several subsystems.

These variables are of particular importance since the main goal of this work is to assess how a patient's use of medical consultations is affected by the type of health insurance. From a theoretical point of view, insurance is a price proxy; therefore, these variables together with income determine the budget constraint. Note that the differences between health systems as regards costs to beneficiaries (such as co-payments and non-reimbursements) work as direct prices and as mechanisms to control use, while delivery systems represent indirect costs of access. When compared to the NHS, the subsystems provide more benefits to their beneficiaries by decreasing the price per service faced by patients, which, whenever demand is elastic, increases their health care demand (Barros et al. 2008).[20] The estimation of this moral hazard effect is particularly difficult in a context of adverse selection, as it leads to endogeneity of the treatment variables and results in an overestimation of its impact. As noted by Barros et al. (2008), the exogeneity of both types of coverage removes the need for instrumental-variables estimation (for more details see Section 4.3).

[20] Some additional bias problems are related to supply-induced demand by health care providers.

3.3 An exploratory analysis of the data

Table 2 presents the empirical distribution of the dependent variable (y) and some statistics. As the table shows, the majority of observations belong to the NHS group, followed by the public subsystems. The dependent variable is a count variable (a non-negative integer, $y = 0, 1, 2, \ldots$) with a large proportion of zeros (half of the sample) as well as a long right tail of individuals who make heavy use of health care. These features make estimation particularly difficult, since flexible models are needed to accommodate them. For the whole sample, the average number of consultations is 1.01, and the average among those with at least one visit is 2.04. Moreover, the unconditional variance is more than three times the unconditional mean.[21]

[21] This is a sign of possible overdispersion, which is confirmed only when a conditional analysis is made.

When we analyse the average number of visits to a doctor by health insurance system, private subsystem beneficiaries turn out to be heavier users than the NHS and public subsystem groups. Indeed, a mean-comparison t-test indicates that the unconditional mean does not differ between the NHS and the public subsystems but does differ when one compares the NHS with the private subsystems. Table 3 presents the descriptive statistics of the explanatory variables by health insurance type.
The mean-comparison t-test indicates that most of the differences between the three types are significant, especially for the pre-determined socioeconomic variables. The NHS group has relatively fewer years of education and less income. In turn, public subsystem beneficiaries are younger (on average about 4 years younger than the other groups), and include a greater proportion of students and singles and a smaller share of retired persons. The private subsystems group has fewer women and a smaller household size. As regards the health status distributions of the three groups, the major differences are found between the public subsystem and the NHS. Public employees seem to be the healthiest, in particular when we analyse variables related to physical limitations (limitdays and limited) and to the presence of chronic diseases and pains.

Table 2: Empirical distribution of the dependent variable (relative frequency, %)

| y | TOTAL | NHS | Public sub. | Private sub. |
|---|---|---|---|---|
| 0 | 50.31 | 50.88 | 48.82 | 41.91 |
| 1 | 26.94 | 26.53 | 28.54 | 29.83 |
| 2 | 10.78 | 10.61 | 11.37 | 12.61 |
| 3 | 6.77 | 6.82 | 6.15 | 8.72 |
| 4 | 1.99 | 2.02 | 1.69 | 2.84 |
| 5 | 1.12 | 1.06 | 1.25 | 2.10 |
| 6 | 0.98 | 0.95 | 1.17 | 0.95 |
| 7 | 0.20 | 0.21 | 0.14 | 0.21 |
| 8 | 0.22 | 0.23 | 0.18 | 0.42 |
| 9 | 0.08 | 0.07 | 0.13 | 0.11 |
| 10 | 0.19 | 0.17 | 0.30 | 0.11 |
| 11-15 | 0.25 | 0.28 | 0.15 | 0.11 |
| 16-20 | 0.04 | 0.04 | 0.06 | 0.11 |
| 21-25 | 0.06 | 0.06 | 0.06 | 0.00 |
| 26-30 | 0.06 | 0.07 | 0.02 | 0.00 |
| Observations | 35,308 | 28,778 | 5,578 | 952 |
| Share of sample | 100% | 81.5% | 15.8% | 2.7% |
| Mean | 1.01 | 1.01 | 1.01 | 1.19 |
| Standard deviation | 1.77 | 1.80 | 1.64 | 1.61 |
| P-value (H0: $\bar{y}_{NHS} = \bar{y}_{subsystem}$) | - | - | 0.998 | 0.000 |

Table 3: Descriptive statistics by health insurance system

| Variable | NHS mean | NHS st.dev | Public mean | Public st.dev | Public p-value | Private mean | Private st.dev | Private p-value |
|---|---|---|---|---|---|---|---|---|
| Health status variables | | | | | | | | |
| sick | 0.007 | 0.001 | 0.005 | 0.001 | 0.008 | 0.005 | 0.002 | 0.363 |
| limitdays | 0.613 | 0.015 | 0.488 | 0.030 | 0.000 | 0.536 | 0.077 | 0.327 |
| limited | 0.016 | 0.001 | 0.004 | 0.001 | 0.000 | 0.006 | 0.003 | 0.000 |
| rheumatism | 0.168 | 0.002 | 0.120 | 0.004 | 0.000 | 0.134 | 0.011 | 0.003 |
| osteoporosis | 0.069 | 0.001 | 0.060 | 0.003 | 0.014 | 0.068 | 0.008 | 0.943 |
| cancer | 0.019 | 0.001 | 0.020 | 0.002 | 0.688 | 0.022 | 0.005 | 0.491 |
| kidneystones | 0.048 | 0.001 | 0.051 | 0.003 | 0.473 | 0.058 | 0.008 | 0.224 |
| renalfailure | 0.014 | 0.001 | 0.011 | 0.001 | 0.196 | 0.014 | 0.004 | 0.971 |
| emphysema | 0.034 | 0.001 | 0.022 | 0.002 | 0.000 | 0.022 | 0.005 | 0.015 |
| cerebralhaemorrhage | 0.018 | 0.001 | 0.013 | 0.002 | 0.000 | 0.020 | 0.005 | 0.654 |
| infarction | 0.014 | 0.001 | 0.011 | 0.001 | 0.103 | 0.014 | 0.004 | 0.956 |
| depressivedisorder | 0.074 | 0.002 | 0.074 | 0.004 | 0.934 | 0.082 | 0.009 | 0.395 |
| otherchronicaldisease | 0.319 | 0.003 | 0.297 | 0.006 | 0.001 | 0.317 | 0.015 | 0.928 |
| highbloodpressure | 0.221 | 0.002 | 0.178 | 0.005 | 0.000 | 0.222 | 0.013 | 0.977 |
| chronicpain | 0.148 | 0.002 | 0.110 | 0.004 | 0.000 | 0.119 | 0.010 | 0.006 |
| diabetes | 0.077 | 0.002 | 0.054 | 0.003 | 0.000 | 0.074 | 0.008 | 0.651 |
| asthma | 0.051 | 0.001 | 0.057 | 0.003 | 0.075 | 0.049 | 0.007 | 0.837 |
| stress | 0.119 | 0.002 | 0.104 | 0.004 | 0.001 | 0.124 | 0.011 | 0.631 |
| smoker | 0.162 | 0.002 | 0.138 | 0.005 | 0.000 | 0.179 | 0.012 | 0.200 |
| meals | 0.926 | 0.002 | 0.949 | 0.003 | 0.000 | 0.933 | 0.008 | 0.402 |
| Socioeconomic and demographic variables | | | | | | | | |
| householdsize | 3.387 | 0.009 | 3.342 | 0.017 | 0.020 | 3.100 | 0.037 | 0.000 |
| age | 42.044 | 0.131 | 38.984 | 0.285 | 0.000 | 42.946 | 0.685 | 0.196 |
| female | 0.515 | 0.003 | 0.537 | 0.007 | 0.003 | 0.419 | 0.016 | 0.000 |
| educmax | 8.112 | 0.026 | 11.949 | 0.061 | 0.000 | 11.625 | 0.147 | 0.000 |
| lincome | 6.048 | 0.003 | 6.624 | 0.007 | 0.000 | 6.669 | 0.019 | 0.000 |
| single | 0.350 | 0.003 | 0.391 | 0.007 | 0.000 | 0.322 | 0.015 | 0.076 |
| student | 0.164 | 0.002 | 0.247 | 0.006 | 0.000 | 0.188 | 0.013 | 0.065 |
| retired | 0.185 | 0.002 | 0.171 | 0.005 | 0.012 | 0.256 | 0.014 | 0.000 |

Note: The p-value refers to the null hypothesis that the mean of each variable does not differ across insurance types. The test is performed as a two-sample mean-comparison test (unpaired). For the comparison between the NHS and the Public subsystem we considered H0: $\bar{y}_{NHS} = \bar{y}_{Public\ subsystem}$; for the comparison between the NHS and the Private subsystem we considered H0: $\bar{y}_{NHS} = \bar{y}_{Private\ subsystem}$. Geographic and seasonal statistics (and p-values) not reported. Available from the authors upon request.

Moreover, frequent health problems (e.g.
high blood pressure, diabetes and stress) are relatively more common in the NHS and private subsystem groups. This can be partially related to age, which is lower in the public subsystems group. Additionally, it is worth highlighting that public employees seem to be less exposed to stress, and that the indicators related to attitudes show a smaller proportion of smokers and a higher average number of meals.

The regional distribution of the groups is also unequal in the full sample: most of the NHS individuals are located in the North; the public employees are concentrated in Lisbon, Alentejo and Azores; and the private subsystem group has relatively more beneficiaries in the regions of Lisbon and Algarve. These sample differences suggest that they must be accounted for more fully before health care demand can be appropriately compared across groups.

4 Econometric framework

The econometrics of count data has its own modelling strategies, in which discreteness and non-negativity are taken into account. Moreover, in the count world it is common for features other than location to depend on the covariates, so that the conditional expectation alone provides very little information about the impact of the regressors on the outcome of interest. In this context it is potentially interesting to study the effect of regressors not only on the mean but also on single outcomes and on the full distribution. Within the vast literature on count data it is possible to find two general categories of methods that allow a complete description of the conditional distribution of a count outcome. Following the early work of Hausman, Hall, and Griliches (1984), several fully parametric probabilistic models, like Poisson and negative binomial regressions, have been developed in order to describe the effect of the covariates on different points of a count variable.
These regressions allow inferences on all possible aspects of the outcome variable (including the computation of marginal probability effects). However, to do so, they impose restrictive parametric assumptions on the way the independent variables affect the outcome variable. As a consequence, this approach usually faces a lack of robustness, even when flexible models like the hurdle or latent class models are applied.

Given these limitations, it can be attractive to use non- or semiparametric techniques that freely approximate the conditional distribution. This can be achieved by estimating conditional quantile functions, a technique that has long been applied in the context of continuous regression (Koenker and Bassett 1978). Following the contributions of Manski (1975), Manski (1985) and Horowitz (1992) regarding binary models, some effort is being made to extend the method to discrete data. Recently, the seminal work of Machado and Santos-Silva (2005) succeeded in applying the quantile framework to count data models. Since our main aim is to assess the effect of the subsystems on different parts of the outcome distribution without imposing a probabilistic structure, the "Quantile for counts" regression model is a natural choice.[22]

4.1 Quantile regression for counts

Let $y$ be a count random variable and let its $\tau$-quantile be defined as:

$$Q_y(\tau) = \min\{\eta \mid P(y \le \eta) \ge \tau\}, \quad 0 < \tau < 1 \tag{1}$$

The $\tau$-quantile has the same discrete support as $y$ and cannot be a continuous function of the covariates ($x$). Machado and Santos-Silva (2005) suggested a procedure known as "jittering" to artificially impose some degree of smoothness. The basic idea is to build a continuous auxiliary variable ($y^*$) whose quantiles have a known one-to-one relationship with the quantiles of the count variable of interest.
The variable $y^*$ is obtained by adding to the count variable a uniform random variable independent of $y$ and $x$:[23]

$$y^* = y + u, \quad u \sim \text{uniform}[0, 1) \tag{2}$$

The continuity problem of the dependent variable is solved, but the derivatives are not continuous at integer values of $y^*$. Machado and Santos-Silva (2005) proved that, given some regularity conditions, valid asymptotic inference is possible. Among those conditions, the existence of at least one continuously distributed covariate is particularly relevant. Standard quantile regression is applied to a monotonic transformation of $y^*$ that ensures that the estimated quantiles are non-negative and that the transformed quantile is linear in the parameters of a vector of regressors.

In order to implement the procedure, the authors suggest the following parametric representation of the $\tau$-quantile of $y^*$:

$$Q_{y^*}(\tau \mid x) = \tau + \exp\big(x'\beta(\tau)\big), \quad 0 < \tau < 1 \tag{3}$$

The reason for adding $\tau$ to the right-hand side is that $Q_{y^*}(\tau \mid x)$ is bounded from below at $\tau$, owing to the way $y^*$ is constructed. The exponential form is traditionally assumed in count data models. We believe that this specification provides a good parsimonious approximation to the unknown conditional quantile functions.

[22] In order to better understand its advantages (and disadvantages), Moreira (2008) compares the implications drawn from the quantile regression approach with those from parametric count data models that have been used quite extensively in the analysis of health care.
[23] Machado and Santos-Silva (2005) showed that there is little loss of generality in assuming that $u$ is uniform. In fact, they argue that it is possible to choose another distribution for $u$ as long as it has support on $[0, 1)$ and a density function bounded away from 0. The advantages of using a uniform distribution are purely algebraic and computational.
The linear transformation is specified as:

$$Q_{T(y^*;\tau)}(\tau \mid x) = x'\beta(\tau) \tag{4}$$

where

$$T(y^*;\tau) = \begin{cases} \log(y^* - \tau) & \text{for } y^* > \tau \\ \log(\varepsilon) & \text{for } y^* \le \tau \end{cases}$$

with $\varepsilon$ a small positive number ($0 < \varepsilon < \tau$).[24] This is feasible because quantiles are equivariant to monotonic transformations and to censoring from below up to the quantile of interest. The vector of coefficients $\beta(\tau)$ is obtained as the solution to a standard quantile regression on the transformed variable, minimizing an asymmetrically weighted sum of absolute errors:

$$\min_{\beta} \sum_{i=1}^{n} \rho_\tau\big(T(y_i^*;\tau) - x_i'\beta\big), \quad \rho_\tau(v) = v\,[\tau - I(v < 0)] \tag{5}$$

Machado and Santos-Silva (2005) proved that, although the conditional quantile function is not differentiable everywhere, the estimator is consistent and asymptotically normal:

$$\sqrt{n}\,\big[\hat{\beta}(\tau) - \beta(\tau)\big] \xrightarrow{D} N\big(0,\; D^{-1} A D^{-1}\big) \tag{6}$$

with $A = \tau(1-\tau)\,E(XX')$ and $D = E\big[f_T(X'\beta(\tau) \mid X)\,XX'\big]$, where $f_T$ denotes the conditional density of $T(y^*;\tau)$ given $X$.

Because "noise" has been artificially created for technical reasons, Machado and Santos-Silva (2005) suggest a Monte Carlo procedure, "average-jittering", which consists in obtaining an estimator that is the average over $m$ independent jittered samples of the same size. The samples differ only in the dependent variable $y^*$, which is created as the sum of $y$ (constant across samples) and one of $m$ different draws from the uniform distribution. The main advantage of this procedure is that the resulting estimator is more efficient than the one obtained from a single draw, and a misspecification-robust estimator of the covariance matrix is available. The importance of this procedure derives from the possibility of performing inference on the variable of interest $y$.

[24] We will use 1.0E-10, as Machado and Santos-Silva (2005) did.
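The jittering steps above can be illustrated with a minimal sketch. It is restricted to an intercept-only model, an assumption made so that the quantile regression in (5) reduces to a sample quantile of the transformed variable and no optimisation library is needed; the full model with covariates requires a quantile regression solver. The toy counts are invented for illustration and are not the survey data.

```python
import math
import random

EPS = 1e-10  # small positive constant used when y* <= tau, as in the paper


def transform(y_star, tau):
    """T(y*; tau): log(y* - tau) above tau, log(EPS) otherwise."""
    return math.log(y_star - tau) if y_star > tau else math.log(EPS)


def check_loss(v, tau):
    """rho_tau(v) = v * (tau - 1[v < 0])."""
    return v * (tau - (1.0 if v < 0 else 0.0))


def intercept_only_fit(t_values, tau):
    """A minimiser of sum rho_tau(t_i - b) over a constant b: a sample tau-quantile."""
    ordered = sorted(t_values)
    k = max(math.ceil(tau * len(ordered)) - 1, 0)
    return ordered[k]


def average_jittering(counts, tau, m, seed=0):
    """Average the intercept estimate over m independent jittered samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(m):
        jittered = [y + rng.random() for y in counts]  # y* = y + u, u ~ U[0, 1)
        t_values = [transform(z, tau) for z in jittered]
        estimates.append(intercept_only_fit(t_values, tau))
    return sum(estimates) / m


# Toy counts (not the survey data): 50% zeros, 30% ones, 20% twos.
counts = [0] * 500 + [1] * 300 + [2] * 200
tau = 0.6
b0 = average_jittering(counts, tau, m=50)
q_star = tau + math.exp(b0)  # implied tau-quantile of the jittered variable
print(round(q_star, 2))
```

For these toy counts the 0.6-quantile of the jittered variable lies about one third of the way between 1 and 2, because 50 per cent of the mass is below 1 and the next 30 per cent is spread uniformly over [1, 2).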
Machado and Santos-Silva (2005) showed that marginal effects on the smoothed variable $y^*$ are easily obtained and interpreted, and that there is a correspondence between the two quantile functions: $Q_y(\tau \mid x) = \lceil Q_{y^*}(\tau \mid x) - 1 \rceil$, where $\lceil a \rceil$ denotes the ceiling function (it returns the smallest integer greater than or equal to $a$). Because of the monotone transformation of $y^*$, $T(y^*;\tau)$, the relationship between the coefficient estimates $\hat{\beta}(\tau)$ and $y^*$ and $y$ is essentially non-linear, making it hard to interpret $\hat{\beta}(\tau)$ in terms of $y^*$ and $y$. It is possible to test the null hypothesis that a covariate has no effect on $Q_y(\tau \mid x)$, because this is equivalent to testing whether the variable has no impact on $Q_{y^*}(\tau \mid x)$. The problem arises when the variable is significant in $Q_{y^*}(\tau \mid x)$: in that case it could still be non-significant in the conditional quantile of $y$.[25] This occurs because different quantiles of $y^*$ correspond to the same quantile of $y$. In fact, a change in $x_j$ will affect $Q_y(\tau \mid x)$ only if it is capable of changing the integer part of $Q_{y^*}(\tau \mid x)$; Machado and Santos-Silva (2005) call this the "magnifying glass effect" of $Q_{y^*}(\tau \mid x)$.

[25] It is not possible to just look at $\beta_j$, as it becomes necessary to evaluate case by case whether a given change in $x_j$ induces changes in the $\tau$-quantile of $y$. Inference about the partial effect of a particular variation of the regressor, given that all other variables remain fixed at $\tilde{x}$, is made through the following expression: $\Delta_j Q_y(\tau \mid \tilde{x}; x_j^0, x_j^1) = Q_y(\tau \mid \tilde{x}; x_j^1) - Q_y(\tau \mid \tilde{x}; x_j^0)$.

4.2 Empirical specification: treatment effects

Our empirical work presents two main differences relative to general treatment effects approaches. First, the study concerns a potential reform rather than an actual one, which is the usual case. We can state our interest as measuring the potential impact of the elimination of double coverage (particularly the insurance plans provided to public employees) on the demand for health services, i.e. the potential decrease in the demand for health services amongst the subsystems' beneficiaries due to their double coverage. To proxy such impact we study the differences in the consumption of doctor consultations between the NHS and the public and private subsystems. Generally, the estimation of the impact of a reform occurs after its implementation and uses panel data comparing the outcome before and after the reform (Winkelmann 2006). In that case, the typical empirical strategies include pre-reform/post-reform differences-in-differences, where one compares the changes in utilization between affected and unaffected sub-populations. A drawback of our analysis relative to more general approaches is that we estimate the current impact (the impact in 2005), which may change if the groups follow different time paths. An advantage is that we analyse not only the average effect but also the impact on the whole outcome distribution. With quantile regression we are able to see whether the policy impact differs depending on the realization of the dependent variable.

As laid out in the previous section, and now presented in a more specific way, the conditional quantiles are defined as[26]

$$Q_{y^*}(\tau \mid x) = \tau + \exp\big[\beta_0(\tau) + \beta_1(\tau)\,pubsub_i + \beta_2(\tau)\,privsub_i + \gamma(\tau)'z_i\big], \quad 0 < \tau < 1, \quad i = 1, 2, \ldots, 35{,}308 \tag{7}$$

where $pubsub_i$ and $privsub_i$ represent persons "treated" as belonging to the "public insurance health subsystem" and the "private insurance health subsystem", respectively. The vector $z_i$ includes all other characteristics that were controlled for in this regression. In addition to all independent variables referred to in section 4.2, we use a third-order polynomial in age and a third-order polynomial in age crossed with the gender variable (age×female). Note that it is absolutely crucial in this analysis to assume ignorability of the treatment conditional on a set of covariates.
The alternative to assuming ignorability, namely estimating the treatment effect with a difference in sample means, is clearly inadequate since, as we tested, there are large differences between control and treatment groups in their baseline characteristics. Moreover, when selecting the variables we guarantee that treated and untreated groups have a common support by using only observations in the intersection of the domains.[27] This procedure leads us to exclude individuals over 80 years old and a variable related to unemployment status.

[26] The vector of coefficients is now $\beta(\tau) = [\beta_0(\tau), \beta_1(\tau), \beta_2(\tau), \gamma(\tau)']'$, where $\beta_0(\tau)$, $\beta_1(\tau)$ and $\beta_2(\tau)$ are scalars and $\gamma(\tau)$ is a vector.
[27] It is necessary to have subpopulations in each state: NHS, privsub and pubsub. See (Wooldridge) for details.

We rule out a selection bias problem. The exogeneity of the treatments holds because it is very implausible that individuals want to work as public employees or in companies with private subsystems just to benefit from this additional health insurance (Barros et al. 2008). Note that they have an alternative, since we are studying a country that provides universal coverage through the NHS. Moreover, it is also unlikely that employers choose individuals on the basis of unobservable variables related to their health or even household health. The only requirement is that the potential employee (and not his household) is physically capable and has no infectious disease, which can be controlled for through our set of pre-determined variables.

Nevertheless, even controlling for a large set of health status variables, this kind of procedure can still underestimate $\beta_1(\tau)$ and $\beta_2(\tau)$ if the subsystems' beneficiaries enjoy more or better treatment than the NHS beneficiaries. This is because, over a lifetime, better health care would translate into a significant accumulation of health advantages not totally captured in $z_i$. Following the advice of Barros et al. (2008), we will test this possibility by restricting the analysis to young beneficiaries, who have not yet had time to accumulate such advantages, and comparing the results with the larger sample.

Note that the coefficients $\beta_1(\tau)$ and $\beta_2(\tau)$ cannot be attributed entirely to moral hazard behaviour; they capture instead a joint effect of moral hazard from the beneficiaries and supply-induced demand from the providers. The latter is more likely because doctors may require more tests in order to justify more visits. According to Barros et al. (2008), since the payments to subsystem providers are relatively low, the magnitude of the effect will be very small. Independently of that, what matters here is to capture how much the systems' design increases the consumption of resources related to consultations.

5 Results

The results were obtained with the qcount package for STATA (Miranda 2006), with some slight adjustments. Regarding the number of jittered samples used to obtain the results, preliminary experiments showed that the coefficients are not very sensitive to the particular sample of uniform random variables used to jitter the data: with 1500 samples almost no changes were detected, either in coefficients or in standard deviations.[28]

[28] This result was not surprising given the high number of observations in our database.

The decision on which quantiles to compute took into account the problem under analysis and the empirical distribution of the relevant outcome. Since the marginal quantiles are zero for all $\tau \le 0.50$, it is more interesting to compute conditional quantiles in the upper tail of the distribution, where the effect of covariates changes rapidly. Note that in the lower tail, a variation in the conditional quantiles of the artificial outcome, $Q_{y^*}(\tau \mid x)$, may be mostly due to the random noise that has been added. Therefore, we expect the quantiles there to be flatter. Moreover, it is more interesting to look at the behaviour of individuals who make heavy use of health care.
In this scenario, and although we still present the first quartile, we focus on quantiles above the median, computing results for each decile after the median.

Table 4 presents the parameter estimates of the quantile regressions (the corresponding standard errors are shown in Table A1 of the Appendix). As we can see, quantile regression does not restrict the way regressors affect different regions of the distribution, allowing us to assess whether health insurance systems have significant and variable impacts over the different outcomes. The signs of the regressors do not switch across the different quantiles (except for the dummy summer, whose effect, albeit highly insignificant, is positive in the lower tail and becomes negative in the upper quantiles). Regarding statistical significance, all variables are significant in at least one quantile.

In the group of health status regressors, the covariates that control for current medical conditions are highly significant, as expected. Among the chronic disease dummies, only the cerebral haemorrhage effect is not significant in quantiles above the 0.70th quantile. Concerning indicators related to attitudes with an impact on health status, we find that both the number of meals and smoking habits are insignificant in the upper tail of the distribution. In the case of socioeconomic characteristics, the household size, education and income effects are significant at the one per cent level except for the last decile. Most of the variables related to the region of residence and to seasonality have a significant impact on the consumption of visits to doctors.
Table 4: Quantile regression results: coefficients

| Variable | $\hat{\beta}(0.25)$ | $\hat{\beta}(0.50)$ | $\hat{\beta}(0.60)$ | $\hat{\beta}(0.70)$ | $\hat{\beta}(0.80)$ | $\hat{\beta}(0.90)$ |
|---|---|---|---|---|---|---|
| Health insurance status variables | | | | | | |
| pubsub | 0.078 | 0.088 | 0.095 | 0.096 | 0.073 | 0.055† |
| privsub | 0.200 | 0.229 | 0.247 | 0.232 | 0.185 | 0.148 |
| Health status variables | | | | | | |
| sick | 0.680 | 0.602 | 0.590 | 0.601 | 0.547 | 0.772 |
| limitdays | 0.071 | 0.073 | 0.076 | 0.074 | 0.071 | 0.073 |
| limited | 0.136† | 0.205† | 0.247 | 0.321 | 0.335 | 0.368 |
| rheumatism | 0.134 | 0.140 | 0.139 | 0.140 | 0.148 | 0.150 |
| osteoporosis | 0.282 | 0.207 | 0.182 | 0.152 | 0.115 | 0.091 |
| cancer | 0.468 | 0.464 | 0.430 | 0.386 | 0.403 | 0.525 |
| kidneystones | 0.149 | 0.154 | 0.175 | 0.188 | 0.221 | 0.211 |
| renalfailure | 0.167‡ | 0.220 | 0.212 | 0.226 | 0.260 | 0.234 |
| emphysema | 0.090‡ | 0.210 | 0.222 | 0.227 | 0.232 | 0.238 |
| cerebralhaemorrhage | 0.133† | 0.135† | 0.134† | 0.163 | 0.191 | 0.189 |
| infarction | 0.228 | 0.327 | 0.343 | 0.341 | 0.290 | 0.217 |
| depressivedisorder | 0.187 | 0.231 | 0.247 | 0.253 | 0.246 | 0.248 |
| otherchronicaldisease | 0.435 | 0.451 | 0.471 | 0.458 | 0.384 | 0.352 |
| highbloodpressure | 0.407 | 0.382 | 0.367 | 0.322 | 0.260 | 0.208 |
| chronicpain | 0.172 | 0.197 | 0.220 | 0.230 | 0.221 | 0.224 |
| diabetes | 0.449 | 0.368 | 0.340 | 0.316 | 0.293 | 0.292 |
| asthma | 0.290 | 0.325 | 0.339 | 0.340 | 0.275 | 0.230 |
| stress | 0.441 | 0.360 | 0.342 | 0.305 | 0.293 | 0.250 |
| smoker | -0.205 | -0.176 | -0.168 | -0.154 | -0.095 | -0.034‡ |
| meals | 0.188 | 0.158 | 0.129 | 0.114 | 0.081† | 0.070† |
| Socioeconomic and demographic variables | | | | | | |
| householdsize | -0.063 | -0.060 | -0.060 | -0.060 | -0.039 | -0.017† |
| age | -1.072 | -1.014 | -1.048 | -1.071 | -0.727 | -0.559 |
| age² | 0.234 | 0.222 | 0.231 | 0.241 | 0.160 | 0.121 |
| age³ | -0.015 | -0.014 | -0.014 | -0.015 | -0.010 | -0.007 |
| age×female | 0.558 | 0.580 | 0.641 | 0.750 | 0.490 | 0.335 |
| (age×female)² | -0.120 | -0.129 | -0.146 | -0.181 | -0.116 | -0.078 |
| (age×female)³ | 0.007‡ | 0.008 | 0.009 | 0.012 | 0.007 | 0.005† |
| female | -0.321† | -0.321 | -0.345 | -0.357 | -0.216 | -0.091‡ |
| educmax | 0.010 | 0.014 | 0.015 | 0.015 | 0.010 | 0.005† |
| lincome | 0.069 | 0.058 | 0.060 | 0.060 | 0.053 | 0.030‡ |
| single | -0.218 | -0.198 | -0.202 | -0.218 | -0.164 | -0.116 |
| student | -0.252 | -0.246 | -0.272 | -0.253 | -0.179 | -0.172 |
| retired | 0.168 | 0.149 | 0.134 | 0.115 | 0.120 | 0.143 |

Notes: Coefficients marked with ‡ and † are not significant at the 5 and 1 per cent level, respectively.
Standard errors can be found in Table A1 of the Appendix. Geographic and seasonal controls not reported. Available from the authors upon request.

We now turn to the analysis of the effects of the regressors beyond their statistical significance. A direct reading of Table 4 may suggest some misleading conclusions. Note that $\hat{\beta}(\tau)$ is a vector of linear partial effects on $Q_{T(y^*;\tau)}(\tau \mid x)$. To fully understand the impacts, the analysis should be made through $Q_{y^*}(\tau \mid x)$, which is not so easily computed owing to its non-linearity as well as to the fact that it is a function of the quantile. Being non-linear, the parameter provides an incomplete picture of the covariates' effects on the shape of the distribution. And being a function of $\tau$ implies, for example, that a variable with the same estimated coefficient in all quantiles will have a proportional effect that varies with the quantile.

To take the non-linearity into account, Table A2 of the Appendix presents estimates of the partial effects computed by setting the continuous (and count) variables at the sample mean and the dummy variables equal to zero ($\tilde{x}$).[29] Inference for the marginal effect of a dummy $x_j$, given that all other variables remain fixed at $\tilde{x}$, is made through

$$Q_{y^*}(\tau \mid \tilde{x}; x_j = 1) - Q_{y^*}(\tau \mid \tilde{x}; x_j = 0) = \big[\exp(\beta_j(\tau)) - 1\big]\,\big[Q_{y^*}(\tau \mid \tilde{x}) - \tau\big]$$

and for a continuous variable $x_l$ the marginal effect is $\beta_l(\tau)\,[Q_{y^*}(\tau \mid \tilde{x}) - \tau]$.[30] To facilitate the comparison of the effects across the different models we also estimate the semi-elasticities of $Q_{y^*}(\tau \mid x)$ with respect to the covariates. This is done by simply dividing the partial effect by $Q_{y^*}(\tau \mid \tilde{x})$. Table 5 shows the results.

Within the quantile regression framework, a variable with a significant coefficient on the $y^*$ quantile may not affect a particular conditional $y$ quantile. But when the $y^*$ quantile is found to depend on the covariate for several quantiles, it should be possible to detect a subpopulation for which the semi-elasticity on the $y$ quantile is different from zero (Miranda 2008).
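The partial-effect and semi-elasticity formulas above can be checked numerically. The sketch below applies them to the public subsystem dummy, using the Table 4 coefficients and the baseline values of the jittered quantile that the paper reports for the default individual (0.79 at the median and 0.97 at the sixth decile). The function names are illustrative, and the inputs are rounded published figures, so the outputs are approximate.

```python
import math


def dummy_partial_effect(beta_j, q_base, tau):
    """[exp(beta_j) - 1] * [Q_{y*}(tau | x~) - tau]."""
    return (math.exp(beta_j) - 1.0) * (q_base - tau)


def semi_elasticity(partial_effect, q_base):
    """Partial effect divided by the baseline quantile of y*."""
    return partial_effect / q_base


def count_quantile(q_star_value):
    """Q_y = ceil(Q_{y*} - 1), the correspondence from Section 4.1."""
    return math.ceil(q_star_value - 1.0)


# tau -> (pubsub coefficient from Table 4, baseline Q_{y*} reported in the text)
reported = {0.50: (0.088, 0.79), 0.60: (0.095, 0.97)}
results = {}
for tau, (beta_pubsub, q_base) in reported.items():
    effect = dummy_partial_effect(beta_pubsub, q_base, tau)
    q_treated = q_base + effect
    results[tau] = (round(q_treated, 2),
                    round(semi_elasticity(effect, q_base), 3),
                    count_quantile(q_base),
                    count_quantile(q_treated))
    print(tau, results[tau])
# 0.5 (0.82, 0.034, 0, 0)
# 0.6 (1.01, 0.038, 0, 1)
```

The semi-elasticities match the pubsub entries of Table 5 at the same quantiles, and the last two elements of each tuple show that the count quantile moves only when the jittered quantile crosses an integer.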
For example, if we consider the median and compute $Q_{y^*}(0.50 \mid x = \tilde{x})$ we obtain 0.79; as a consequence, $Q_y(0.50 \mid x = \tilde{x})$ equals zero consultations. When the typical individual ($\tilde{x}$) changes to the public health plan, $Q_{y^*}$ is expected to increase to 0.82, leaving $Q_y$ unchanged. Hence the marginal effect of the public subsystem on the $y$ quantile is zero, even though it has a significant positive effect on the $y^*$ quantile. Conversely, if we use the sixth decile, $Q_{y^*}(0.60 \mid x = \tilde{x})$ equals 0.97 and, as a consequence, $Q_y(0.60 \mid x = \tilde{x})$ also equals zero consultations. But now a change from the NHS to a public subsystem increases $Q_{y^*}$ to 1.01, making $Q_y$ equal to one consultation.

[29] The default individual is a healthy man with an average household size, educational level and income, not single or retired, living in the Centre region of Portugal and interviewed in autumn. Also note that the vector $\tilde{x}$ is set with the dummies pubsub and privsub equal to zero, so the default individual has the NHS insurance plan (is from the control group).
[30] The marginal effects of some covariates are calculated in a different way.
This is the case of the income that is computed as lincome( )1=income[Qy( jex) ], the "age when male" that is set as [ age( )+2 age2( ) age+ age3( )age2][Qy( jex) ], and the "age when female" that is [ age( )+ agexfemale( )+2( age2( )+ (agexfemale)2( ))  age + ( age3( ) + (agexfemale)3( ))  age2]  [Qy( jex) ]: 24 Table 5: Quantile regression results: semi-elasticities SE(0.25) SE(0.50) SE(0.60) SE(0.70) SE(0.80) SE(0.90) Health insurance status variables pubsub 0.031 0.034 0.038 0.042 0.037 0.032 privsub 0.083 0.095 0.107 0.109 0.100 0.092 Health status variables sick 0.367 0.304 0.307 0.343 0.359 0.668 limitdays 0.028 0.028 0.030 0.032 0.036 0.043 limited 0.055 0.084 0.107 0.158 0.196 0.256 rheumatism 0.054 0.055 0.057 0.063 0.078 0.093 osteoporosis 0.123 0.084 0.076 0.069 0.060 0.055 cancer 0.225 0.217 0.205 0.196 0.245 0.397 kidneystones 0.061 0.061 0.073 0.086 0.122 0.135 renalfailure 0.069 0.091 0.090 0.106 0.146 0.151 emphysema 0.035 0.086 0.095 0.106 0.129 0.154 cerebralhaemorrhage 0.054 0.053 0.055 0.074 0.104 0.120 infarction 0.131 0.142 0.156 0.169 0.166 0.139 depressivedisorder 0.078 0.096 0.107 0.120 0.137 0.161 otherchronicaldisease 0.205 0.210 0.230 0.242 0.231 0.242 highbloodpressure 0.189 0.171 0.169 0.158 0.146 0.133 chronicpain 0.071 0.080 0.094 0.108 0.122 0.144 diabetes 0.214 0.164 0.154 0.155 0.168 0.195 asthma 0.127 0.141 0.154 0.169 0.156 0.148 stress 0.209 0.159 0.156 0.149 0.168 0.163 smoker -0.070 -0.060 -0.059 -0.059 -0.044 -0.019 meals 0.078 0.063 0.052 0.050 0.042 0.042 Socioeconomic and demographic variables householdsize -0.024 -0.022 -0.023 -0.025 -0.019 -0.010 age when male 0.005 0.004 0.005 0.006 0.004 0.003 age when female 0.001 0.000 0.000 -0.001 -0.001 -0.001 female 0.184 0.178 0.200 0.243 0.195 0.190 educmax 0.004 0.005 0.006 0.006 0.005 0.003 income 0.004 0.003 0.004 0.004 0.004 0.003 single -0.074 -0.066 -0.070 -0.082 -0.075 -0.063 student -0.084 -0.080 -0.091 -0.093 -0.081 -0.091 retired 0.069 0.059 0.055 0.051 0.063 0.088 
Notes: Semi-elasticities are calculated for a vector $\tilde{x}$ containing the mean value of the continuous (and count) variables and zeros for the dummy variables. The type of the covariates is presented in Table 2 and the mean values can be obtained from Table 3. Geographic and seasonal controls are not reported; they are available from the authors upon request.

Starting with the analysis of the insurance treatment effects, they do not change much across the estimated quantiles, but a pattern is visible: both public and private subsystems have a positive effect on the number of doctor visits that increases until the 0.60-0.70 $y^*$-quantiles and decreases thereafter (Table 5). The similarity between the patterns of the two subsystems is clear when we compute the ratio between them across quantiles, since it remains almost unchanged. In fact, the effect of private insurance plans is between 2.6 and 2.9 times higher than that of the plans of public employees. Therefore, health insurance double coverage does lead to additional demand for health care (visits). The origin of double coverage is also quite important, as double coverage from private subsystems induces much more demand than double coverage from public subsystems.

To better understand the effect of health subsystems on the demand for health care, we used the point estimates to predict the $y$-quantile (note that here we use the relevant outcome) for each observation in a simulation exercise in which all variables are set equal to their actual values, except the health insurance status. For the latter, three possibilities are considered: no treatment, public subsystem or private subsystem. The results, measured by relative frequencies, are presented in Table 6. Given that half of the sample has zero visits, it is not surprising that the first conditional quartile is zero for almost all observations.
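As a quick check of the ratio quoted above, the following minimal sketch recomputes the private-to-public ratio of semi-elasticities at each quantile. The values are copied from the pubsub and privsub rows of Table 5; the variable and function names are illustrative and do not come from the paper.

```python
# Semi-elasticities copied from Table 5 (full sample), one entry per quantile.
QUANTILES = [0.25, 0.50, 0.60, 0.70, 0.80, 0.90]
PUBSUB = [0.031, 0.034, 0.038, 0.042, 0.037, 0.032]
PRIVSUB = [0.083, 0.095, 0.107, 0.109, 0.100, 0.092]


def private_to_public_ratios(privsub, pubsub):
    """Return the private/public semi-elasticity ratio at each quantile."""
    return [priv / pub for priv, pub in zip(privsub, pubsub)]


ratios = private_to_public_ratios(PRIVSUB, PUBSUB)
for tau, ratio in zip(QUANTILES, ratios):
    print(f"tau = {tau:.2f}: private/public ratio = {ratio:.2f}")
```

Because Table 5 reports the semi-elasticities to three decimals only, the ratios are themselves approximate; they all fall in the 2.6 to 2.9 band mentioned in the text.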
When we compare the estimates for different quantiles, we see that the distribution changes differently across health insurance plans. For example, the proportion of individuals with a predicted quantile of zero or one consultation is always lower with the treatment (either public or private) than with the NHS alone, but these relative effects change with the quantile. In particular, the proportion of NHS individuals is 91.0, 70.7 and 23.4 per cent for the 0.50, 0.75 and 0.90 $y$-quantiles, respectively, while with the "public subsystem" the proportion is 89.6, 66.4 and 19.5 per cent for the same quantiles. This means that holding double coverage causes a decreasing path in the difference in the proportion of individuals with a certain (increasing) number of visits, and that this path is steeper from the 0.50 to the 0.75 $y$-quantile than from the 0.75 to the 0.90 $y$-quantile.

Table 6: Frequencies of estimated quantiles for the number of visits to a doctor (per cent)

                    0      1      2      3      4      5      6      7      8      9    ≥10
NHS
Q̂y(0.25|x)      89.4    8.3    1.4    0.4    0.2    0.1    0.1    0.0    0.0    0.0    0.0
Q̂y(0.50|x)      58.2   32.8    5.5    1.7    0.7    0.4    0.2    0.1    0.1    0.1    0.2
Q̂y(0.75|x)       1.3   69.3   17.9    5.6    2.5    1.1    0.7    0.5    0.2    0.2    0.7
Q̂y(0.90|x)       0.0   23.4   46.3   15.1    6.2    3.1    1.8    1.1    0.7    0.5    1.8
Public subsystem
Q̂y(0.25|x)      87.9    9.4    1.6    0.5    0.3    0.1    0.1    0.0    0.0    0.0    0.0
Q̂y(0.50|x)      54.0   35.7    6.3    2.0    0.9    0.5    0.2    0.2    0.1    0.1    0.2
Q̂y(0.75|x)       0.7   65.7   20.1    6.5    2.9    1.4    0.8    0.5    0.3    0.2    0.8
Q̂y(0.90|x)       0.0   19.5   47.2   16.6    6.7    3.5    1.9    1.2    0.8    0.6    2.0
Private subsystem
Q̂y(0.25|x)      83.6   12.3    2.4    0.8    0.3    0.2    0.1    0.1    0.1    0.1    0.1
Q̂y(0.50|x)      46.8   40.3    7.5    2.6    1.2    0.6    0.3    0.2    0.1    0.1    0.3
Q̂y(0.75|x)       0.2   60.0   23.4    7.6    3.6    1.8    0.9    0.7    0.5    0.3    1.0
Q̂y(0.90|x)       0.0   13.2   47.7   19.5    7.7    4.0    2.3    1.5    1.0    0.7    2.4

Notes: Estimates are based on a simulation exercise that starts by predicting the $y^*$-quantile for all 35,308 individuals, setting all control variables at their actual values except the health insurance status, which is set to each of the three possible cases.
After that, the $y$-quantiles are computed by applying $Q_y(\tau \mid x) = \lceil Q_{y^*}(\tau \mid x) - 1 \rceil$ and tabulated by their possible values.

Regarding the effects of the health status variables as a whole, most of the regressors have a positive effect that increases with $\tau$. Being sick seems to be especially important in determining whether or not the individual visits a doctor but, taking into consideration the results for the last decile, it is much more important in explaining subsequent visits. The same kind of behaviour is observed for the handicapped effect (limited), since for the first quantiles it is not significant whereas for higher levels of consumption it becomes a very important explanatory variable. The two paths differ, however: the impact of sick rises sharply only at the 0.90 $y^*$-quantile, whereas the variable limited becomes gradually more relevant. Amongst the chronic diseases we found evidence of a positive effect that increases along the estimated quantiles, except for the dummy osteoporosis, which has a decreasing impact, and for infarction, otherchronicaldisease, highbloodpressure, diabetes and asthma, which have a fairly constant effect in the different parts of the distribution. The proxy for the level of exposure to stress has an effect that does not vary much across quantiles, and the other regressors related to attitudes towards health care have decreasing effects. The negative and decreasing impact of being a smoker contrasts with the results of Lourenço (2007), who, although using a slightly different variable, found positive effects on the consumption of doctor visits. Another interesting result is that the habit of eating more often during the day also has a positive impact. These results show that individuals who take better care of their health by not smoking and having a higher number of meals also complement their care by being more pro-active in visiting doctors.
These attitudes towards health care seem to more than offset the impact of improved health (and the correspondingly lower demand for doctor visits) stemming from not smoking and having a higher number of meals. Note that the first situation is clearer in the lower tail of the outcome distribution, while in the upper tail the second situation may play a more important role.

The impact of the variables related to socioeconomic characteristics seems to be similar across quantiles. Concerning the household size effect, the results indicate that an individual consumes on average fewer consultations if his/her household is larger. These results are in accordance with the previous parametric models and are similar to the ones found by Winkelmann (2006). A possible economic explanation for this effect is the presence of "economies of experience" within the family: decisions taken by more than one person benefit from more in-depth information, which in turn influences health status and the efficiency in producing healthy time. It is also plausible that scale economies play a role if, when visiting a doctor, patients often also ask about the symptoms of their relatives' diseases in order to prevent further visits.

Regarding the effect of age, Figure 1 shows that consumption is very high in the first years of life and decreases until 30-40 years old, more for men than for women; thereafter it increases for men while remaining fairly constant for women. These results seem intuitive and are consistent with the literature: the initial decreasing path may be related to the fact that children often require more health care (and therefore have periodic doctor appointments); and after some point in the life cycle an increasing recourse to health services is expected, whether we consider age a proxy for health status or an indicator of the depreciation rate (Grossman 1972).
Most applications studying health care demand assume that age has a quadratic relationship with health care utilization (Pohlmeier and Ulrich 1995, Winkelmann 2006 and Lourenço 2007). We first tried this specification, but neither coefficient was significant, and we found that a third-order polynomial fits the data much better. Additionally, we modelled the ageing and gender effects together, in contrast with the literature. Note that in our specification it makes little sense to interpret the dummy female alone. The advantages of assessing the ageing effect by gender are clear from Figure 1: men tend to consume less, while women's behaviour towards health demand is smoother over the life cycle. Comparing the effects on the median with those on the 0.80 $y^*$-quantile, we observe that the shape of the effects is similar but, as a whole, the impact of age is less pronounced in explaining high levels of visits to a doctor. This is very much in line with the results of Winkelmann (2006), who shows that age has an insignificant effect in the upper tail of the distribution of the number of visits.

Figure 1: Effect of age in the 0.5 $y^*$-quantile and 0.8 $y^*$-quantile [two panels, Quantile 0.5 and Quantile 0.8; horizontal axis: age, 0 to 80; series: Male and Female]

The level of income has a positive but negligible effect on health care demand, constant across the different quantiles. Conceptually, it is possible to find at least two channels of income influence. The first derives from Grossman's model (1972), in which income determines the budget constraint and, therefore, the ability to pay for health care. The second channel is related to the fact that different levels of income can explain differences in the opportunity cost of being ill and in the cost of visiting the doctor, especially if we closely relate income with the wage rate.
In Portugal, the first channel may not actually exist, as a consequence of the design of the health care system. This is broadly applicable to both private and public subsystems and to NHS beneficiaries, although to a lesser extent for the latter. Direct costs to beneficiaries are relatively small, as most of the cost of a consultation is borne by the health care system, which is financed predominantly by general taxation or by compulsory contributions from employers and employees.31 In this context, the second channel can be more relevant, and it is consistent with the small estimated effect of income over the whole outcome distribution.

The educational level also has a small positive impact on health care demand that does not change significantly across the estimated quantiles. This appears to indicate that individuals with high educational levels face a higher opportunity cost of being ill and that this more than offsets the opportunity cost of visiting the doctor. Also, there is no evidence supporting the idea that more educated people are able to improve their health more efficiently, generating fewer doctors' consultations. The previous empirical studies of Pohlmeier and Ulrich (1995), Winkelmann (2006) and Lourenço (2007) also found small positive effects for both income and education variables.

Concerning the influence of marital status, the results indicate that single people visit doctors less often. These findings may indicate that they are less risk-averse regarding their health.
As to occupational status, the estimated semi-elasticities are positive for retired individuals and negative for students, meaning that the demand for health care increases over the life cycle, being lower when we study, higher when we work and much higher when we retire.32

5.1 Cumulative health effects of double coverage's age

As mentioned in Section 4.2, some individuals may have enjoyed health insurance double coverage for a long period of time, which may generate health benefits from a hypothetically better treatment accumulated over time. If this occurs, the difference in the number of consultations between the two groups should decrease with age. The idea is that the recent beneficiaries of a health subsystem (more likely the younger generations) did not have time to accumulate such health benefits, whereas the older beneficiaries (more likely the older generations) had time to do so, which makes them relatively healthier when compared with untreated individuals. If this behaviour is not fully controlled for by the health status variables, the previously estimated $\beta_1(\tau)$ and $\beta_2(\tau)$ can be positively biased.

Following Barros et al. (2008), we estimate our specification for different age groups within the quantile regression framework. Tables A3 and A4 of the Appendix present the estimated coefficients and semi-elasticities obtained from a subsample of individuals aged eighteen or older (28,736 observations), and Tables A5 and A6 of the Appendix show the results for a subsample of individuals aged between eighteen and forty-five (12,637 observations).33 When compared with the full sample results34, we observe differences in all groups of variables and, as expected, the coefficients of the third-order polynomial in age and of the third-order polynomial in age crossed with the gender dummy are no longer significant.

Figure 2: Effects of the public and private subsystems in different age groups [two panels, $\beta_1$ (public subsystem) and $\beta_2$ (private subsystem); horizontal axis: percentile (%), 25 to 90; series: full sample, age > 18, and 18 < age < 45]

Figure 2 provides a graphical comparison of the estimated coefficients $\beta_1(\tau)$ and $\beta_2(\tau)$, focusing on the upper tail results. The most important fact is that the effects of both public and private subsystems are higher for the younger generations, and this occurs across the whole distribution. When we restrict the analysis to individuals aged eighteen or older, thus raising the average age, both $\beta_1(\tau)$ and $\beta_2(\tau)$ decrease (slightly more in the upper tail of the distribution), whereas the younger cohort (individuals older than eighteen and younger than forty-five) has the largest estimated treatment effects. The differences are very pronounced, especially for public employees. This is consistent with the findings of Barros et al. (2008).

For different levels of visits to the doctor, beneficiaries of private and public subsystems now behave quite differently. Regarding the public subsystems, the quantile regression results show that the treatment effect for the younger cohort decreases considerably across the distribution, which indicates that moral hazard is relatively lower among young high users. Also note that this was not the case for the full sample and that the coefficients of the different age groups are similar at the 0.90 $y^*$-quantile. For the private subsystem, the estimated impact for the younger group increases until the 0.70 $y^*$-quantile and decreases thereafter. This is a similar path to the one obtained with the full sample. The results seem to confirm the suspicion that the estimated effects for the older groups are lower, possibly reflecting accumulated health benefits from the existence of the subsystems. In this context, the best indicator of moral hazard would be one obtained from the sample of individuals who possibly did not have time to incorporate such benefits. The caveat is the reduction of the sample, in particular of the treated individuals.

31 In this scenario, it makes no sense to simulate by how much the wages of public employees should be increased in order to compensate them for the elimination of their health insurance plan.
32 In the interpretation of the results we should be aware that these particular variables may capture to some extent Grossman's income and age effects.
33 This exercise is also an interesting way of performing a sensitivity analysis of whether the variable age is properly specified in the models $Q_{y^*}(\tau \mid x)$.
34 Note that the sample used in the previous section includes individuals from zero to eighty years old.

6 Conclusions

This paper examines the impact of additional coverage on the demand for visits to the doctor at different levels of the outcome distribution, contributing to the empirical literature on moral hazard in the health sector. Using a recent quantile regression method for count data, we overcome the limitation of traditional parametric count data models by investigating the effect of covariates on the shape of the distribution. We ruled out the selection bias problem by using only individuals who enjoy an exogenous health insurance double coverage and by analysing its impact in different age cohorts.

The results show that the additional coverage is very important in explaining the demand for doctor consultations, especially in the lower tail and the middle of the distribution. That is, double coverage leads to a relatively higher increase in demand (visits to a doctor) for regular (but not heavy) users of the health system.
When the effects of the public and of the private health insurance plans (providing double coverage) are compared, it is clear that the moral hazard derived from private health insurance double coverage is much higher than the one derived from the health insurance plan of public employees. Another important finding is that the relative effect of the two sources of double coverage is almost constant across quantiles, which means that they display a similar path along the distribution. The analysis for the youngest cohort shows that the estimated effects of both public and private health insurance on top of the NHS are higher than the ones for the full sample, possibly reflecting the health benefits accumulated by older beneficiaries.

The estimation of a positive effect of the double coverage derived from the subsystems corroborates the findings from traditional one-part and two-part models (Moreira 2008). Nevertheless, quantile regression provides a more detailed description of the effect of the treatments on the distribution of doctor visits, thus becoming a valuable tool to complement the parametric models.

To explain the differences in the demand for doctor consultations between the different health insurance statuses, we control for several demographic, socioeconomic and health status variables, besides the geographic and seasonal effects. Results indicate that the existence of chronic diseases or pain is extremely relevant in explaining doctor visits, especially for high users. Among the demographic and socioeconomic characteristics, age (also as a proxy for health status) assumes a unique role, especially when combined with gender. In the first years of life the consumption of health care is very high and it decreases until 30-40 years old, more for men than for women; thereafter it increases for men and remains fairly constant for women. Education and income present significant positive effects (constant over the whole distribution), although less important than those of other regressors.
Results from quantile regression are similar to those from previous literature in terms of the significance of key covariates, but the combination of age and gender is novel in the literature.

In short, health insurance double coverage creates additional demand for health care. This additional demand effect is slightly higher for medium-intensity users than for heavy users. Also interesting is the large difference in impact according to the source of health insurance double coverage. The second layer of health insurance coverage adds more to demand when provided by private organizations than when obtained from Government-sponsored entities.

References

Bago d'Uva, T. and A. Jones (2008). Health care utilisation in Europe: New evidence from the ECHP. Journal of Health Economics, doi:10.1016/j.jhealeco.2008.11.002.
Barros, P., M. Machado, and A. Galdeano (2008). Moral hazard and the demand for health services: A matching estimator approach. Journal of Health Economics 27(4), 1006–1025.
Barros, P. P. and J. Simões (2007). Portugal: Health systems review, Volume 9. Health Systems in Transition.
Cameron, A. and P. Trivedi (2005). Microeconometrics: Methods and applications. Cambridge.
Cameron, A., P. Trivedi, F. Milne, and J. Piggott (1988). A microeconometric model of the demand for health care and health insurance in Australia. The Review of Economic Studies 55(1), 85–106.
Coulson, N., J. Terza, C. Neslusan, and B. Stuart (1995). Estimating the moral-hazard effect of supplemental medical insurance in the demand for prescription drugs by the elderly. The American Economic Review 85(2), 122–126.
Deb, P. and P. Trivedi (2002). The structure of demand for health care: Latent class versus two-part models. Journal of Health Economics 21, 601–625.
Grossman, M. (1972). On the concept of health capital and the demand for health. The Journal of Political Economy 80(2), 223–255.
Hausman, J., B. Hall, and Z. Griliches (1984). Econometric models for count data with an application to the patents-R&D relationship. Econometrica 52, 909–938.
Horowitz, J. L. (1992). A smoothed maximum score estimator for the binary response model. Econometrica 60, 505–531.
Koenker, R. and G. Bassett (1978). Regression quantiles. Econometrica 46, 33–50.
Liu, C. (2007). Utilization of general practitioners' services in Canada and the United States: A quantile regression for counts analysis. Mimeo.
Lourenço, O. D. (2007). Unveiling health care consumption groups. PhD dissertation, Faculdade de Economia da Universidade de Coimbra.
Machado, J. and J. Santos-Silva (2005). Quantiles for counts. Journal of the American Statistical Association 100, 1226–1237.
Manning, W., L. Blumberg, and L. Moulton (1995). The demand for alcohol: The differential response to price. Journal of Health Economics 14(2), 123–148.
Manski, C. F. (1975). Maximum score estimation of the stochastic utility model of choice. Journal of Econometrics 3, 205–228.
Manski, C. F. (1985). Semiparametric analysis of discrete response: Asymptotic properties of the maximum score estimator. Journal of Econometrics 27, 313–333.
Miranda, A. (2006). Qcount: Stata program to fit quantile regression models for count data. Statistical Software Components S456714, Boston College Department of Economics.
Miranda, A. (2008). Planned fertility and the family background: A quantile regression for count analysis. Journal of Population Economics 21, 67–81.
Moreira, S. (2008). Double coverage and health care demand: Evidence from quantile regression. MSc dissertation, ISEG-TULisbon.
Pereira, J. (1995). Equity, health and health care: An economic study with reference to Portugal. Department of Economics and Related Studies of the University of York 341.
Pohlmeier, W. and V. Ulrich (1995). An econometric model of the two-part decisionmaking process in the demand for health care. The Journal of Human Resources 30(2), 339–361.
Vera-Hernández, A. (1999). Duplicate coverage and demand for health care: The case of Catalonia. Health Economics 8, 123–148.
Windmeijer, F. and J. Santos-Silva (1997). Endogeneity in count data models: An application to demand for health care. Journal of Applied Econometrics 12(3), 281–294.
Winkelmann, R. (2004). Health care reform and the number of doctor visits: An econometric analysis. Journal of Applied Econometrics 19, 455–472.
Winkelmann, R. (2006). Reforming health care: Evidence from quantile regressions for counts. Journal of Health Economics 25, 131–145.
Wooldridge, J. M. (2002). Econometric analysis of cross section and panel data. MIT Press.
Zweifel, P. and W. G. Manning (2000). Moral hazard and consumer incentives in health care, Volume 1A of Handbook of Health Economics, Section I6, pp. 409–459. Elsevier Science.

A Appendix

Table A1: Quantile regression results: standard errors

                         β̂(0.25)  β̂(0.50)  β̂(0.60)  β̂(0.70)  β̂(0.80)  β̂(0.90)
Health insurance status variables
pubsub                     0.032    0.029    0.028    0.028    0.024    0.025
privsub                    0.070    0.055    0.053    0.051    0.046    0.053
Health status variables
sick                       0.097    0.095    0.081    0.106    0.120    0.097
limitdays                  0.004    0.003    0.003    0.003    0.003    0.004
limited                    0.094    0.081    0.080    0.078    0.077    0.073
rheumatism                 0.032    0.025    0.024    0.023    0.023    0.024
osteoporosis               0.039    0.031    0.030    0.029    0.028    0.032
cancer                     0.065    0.050    0.042    0.047    0.056    0.068
kidneystones               0.045    0.039    0.038    0.038    0.035    0.038
renalfailure               0.075    0.069    0.064    0.065    0.059    0.067
emphysema                  0.060    0.049    0.043    0.043    0.046    0.050
cerebralhaemorrhage        0.076    0.055    0.054    0.061    0.051    0.062
infarction                 0.079    0.063    0.063    0.053    0.050    0.065
depressivedisorder         0.042    0.035    0.032    0.030    0.030    0.034
otherchronicaldisease      0.025    0.021    0.020    0.019    0.017    0.019
highbloodpressure          0.030    0.024    0.022    0.021    0.020    0.022
chronicpain                0.031    0.026    0.025    0.023    0.022    0.024
diabetes                   0.037    0.028    0.027    0.028    0.028    0.030
asthma                     0.047    0.038    0.036    0.035    0.035    0.037
stress                     0.035    0.027    0.024    0.024    0.025    0.025
smoker                     0.035    0.031    0.033    0.034    0.028    0.028
meals                      0.044    0.041    0.040    0.039    0.036    0.036
Socioeconomic and demographic variables
householdsize              0.010    0.009    0.008    0.009    0.008    0.007
age                        0.080    0.067    0.066    0.071    0.063    0.059
age2                       0.022    0.019    0.019    0.020    0.018    0.017
age3                       0.002    0.002    0.002    0.002    0.001    0.001
age*female                 0.112    0.095    0.093    0.092    0.082    0.085
(age*female)2              0.031    0.026    0.026    0.026    0.023    0.023
(age*female)3              0.002    0.002    0.002    0.002    0.002    0.002
female                     0.109    0.092    0.089    0.083    0.074    0.085
educmax                    0.003    0.003    0.003    0.003    0.002    0.002
lincome                    0.021    0.018    0.019    0.018    0.016    0.017
single                     0.041    0.037    0.038    0.040    0.034    0.034
student                    0.042    0.039    0.039    0.042    0.034    0.034
retired                    0.037    0.029    0.027    0.026    0.028    0.027

Note: Geographic and seasonal controls are not reported; they are available from the authors upon request.

Table A2: Quantile regression results: marginal effects

                        ME(0.25) ME(0.50) ME(0.60) ME(0.70) ME(0.80) ME(0.90)
Health insurance status variables
pubsub                     0.012    0.027    0.037    0.051    0.059    0.069
privsub                    0.033    0.075    0.104    0.130    0.158    0.194
Health status variables
sick                       0.147    0.240    0.298    0.412    0.567    1.412
limitdays                  0.011    0.022    0.029    0.038    0.057    0.092
limited                    0.022    0.066    0.104    0.189    0.310    0.540
rheumatism                 0.022    0.044    0.055    0.075    0.124    0.196
osteoporosis               0.049    0.067    0.074    0.082    0.095    0.116
cancer                     0.090    0.172    0.199    0.235    0.386    0.839
kidneystones               0.024    0.048    0.071    0.104    0.192    0.285
renalfailure               0.027    0.072    0.088    0.127    0.231    0.320
emphysema                  0.014    0.068    0.092    0.127    0.203    0.326
cerebralhaemorrhage        0.021    0.042    0.053    0.089    0.163    0.253
infarction                 0.052    0.112    0.152    0.203    0.262    0.293
depressivedisorder         0.031    0.076    0.104    0.144    0.216    0.341
otherchronicaldisease      0.082    0.166    0.223    0.291    0.364    0.512
highbloodpressure          0.076    0.135    0.164    0.190    0.231    0.280
chronicpain                0.028    0.063    0.091    0.129    0.192    0.305
diabetes                   0.086    0.129    0.150    0.186    0.264    0.411
asthma                     0.051    0.112    0.150    0.202    0.246    0.314
stress                     0.084    0.126    0.151    0.178    0.264    0.344
smoker                    -0.028   -0.047   -0.057   -0.071   -0.070   -0.040
meals                      0.031    0.050    0.051    0.060    0.066    0.088
Socioeconomic and demographic variables
householdsize             -0.010   -0.017   -0.022   -0.030   -0.030   -0.021
age when male              0.002    0.003    0.005    0.007    0.006    0.007
age when female            0.000    0.000    0.000   -0.001   -0.002   -0.002
female                     0.074    0.141    0.194    0.292    0.308    0.402
educmax                    0.002    0.004    0.006    0.008    0.007    0.007
lincome                    0.002    0.003    0.004    0.005    0.007    0.006
single                    -0.030   -0.052   -0.068   -0.098   -0.118   -0.133
student                   -0.034   -0.064   -0.088   -0.112   -0.128   -0.192
retired                    0.028    0.047    0.053    0.061    0.099    0.186

Notes: The marginal effects are calculated for a vector $\tilde{x}$ containing the mean value of the continuous (and count) variables and zeros for the dummy variables. The classification of the covariates is presented in Table 2 and the mean values can be obtained from Table 3. Geographic and seasonal controls are not reported; they are available from the authors upon request.

Table A3: Quantile regression results: estimated coefficients when age >= 18

                         β̂(0.25)   β̂(0.50)   β̂(0.60)   β̂(0.70)   β̂(0.80)   β̂(0.90)
Health insurance status variables
pubsub                     0.070†    0.081     0.084     0.077     0.054†    0.032‡
privsub                    0.174†    0.213     0.224     0.198     0.140     0.103‡
Health status variables
sick                       0.746     0.664     0.648     0.625     0.582     0.805
limitdays                  0.065     0.067     0.068     0.067     0.064     0.066
limited                    0.165‡    0.234     0.271     0.334     0.352     0.379
rheumatism                 0.141     0.150     0.149     0.145     0.149     0.148
osteoporosis               0.294     0.222     0.198     0.166     0.128     0.104
cancer                     0.461     0.454     0.424     0.375     0.382     0.494
kidneystones               0.159     0.168     0.187     0.201     0.228     0.214
renalfailure               0.151†    0.193     0.197     0.198     0.238     0.215
emphysema                  0.052‡    0.152     0.169     0.165     0.184     0.213
cerebralhaemorrhage        0.135‡    0.136     0.128†    0.143†    0.191     0.195
infarction                 0.303     0.347     0.347     0.342     0.297     0.209
depressivedisorder         0.201     0.241     0.256     0.253     0.249     0.234
otherchronicaldisease      0.379     0.393     0.407     0.391     0.340     0.320
highbloodpressure          0.417     0.390     0.372     0.321     0.259     0.209
chronicpain                0.175     0.198     0.221     0.223     0.211     0.212
diabetes                   0.449     0.364     0.336     0.307     0.289     0.293
asthma                     0.232     0.268     0.283     0.278     0.241     0.193
stress                     0.454     0.368     0.345     0.310     0.286     0.245
smoker                    -0.209    -0.182    -0.175    -0.157    -0.095    -0.036‡
meals                      0.189     0.160     0.130     0.107     0.092     0.089†
Socioeconomic and demographic variables
householdsize             -0.049    -0.048    -0.049    -0.048    -0.031    -0.012‡
age                       -0.365‡   -0.319‡   -0.359‡   -0.139‡    0.177‡   -0.061‡
age2                       0.094‡    0.084‡    0.096‡    0.062‡   -0.021‡    0.022‡
age3                      -0.006‡   -0.005‡   -0.006‡   -0.004‡    0.001‡   -0.001‡
age*female                -0.078‡    0.032‡    0.226‡    0.133‡   -0.108‡    0.285‡
(age*female)2              0.007‡   -0.022‡   -0.068‡   -0.066‡    0.004‡   -0.070‡
(age*female)3             -0.001‡    0.001‡    0.004‡    0.005‡    0.000‡    0.004‡
female                     0.660‡    0.548‡    0.357‡    0.683‡    0.701‡    0.011‡
educmax                    0.004‡    0.010     0.011     0.010     0.006†    0.002‡
lincome                    0.068     0.057     0.058     0.054     0.046     0.020‡
single                    -0.215    -0.189    -0.192    -0.202    -0.147    -0.100
student                    0.039‡    0.016‡    0.014‡    0.052‡    0.064‡    0.033‡
retired                    0.190     0.166     0.152     0.135     0.137     0.163

Note: The subsample has 28,736 observations. Results were obtained with 1500 jittered samples. Coefficients marked with ‡ and † are not significant at a 5 and 1 per cent level, respectively. Geographic and seasonal controls are not reported; they are available from the authors upon request.
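The notes to these tables state that results were obtained with 1500 jittered samples. As a rough illustration of what jittering involves, here is a minimal sketch, assuming the estimator follows the jittering approach of Machado and Santos-Silva (2005): uniform noise is added to the counts, the jittered variable minus $\tau$ is log-transformed, a quantile regression is run, and the estimates are averaged over the jittered samples. To stay self-contained the sketch uses an intercept only, so the quantile-regression step reduces to taking a sample quantile; this is a simplification, not the paper's specification. The function name, the default number of jitters and the small constant `floor` are hypothetical.

```python
import math
import random


def jittered_count_quantile(counts, tau, n_jitters=200, seed=0, floor=1e-5):
    """Intercept-only sketch of a jittered quantile estimate for count data.

    counts    : list of non-negative integers (e.g. numbers of doctor visits)
    tau       : quantile of interest, with 0 < tau < 1
    n_jitters : number of jittered samples to average over (hypothetical default)
    floor     : small positive constant used when y* - tau is not positive
    """
    rng = random.Random(seed)
    n = len(counts)
    k = math.ceil(tau * n) - 1  # 0-based index of the tau-th order statistic
    gammas = []
    for _ in range(n_jitters):
        # Step 1: jitter the counts with uniform noise on [0, 1).
        y_star = [y + rng.random() for y in counts]
        # Step 2: log-transform y* - tau, censoring non-positive values at `floor`.
        transformed = sorted(math.log(max(v - tau, floor)) for v in y_star)
        # Step 3: with an intercept only, the quantile-regression estimate
        # is the tau-th sample quantile of the transformed data.
        gammas.append(transformed[k])
    # Step 4: average the estimates over the jittered samples.
    gamma_bar = sum(gammas) / n_jitters
    q_star = tau + math.exp(gamma_bar)  # quantile of the jittered variable y*
    q_count = math.ceil(q_star - 1)     # implied quantile of the count y
    return q_star, q_count


# Toy data (not from the survey): half the sample with 0 visits, half with 2.
toy_visits = [0] * 50 + [2] * 50
print(jittered_count_quantile(toy_visits, 0.75))
```

With covariates, step 3 would be a linear quantile regression of the transformed variable on $x$, and the prediction would become $\tau + \exp(x'\hat{\beta}(\tau))$. For the toy data the implied count quantile is 2, which is the 0.75 quantile of the raw counts.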
Table A4: Quantile regression results: estimated semi-elasticities when age >= 18

                        SE(0.25) SE(0.50) SE(0.60) SE(0.70) SE(0.80) SE(0.90)
Health insurance status variables
pubsub                     0.026    0.030    0.032    0.033    0.027    0.018
privsub                    0.069    0.084    0.093    0.089    0.073    0.061
Health status variables
sick                       0.400    0.334    0.336    0.353    0.383    0.696
limitdays                  0.024    0.025    0.026    0.028    0.032    0.038
limited                    0.065    0.093    0.115    0.162    0.204    0.259
rheumatism                 0.055    0.057    0.059    0.064    0.078    0.090
osteoporosis               0.124    0.088    0.081    0.074    0.066    0.062
cancer                     0.212    0.204    0.195    0.185    0.225    0.360
kidneystones               0.062    0.065    0.076    0.091    0.124    0.134
renalfailure               0.059    0.075    0.081    0.089    0.130    0.135
emphysema                  0.019    0.058    0.068    0.073    0.098    0.134
cerebralhaemorrhage        0.052    0.052    0.050    0.063    0.102    0.122
infarction                 0.128    0.147    0.153    0.166    0.167    0.131
depressivedisorder         0.081    0.097    0.108    0.117    0.137    0.149
otherchronicaldisease      0.167    0.170    0.185    0.195    0.196    0.213
highbloodpressure          0.187    0.169    0.166    0.154    0.143    0.131
chronicpain                0.069    0.078    0.091    0.102    0.114    0.133
diabetes                   0.205    0.155    0.147    0.146    0.162    0.192
asthma                     0.094    0.109    0.121    0.130    0.132    0.120
stress                     0.207    0.157    0.152    0.148    0.160    0.157
smoker                    -0.068   -0.059   -0.059   -0.059   -0.044   -0.020
meals                      0.075    0.061    0.051    0.046    0.047    0.052
Socioeconomic and demographic variables
householdsize             -0.018   -0.017   -0.018   -0.019   -0.015   -0.007
age when male              0.005    0.005    0.006    0.006    0.003    0.003
age when female            0.001    0.001    0.000   -0.001   -0.001   -0.003
female                     0.120    0.116    0.131    0.145    0.123    0.139
educmax                    0.002    0.003    0.004    0.004    0.003    0.001
income                     0.004    0.003    0.003    0.004    0.004    0.002
single                    -0.070   -0.061   -0.065   -0.074   -0.066   -0.054
student                    0.015    0.006    0.005    0.022    0.032    0.019
retired                    0.076    0.064    0.061    0.059    0.071    0.100

Notes: The subsample has 28,736 observations. Results were obtained with 1500 jittered samples. Semi-elasticities are calculated for a vector $\tilde{x}$ containing the mean value of the continuous (and count) variables and zeros for the dummy variables.
The classification of the covariates is presented in Table 2 and the mean values are computed for this particular sample. Geographic and seasonal controls are not reported; they are available from the authors upon request.

Table A5: Quantile regression results: estimated coefficients when 18 ≤ age ≤ 45

Estimated coefficient at each quantile:

Variable                0.25      0.50      0.60      0.70      0.80      0.90

Health insurance status variables
pubsub                  0.220     0.209     0.193     0.177     0.114†    0.029‡
privsub                 0.176‡    0.284†    0.345     0.384     0.300     0.255†

Health status variables
sick                    0.254‡    0.515†    0.686†    0.732     0.634     0.651†
limitdays               0.142     0.134     0.132     0.130     0.120     0.105
limited                 -0.044‡   0.092‡    0.117‡    0.218‡    0.347‡    0.492
rheumatism              0.239     0.203†    0.229     0.263     0.267     0.276
osteoporosis            0.497‡    0.524     0.501     0.469†    0.380     0.301‡
cancer                  0.556     0.438†    0.422     0.312†    0.276‡    0.482‡
kidneystones            0.136‡    0.163‡    0.270‡    0.406     0.444     0.458
renalfailure            0.046‡    0.071‡    0.009‡    0.309‡    0.366‡    0.420‡
emphysema               0.040‡    0.105‡    0.218‡    0.389†    0.400     0.344†
cerebralhaemorrhage     0.170‡    0.014‡    0.371‡    0.702‡    0.722‡    0.657†
infarction              0.971‡    0.882‡    0.814‡    0.872‡    0.857‡    0.494‡
depressivedisorder      0.370     0.421     0.425     0.438     0.393     0.365
otherchronicaldisease   0.507     0.552     0.613     0.631     0.521     0.437
highbloodpressure       0.435     0.535     0.560     0.524     0.425     0.294
chronicpain             0.218     0.270     0.341     0.376     0.368     0.360
diabetes                0.465     0.368     0.392     0.415     0.372     0.380
asthma                  0.217†    0.250     0.257     0.311     0.273     0.278
stress                  0.808     0.730     0.724     0.664     0.560     0.427
smoker                  -0.136    -0.140‡   -0.124    -0.105†   -0.072‡   -0.012‡
meals                   0.178†    0.142     0.130‡    0.135‡    0.118‡    0.057‡

Socioeconomic and demographic variables
householdsize           -0.036†   -0.041    -0.046    -0.053    -0.045    -0.024‡
age                     -1.326‡   -1.285‡   -1.263‡   -2.047‡   -2.445‡   -1.387‡
age^2                   0.372‡    0.350‡    0.343‡    0.597‡    0.765‡    0.441‡
age^3                   -0.033‡   -0.030‡   -0.030‡   -0.056‡   -0.077‡   -0.046‡
age*female              1.639‡    2.061‡    1.851‡    2.295‡    2.049‡    1.084‡
(age*female)^2          -0.552‡   -0.671‡   -0.605‡   -0.744‡   -0.690‡   -0.373‡
(age*female)^3          0.054‡    0.065‡    0.059‡    0.072‡    0.070‡    0.039‡
female                  -0.989‡   -1.471‡   -1.242‡   -1.610‡   -1.391‡   -0.648‡
educmax                 0.022     0.023     0.024     0.024     0.017     0.008‡
lincome                 0.120     0.116     0.113     0.100†    0.068     0.034‡
single                  -0.185    -0.177    -0.194    -0.237    -0.234    -0.166
student                 -0.063‡   -0.068‡   -0.065‡   -0.040‡   0.003‡    0.018‡
retired                 0.216‡    0.260‡    0.287‡    0.433‡    0.345‡    0.221‡

Notes: The subsample has 12,637 observations. Results were obtained with 1500 jittered samples. Coefficients marked with ‡ and † are not significant at the 5 and 1 per cent level, respectively. Geographic and seasonal controls are not reported; they are available from the authors upon request.

Table A6: Quantile regression results: estimated semi-elasticities when 18 ≤ age ≤ 45

Variable                SE(0.25)  SE(0.50)  SE(0.60)  SE(0.70)  SE(0.80)  SE(0.90)

Health insurance status variables
pubsub                  0.066     0.062     0.059     0.058     0.047     0.015
privsub                 0.051     0.088     0.113     0.141     0.137     0.148

Health status variables
sick                    0.077     0.180     0.272     0.324     0.347     0.469
limitdays               0.041     0.038     0.039     0.042     0.050     0.056
limited                 -0.011    0.026     0.034     0.073     0.163     0.325
rheumatism              0.072     0.060     0.071     0.090     0.120     0.163
osteoporosis            0.171     0.185     0.179     0.180     0.181     0.179
cancer                  0.198     0.147     0.144     0.110     0.124     0.316
kidneystones            0.039     0.047     0.085     0.151     0.219     0.297
renalfailure            0.012     0.020     0.002     0.109     0.173     0.267
emphysema               0.011     0.030     0.067     0.143     0.193     0.210
cerebralhaemorrhage     0.049     0.004     0.124     0.306     0.415     0.475
infarction              0.436     0.379     0.346     0.418     0.531     0.327
depressivedisorder      0.119     0.140     0.146     0.165     0.189     0.225
otherchronicaldisease   0.176     0.197     0.233     0.265     0.268     0.280
highbloodpressure       0.145     0.189     0.207     0.207     0.208     0.175
chronicpain             0.065     0.083     0.112     0.137     0.174     0.221
diabetes                0.157     0.119     0.132     0.155     0.177     0.236
asthma                  0.065     0.076     0.081     0.110     0.123     0.164
stress                  0.331     0.288     0.293     0.283     0.295     0.272
smoker                  -0.034    -0.035    -0.032    -0.030    -0.027    -0.006
meals                   0.052     0.041     0.038     0.043     0.049     0.030

Socioeconomic and demographic variables
householdsize           -0.010    -0.011    -0.013    -0.016    -0.018    -0.012
age when male           0.001     0.001     0.001     0.002     0.003     0.001
age when female         -0.005    -0.005    -0.005    -0.006    -0.005    -0.003
female                  0.124     0.128     0.138     0.178     0.185     0.175
educmax                 0.006     0.006     0.007     0.007     0.007     0.004
income                  0.005     0.005     0.005     0.005     0.004     0.003
single                  -0.045    -0.043    -0.049    -0.063    -0.082    -0.078
student                 -0.016    -0.018    -0.017    -0.012    0.001     0.009
retired                 0.064     0.079     0.092     0.163     0.162     0.126

Notes: The subsample has 12,637 observations. Results were obtained with 1500 jittered samples. The marginal effects are calculated for a vector x̃ containing the mean value of the continuous (and count) variables and zeros for the dummy variables. The classification of the covariates is presented in Table 2 and the mean values are computed for this particular sample. Geographic and seasonal controls are not reported; they are available from the authors upon request.
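As a reading aid for the notes to Table A6, the sketch below shows one way a dummy-variable coefficient could be turned into a semi-elasticity at the reference vector. It rests on an assumption that these notes do not state: that the conditional quantile of the jittered count takes the form Q_z(α|x) = α + exp(x'β(α)), as is common in jittering-based quantile regression for counts. The function name, its arguments, and the baseline value are hypothetical; in particular, exp(x̃'β) cannot be computed from the tables because the geographic and seasonal controls are not reported.

```python
import math


def dummy_semi_elasticity(beta_j: float, alpha: float, baseline: float) -> float:
    """Proportional change in Q_z(alpha | x) when a dummy goes from 0 to 1.

    ASSUMPTION (not taken from the paper's notes): the conditional quantile
    is Q_z(alpha | x) = alpha + exp(x'beta).

    beta_j   -- coefficient of the dummy at quantile alpha
    alpha    -- quantile level, e.g. 0.25
    baseline -- exp(x~'beta) at the reference vector with the dummy set to 0
                (hypothetical input; not reported in the tables)
    """
    q_off = alpha + baseline                      # dummy = 0
    q_on = alpha + baseline * math.exp(beta_j)    # dummy = 1
    return (q_on - q_off) / q_off


if __name__ == "__main__":
    # 0.220 is the pubsub coefficient at the 0.25 quantile in Table A5.
    # The baseline 0.0915 is an illustrative guess, not a reported value.
    print(round(dummy_semi_elasticity(0.220, 0.25, 0.0915), 3))
```

Under this assumed form the semi-elasticity is always smaller in magnitude than exp(β) − 1, because the additive α in the denominator dampens the proportional change. This is in line with the Table A6 entries being well below the corresponding Table A5 coefficients.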