Voter Turnout in Portugal: A Geographical Perspective

Abstract
The decline of voter turnout in Portugal was confirmed in the legislative election of 2015. The unquestionable democratic value associated with the act of voting calls for discussion of this issue and emphasizes the need for additional investigation. In particular, it is crucial to identify the characteristics of citizens who vote, in order to better understand the phenomenon and think about solutions. This work identifies the most significant sociodemographic variables in explaining voter turnout in continental Portugal and describes the relationship between those variables and voter turnout, including the geographical variation across municipalities. A Semiparametric Geographically Weighted Regression (SGWR) model enabled the investigation of local variations in turnout values, while allowing the relationship between turnout and some variables to vary over space. Results show that turnout is influenced by a set of sociodemographic variables. While some variables affect turnout differently across the country (percentage of family cores with children aged less than 15, and percentage of owner-occupied houses), others affect it uniformly (percentage of graduated residents, percentage of classic families, and distance to Lisbon or Oporto, whichever is nearest). These results support the use of a semiparametric approach to better understand turnout and for further research on voting issues.


Introduction
Voter turnout, the total number of eligible voters who participate in an election, is a fundamental issue in democracy and, although it is not the only form of political participation, its importance is undeniable (Ribeiro, Borba, and da Silva 2015; Franklin 2004; Douglas 2013; Aldrich 1993; Fornos, Power, and Garand 2004). The act of voting is directly related to the concept of democracy and provides the opportunity for political equality between citizens (Kostadinova and Power 2007). Voting strongly influences society, since political power is assigned through it and, in that sense, levels of turnout can provide information about the state of democracy (Franklin 2004). Empirical studies on this issue have revealed a set of variables that are related to voting and that are not necessarily common to all countries (Cancela and Geys 2016; Geys 2006). There is consensus that turnout has declined in most democracies in recent decades (Gray and Caul 2000; Hooghe and Kern 2017). A number of studies have focused on the causal determinants of voting behavior, concerning consolidated democracies but also newly democratic regimes (Geys 2006; Finkel 1985; Ribeiro, Borba, and da Silva 2015; Fornos, Power, and Garand 2004). However, despite its relevance, voting is not yet well understood by political scholars (Aldrich 1993). According to this author, turnout is affected not only by election-specific variables, but also by attitudinal and demographic variables and, therefore, it is difficult to explain who votes in an absolute way. In fact, as Feddersen (2004) notes, in large elections the probability that an individual vote might change the election outcome is very small, which would easily explain why people do not vote when facing even a small obstacle. However, other people make an effort to participate, which allows them to build or reinforce an individual political attitude (Finkel 1985).
More recently, voters' local contexts have been considered in the analysis of electoral participation, showing that political behavior is unevenly distributed across space (Cho and Gimpel 2009; Kavanagh et al. 2006; Taiwo and Ahmed 2015; Mansley and Demšar 2015; Pattie et al. 2015). For instance, Kavanagh et al. (2006) analyzed the influence of variables such as social class, education level and gender on election turnout using ordinary least squares (OLS), but the proposed models proved to be ineffective, pointing out the limitations of this technique when the relationships between variables vary across space. Going further, they applied a technique that considers this variation, geographically weighted regression (GWR) (Fotheringham, Brunsdon, and Charlton 2002), which improved the results. Also, in Nigeria, GWR made it possible to identify the best predictors of voter apathy in presidential elections between 1999 and 2011 (Taiwo and Ahmed 2015), showing that it can be a useful tool to understand voting behavior. This idea is also defended by Mansley and Demšar (2015) in their analysis of turnout in a London local election. Fotheringham, Brunsdon, and Charlton (2002) discuss GWR's utility, suggesting that sometimes in the social sciences the measurement of a relationship depends on where the measurement is taken, so a global model can misrepresent reality. GWR attempts to capture this local variation, emphasizing differences across space. Given the expected variation of turnout between countries or within a single country, GWR seems suitable for the study of the phenomenon. One extension of GWR is semiparametric geographically weighted regression (SGWR), which includes both varying and fixed coefficients (Fotheringham, Brunsdon, and Charlton 2002; Nakaya et al. 2009). Combining spatial stationarity and non-stationarity in the same model sometimes leads to a better model fit, since the existence of some fixed coefficients can reduce the complexity of local relationships (Nakaya 2015).
Low voter turnout is a current problem in Portugal. In the first legislative election, in 1976, the percentage of voters was 83.3%, while it was 55.9% in the election of October 2015 (PORDATA 2019). Kostelka (2017) explains voter decline in new democracies, saying that it occurs in countries where the democratization process was driven by the opposition. The founding elections are highly mobilizing and, after that, turnout tends to decrease progressively until it reaches the standard level. Declining turnout is usually considered unwelcome for democracy, because it is associated with citizens' poor perception of the state of the nation (Franklin 2004). It can indicate state stability and trust in those who stand for election, but it can also indicate a lack of agreement between citizens' preferences and the available options. In the latter case, citizens are questioning the system itself and the political parties.
In Portugal, there is a general lack of interest concerning political issues (Canas 2004). Attitudes related to political interest, and to identification with and trust in political parties, have been considered when trying to explain electoral participation in the country. According to these authors, when variables that measure political engagement are present, they are the ones that contribute most to explaining electoral participation. However, studies about Portuguese electoral behavior are still dispersed and disorganized, despite Portugal being a consolidated democracy, with an experience comparable to that of countries such as Italy, Germany or France (Jalali 2003). This makes comparison with other countries difficult, and it is a barrier to understanding the factors related to absenteeism. It seems quite evident that more research on the topic is needed within the Portuguese context. Understanding what influences this phenomenon makes it possible to discuss it and to think about solutions based on evidence.
According to the Constitution of the Portuguese Republic, every citizen older than eighteen years is eligible to vote, except in the cases identified by general law (Law no. 14/79 of 16 May). Voter registration has been automatic since 2008 and, although voting is a civic duty, it is not mandatory. Given the low levels of voter turnout in Portugal, and considering that voting is the foundational concept of a democratic structure (Douglas 2013), it is important to understand the reasons behind absenteeism. Hence, this work seeks to identify the most significant variables in explaining voter turnout in continental Portugal, among a group of objective variables. Specifically, we aim to: (i) study the relationship between those variables and voter turnout through the application of regression methods; and (ii) describe how the relationship between sociodemographic variables and voter turnout varies across the country. This study will provide insights about political participation and help to rethink the institutional factors that can contribute to changing the current scenario, such as compulsory voting and/or conditions facilitating voting that may vary between municipalities.

Variables influencing voter turnout
Investigating persistence in voter turnout, Denny and Doyle (2009) tried to understand the extent to which it is driven by habit, concluding that an individual who voted in the previous election is more likely to vote in the current election. Plutzer (2002) also focuses on habit formation, presenting a theory for the evolution of voters' political behavior. His analysis includes parental influence on initial turnout and growth, as well as the impact of life events on turnout growth.
Studies developed in several countries have been relating voter turnout to socioeconomic, political and institutional variables. Focusing on the effects of these sets of variables, Geys (2006) and Cancela and Geys (2016) review and assess empirical work where the dependent variable is voter turnout (or absenteeism). Blais (2006) also reviews studies in the field in order to verify which statements about the causes of cross-national variations in turnout are supported by empirical evidence. Despite the dominant view that cross-national variations in turnout can be mostly explained by institutional factors (such as compulsory voting and rules designed to facilitate voting), Blais (2006) states that the understanding of the impact of institutions on turnout is unsteady and is conditioned by the presence of other factors, particularly those that differ from one election to another. In accordance, Hooghe and Kern (2017) state that institutional variables are not enough to explain the curve of the decline on turnout levels. Blais (2006) emphasizes the need to explore socioeconomic variables to achieve a better understanding of voter turnout. To understand if the variables explored among the industrialized democracies are relevant to countries with emergent democracies, Fornos, Power, and Garand (2004) and Ribeiro, Borba, and da Silva (2015) analyze voter turnout in Latin America. Referring to the period between 1980 and 2000, the first authors find that political and institutional variables are more relevant than socioeconomic variables. More recently, using data from 2009, the latter conclude that both types of variables affect voter turnout in Latin America.
From the set of variables presented in the literature, education, age, gender, marital status, home ownership and economic measures have been widely studied. Focusing on the impact of education, Gallego (2010) compares the level of unequal participation in advanced industrialized democracies, verifying that inequalities are present in countries where education and voting are strongly related. Plutzer (2002) emphasizes educational attainment when considering changeable characteristics of individuals' lives, and concludes that education has a significant effect on turnout and that its contribution extends and accumulates over an individual's lifetime. As Huckfeldt and Sprague (1992) demonstrate, among those lying outside the boundaries of conventional party involvement, individuals with higher educational attainment are more likely to vote. Still considering the universe of independent voters, those who are informed are more likely to decide an election, since abstention by the uninformed occurs at a level that balances out the votes of partisans (Feddersen and Pesendorfer 1996). Schlozman, Burns, and Verba (1994) analyze gender differences concerning several kinds of civic activity, concluding that men are in general more active in politics than women. Going further, they find gender inequalities in the factors that facilitate participation (i.e., time, money and civic skills), which reflect men's and women's roles and experiences in family, workplace and society.
The traditional gender gap, characterized by women being more conservative than men and less likely to participate in politics, is also examined by Inglehart and Norris (2000). Comparing sixty countries around the world, the authors conclude that women's electoral behavior has been changing and explore the role of structural and attitudinal factors behind this change.
The relationship between age and turnout has been emphasized through a life cycle explanation, which relates low youth turnout with the presence of personal worries specific to this life period. Hence, in this scenario, participation increases with growth and the adoption of typical adult roles (Highton and Wolfinger 2001). An opposite explanation concerning age, commonly named generational effect, considers that low turnout is a permanent characteristic of a whole generation that does not change with experience and integration (Wass 2007;Blais et al. 2004;Lyons and Alexander 2000).
Marriage and home ownership influence individuals' participation in elections (Plutzer 2002). Some studies identify positive influences of both factors (Squire, Wolfinger, and Glass 1987; Denny and Doyle 2009), since homeowners are more likely to feel connected to their community and married people can share with their spouse the time and energy spent learning about bureaucratic issues related to voting. Other studies report negative contributions of marriage (Stoker and Jennings 1995; Highton and Wolfinger 2001) and home ownership (Highton and Wolfinger 2001) to voter turnout.
Studies that focus on economic variables, with respect to electoral participation, use objective measures, such as unemployment rates, but also employ subjective perceptions of the economic situation, both inside and outside the country. Voters' response is influenced either by their personal economic expectations (Sanders 2005) or by the opinions of those who live around them (Pattie et al. 2015). Whatever the opinion about economic performance is, economic indicators explain much of the variability in government support, which can be related to its changeable nature (Lewis-Beck and Stegmaier 2000).

Studies concerning the Portuguese context
Voter turnout varies across countries (Gallego 2010; Blais 2006; Jackman and Miller 1995) and over time (Blais 2006). Empirical studies on this issue have revealed a set of variables that are related to voting and that are not necessarily common to all countries (Geys 2006; Cancela and Geys 2016). There is consensus that turnout has declined in most democracies in recent decades (Gray and Caul 2000; Hooghe and Kern 2017), Portugal being no exception. The results of the legislative and presidential elections in 1999 and 2001, respectively, brought the issue to light, increasing critical thinking and debate.
Concerning the Portuguese context, socioeconomic variables have also been identified, namely age (Magalhães 2008), education (Magalhães 2008) and economic measures (Freire and Santana-Pereira 2012; Freire and Lobo 2005; Gunther and Montero 2001). Voters' contexts can help explain turnout (Gallego 2010), since the relation between socioeconomic status and absenteeism varies between countries (Nevitte et al. 2009). To some extent, the same idea can be applied to local contexts within a single country.
The high levels of abstention in Portugal in recent decades have raised concerns related to citizens' participation, reinforcing the need for studies addressing the reasons behind this phenomenon (Jalali 2003). Many of the studies concerning the Portuguese context focus on party identification of citizens and the existence of social and religious divides (Jalali 2003). However, it is also possible to identify some works that include individual characteristics of citizens. In particular, the economic perspective of voting has been addressed by several authors (Duch and Stevenson 2006; Martins 2010; Nunes 2005; Freire and Santana-Pereira 2012; Freire and Lobo 2005; Gunther and Montero 2001), which is in accordance with the growing impact of economic factors found in other democracies (Freire, Lobo, and Magalhães 2004).

Methodology
A quantitative design is adopted, since the understanding of the phenomenon is based on empirical evidence. Moreover, sociodemographic indicators are chosen based on the literature, and the relationship between those indicators and turnout is analyzed in detail, which is one of the characteristics of quantitative studies, as is the possibility of generalization (Creswell 2003). Data and analysis are detailed in the following sections, while Figure 1 provides an overview of the methodological framework.

Data
This work uses data from the 2011 population census (INE, 2015). Since the last census is temporally close to the 2011 elections, the dependent variable is based on voter turnout in the same year (CNE, 2015). The dependent variable is the percentage of votes, computed as the ratio between the number of votes and the number of registered citizens (Figure 2). The asymmetries that exist in Portuguese territory between urban and rural areas support the decision to work with percentages. The choice of explanatory variables was based on the literature review, aligned with the available census data. The intention was to cover as much as possible the topics identified in the literature that are, to some extent, available in the population census: population concentration and urbanization, age, gender, marriage and children, educational attainment, population stability and homeownership, and unemployment.
New variables were computed from the ones available in the census data set, leading to a set of 25 independent variables measured at municipality level to be investigated in regression models. For population concentration and urbanization, the ratios were obtained using geographic areas. For instance, the percentage of classic families is the ratio between the number of families and the respective area. In the case of individual characteristics, such as age, the ratios were obtained using the total number of individuals living in that region. A detailed description of each of the variables is presented in Table 1.
In all analyses, the municipality is the territorial unit. Matsusaka (1995) suggests using aggregate-level data, instead of individual data, for the empirical analysis of voter turnout, since individual idiosyncrasies can cancel each other out. Kavanagh et al. (2006) also highlighted the potential of this type of data for spatial analysis of variations in the dependent variable. However, they point out the small number of variables available from the census as an inherent limitation of using aggregate data, even though new ones can be derived from the available data, as has been done in this work. Moreover, the quality of results when studying turnout has been widely discussed, since the electoral register can be outdated, for example containing citizens who have already died or have moved.

Ordinary least squares
Data analysis starts with the standard modeling approach, i.e., OLS modeling, whose parameter estimates show the «national picture» of the relationship between turnout and each of the independent variables. OLS is a widely used method of regression analysis, but it relies on a set of assumptions that spatial data often violate (Fotheringham, Brunsdon, and Charlton 2002). Using several combinations of the 25 independent variables, OLS was run repeatedly and diagnostic tools were applied, including several statistical tests at the 5% significance level (unless otherwise stated).
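As a rough illustration of the OLS step, the sketch below fits a linear model and reports the two goodness-of-fit measures used in this work, the Adjusted R² and the AICc. It is a minimal sketch under stated assumptions, not the procedure run in ArcGIS: the function name `ols_fit`, the simulated data and the Gaussian-likelihood form of the AICc are assumptions introduced here, and software packages may differ by an additive constant.

```python
import numpy as np

def ols_fit(X, y):
    """Fit OLS with an intercept; return coefficients, adjusted R^2 and AICc."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    rss = float(resid @ resid)                     # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())       # total sum of squares
    p = A.shape[1]                                 # coefficients incl. intercept
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    k = p + 1                                      # assumption: count error variance too
    aic = n * np.log(2 * np.pi * rss / n) + n + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)     # small-sample correction
    return beta, adj_r2, aicc

# Hypothetical data: 200 units, two candidate predictors, only the first matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 50 + 3 * X[:, 0] + rng.normal(size=200)

for cols in ([0], [1], [0, 1]):
    _, adj_r2, aicc = ols_fit(X[:, cols], y)
    print(cols, round(adj_r2, 3), round(aicc, 1))
```

Comparing candidate subsets in this way mirrors the selection rule described in the text: among models that pass the diagnostics, prefer the one with the smallest AICc and the highest Adjusted R².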
OLS provides the key explanatory variables; however, its results might not be representative of the situation in any particular region and may hide local differences that are important to explain voter turnout (Fotheringham, Brunsdon, and Charlton 2002). Considering the complexity of the relationship between voter turnout and the explanatory variables, it is expected that this relationship can vary geographically. In addition, in Portuguese legislative elections, although deputies represent people at the national level, they are chosen through electoral constituencies. Hence, to a certain extent, there is a regional component in people's choice, so analyzing voter turnout across Portuguese regions is likely to provide valuable insights. For each combination of variables, the model diagnostics show that the data exhibit spatial nonstationarity, which confirms the idea that the use of a single coefficient describing the relationship between each explanatory variable and turnout is an oversimplification of the variability in local relationships (Cho and Gimpel 2009). Based on the comparison of the Adjusted R² and the AICc, the most robust of the OLS models was chosen, and its explanatory variables were used in the subsequent analysis.

Table 1 (excerpt). Variable codes and descriptions.
P_Nucle_F1    Percentage of family cores with children aged less than 15
P_nucle_F2    Percentage of family cores with children aged more than 15
P_homens      Percentage of males
P_Mulheres    Percentage of females
P_individi_3  Percentage of residents aged between 15 and 19
P_individi_4  Percentage of residents aged between 20 and 24
P_individi_5  Percentage of residents aged between 25 and 64
P_individi_6  Percentage of residents aged more than 64
P_indivi_re   Percentage of illiterate residents
P_ind_res     Percentage of residents with completed 1st stage of basic education
P_ind_res1    Percentage of residents with completed 2nd stage of basic education
P_ind_res2    Percentage of residents with completed 3rd stage of basic education
P_ind_res3    Percentage of residents with completed secondary education
P_ind_res4    Percentage of residents with post-secondary education
P_ind_res5    Percentage of graduated residents
P_ind_res6    Percentage of employed residents
P_Famili1     Percentage of classic families without unemployed
P_ind_res7    Percentage of residents who study in the same municipality
P_ind_res8    Percentage of residents who work in the same municipality
P_res_hab     Percentage of owner-occupied houses

Geographically weighted regression
Using the aforementioned data exhibiting spatial nonstationarity, a geographically weighted regression (GWR) was applied, since this method might be especially helpful with such data (Fotheringham, Brunsdon, and Charlton 2002). Considering the spatial variations that occur in the relationship between turnout and the explanatory variables, GWR provides an opportunity to better explain this phenomenon, since this technique takes into account spatially varying relationships, differently from the traditional global regression models (Fotheringham, Brunsdon, and Charlton 2002). The parameter estimates obtained through OLS are global, and variations that might exist between the explanatory variables and the dependent variable are expressed in the error term. In GWR, the parameter estimates are local since they are calculated for each municipality in the dataset. GWR uses the coordinates of the centroid of each municipality as a target point to estimate a form of spatially weighted least squares regression. Here, the influence of neighboring observations on the parameter estimates for each point is determined by a weighting function based on the distance of each observation to that point. Consequently, observations closer to a point have a greater influence on the parameter estimates for that point than observations that are further away from it (Fotheringham, Brunsdon, and Charlton 2002).
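For reference, the local model described above can be written in the standard GWR notation of Fotheringham, Brunsdon, and Charlton (2002). The symbols are introduced here for illustration: $(u_i, v_i)$ are the centroid coordinates of municipality $i$, $x_{ik}$ is the value of explanatory variable $k$ at $i$, and $W(u_i, v_i)$ is the diagonal matrix of distance-based weights for target point $i$.

```latex
\begin{align}
  y_i &= \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i, \\
  \hat{\boldsymbol{\beta}}(u_i, v_i)
      &= \bigl( X^{\top} W(u_i, v_i)\, X \bigr)^{-1} X^{\top} W(u_i, v_i)\, \mathbf{y}.
\end{align}
```

When every diagonal entry of $W(u_i, v_i)$ equals one, the second expression reduces to the global OLS estimator, which is why OLS can be seen as the special case with no spatial weighting.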
In this study, the spatial weighting function is an adaptive bi-square kernel, which adjusts over space according to the number of points found; that is, the bandwidth of the kernel varies with the number of neighbors (Nakaya et al. 2016). The optimal bandwidth is automatically defined by the software, ArcGIS for Desktop (ArcMap 10.5), with the minimum AICc value as the selection criterion. Being flexible, the bandwidth of the spatial kernel differs between regression points located in urban and rural parts of the country, which is a relevant issue regarding voter turnout (Kavanagh et al. 2006). Since the data are densely clustered inside urban areas, the bandwidth is smaller in these areas, while it is larger in data-scarce areas.
The use of the same measures of goodness of fit (Adjusted R² and AICc) allows the final GWR model to be compared with the model obtained through OLS. For Fotheringham, Brunsdon, and Charlton (2002), the AICc approach provides a useful tool for choosing models in the GWR context, enabling the comparison between a GWR model and an OLS model, and also the comparison between a number of competing GWR models. In both cases, the best model is the one with the smallest AICc value.
Global Moran's I statistic makes it possible to assess the existence of spatial autocorrelation in the residuals of the GWR model. Evidence of spatial autocorrelation indicates that the GWR model might not be appropriate because important explanatory variables may be missing. According to Rosenshein and Scott (2012), spatial explanatory variables, such as those related to distance, are key to finding a properly specified model. It is common to find different spatial patterns of the residuals in the more industrialized areas and in rural areas, attesting that the model is under- or overpredicting turnout. Therefore, three new variables were derived, related to the straight-line distance between each municipality and other places: distance to Lisbon or Oporto (whichever is nearest), distance to the coast, and distance to the district main city. It is important to note that Lisbon and Oporto are the two largest cities of Portugal, and that coastal areas are more densely populated than inland areas.
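The adaptive bi-square weighting and the resulting local estimate can be sketched as follows. This is a minimal sketch under assumptions, not the ArcGIS implementation: the function names, the neighbor count `k` and the simulated coordinates are hypothetical, and the kernel uses the commonly cited form w = (1 - (d/b)²)² for d < b and 0 otherwise, where b is the distance from the target point to its k-th nearest neighbor.

```python
import numpy as np

def adaptive_bisquare_weights(coords, i, k):
    """Bi-square weights for target point i; bandwidth = distance to its k-th nearest neighbor."""
    d = np.linalg.norm(coords - coords[i], axis=1)  # distances to the target point
    b = np.sort(d)[k]                               # index 0 is the point itself (d = 0)
    return np.where(d < b, (1.0 - (d / b) ** 2) ** 2, 0.0)

def gwr_local_coefficients(coords, X, y, i, k):
    """Weighted least squares at target point i: (A'WA)^-1 A'Wy."""
    w = adaptive_bisquare_weights(coords, i, k)
    A = np.column_stack([np.ones(len(y)), X])       # intercept + predictors
    AtW = A.T * w                                   # same as A.T @ diag(w)
    return np.linalg.solve(AtW @ A, AtW @ y)

# Hypothetical example: 30 centroids, one predictor, slope that drifts eastward.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(30, 2))
x = rng.normal(size=30)
y = 40 + (1 + coords[:, 0] / 50) * x + rng.normal(scale=0.1, size=30)

west = int(np.argmin(coords[:, 0]))
east = int(np.argmax(coords[:, 0]))
print(gwr_local_coefficients(coords, x, y, west, k=12))
print(gwr_local_coefficients(coords, x, y, east, k=12))
```

Because the bandwidth is defined by a neighbor count rather than a fixed distance, it shrinks where centroids are dense and grows where they are sparse, which is the behavior described for urban and rural areas.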
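To make the residual check concrete, the following sketch computes Global Moran's I from its textbook definition. It is an illustration under assumptions: the function name, the binary contiguity matrix `W` and the toy values are hypothetical, and the significance test (the p-value reported in the results) is not reproduced here.

```python
import numpy as np

def morans_i(values, W):
    """Global Moran's I for n observations and an n x n spatial weight matrix W."""
    x = np.asarray(values, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()                                # deviations from the mean
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

# Four locations in a row; adjacent locations are neighbors (toy layout).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

clustered = morans_i([1, 2, 3, 4], W)      # similar values are adjacent: I = 1/3
alternating = morans_i([1, -1, 1, -1], W)  # neighbors always differ: I = -1
print(clustered, alternating)
```

Values near zero are what one would hope to see in the residuals of a well-specified model; clearly positive values point to clusters of over- or underprediction.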

Semiparametric geographically weighted regression
One important extension of GWR is the use of semiparametric models (Fotheringham, Brunsdon, and Charlton 2002), which combine terms with varying coefficients (influenced by location) and terms with global coefficients (not affected by location) (Nakaya et al. 2009; Nakaya et al. 2016). Therefore, the last modeling stage investigated the use of semiparametric geographically weighted regression (SGWR) models. SGWR is available in GWR 4.0, the software used from this point on, together with ArcGIS.
Considering that some sociodemographic indicators can be spatially stationary, it is suitable to use semiparametric models. Nakaya (2015) states that this technique reduces the complexity of local relationships, improving the interpretation of the geographical variations, and might improve the predictive performance of the model. SGWR can be associated with model diagnostics to evaluate the geographical variability of the variables (Nakaya et al. 2009). In fact, besides the usual model comparison criteria, such as AICc, for different combinations of local and global variables, GWR 4.0 contains two techniques for automated variable selection: Local-to-Global (LtoG) and the reverse procedure, Global-to-Local (GtoL). The LtoG variable selection routine compares the original GWR model, where all variables vary in space, with models where only one variable is constant. If GWR is the best model, the process stops. Otherwise it continues, with the «original» model now being the one with one global variable and all the others local, and so on, until no improvement in the model can be obtained by fixing variables' parameters (Nakaya et al. 2009; Nakaya et al. 2016).
SGWR started with the variables that were used for GWR, including the distance variables mentioned before. Standardization of the independent variables and the LtoG variable selection routine were implemented. The options taken for GWR were repeated for SGWR, namely the use of an adaptive bi-square kernel and the search for the optimal bandwidth based on the AICc. Again, the comparison between the GWR and SGWR models was based on the AICc.
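In the usual notation for mixed (semiparametric) GWR, the combination of local and global terms can be written as below. The symbols are introduced here for illustration: $x_{ik}$ are the variables whose coefficients $\beta_k(u_i, v_i)$ vary with the location $(u_i, v_i)$ of municipality $i$, and $z_{il}$ are the variables whose coefficients $\gamma_l$ are the same everywhere.

```latex
\begin{equation}
  y_i = \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \sum_{l} \gamma_l\, z_{il} + \varepsilon_i
\end{equation}
```

A model with no $z$ terms is an ordinary GWR model, and a model with no spatially varying terms is a global regression, so SGWR sits between the two.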
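The LtoG routine described above is essentially a greedy search, which can be sketched as follows. This is a sketch of the procedure as described in the text, not the GWR 4.0 implementation: `aicc_of` is a hypothetical callback standing in for fitting an SGWR model with the given local and global variables and returning its AICc, and the toy scoring function exists only to make the example runnable.

```python
def local_to_global(variables, aicc_of):
    """Greedy Local-to-Global selection; returns (local, global, best AICc)."""
    local, fixed = list(variables), []
    best = aicc_of(local, fixed)                    # start from the all-local GWR model
    while local:
        # Try fixing each remaining local variable, one at a time.
        trials = [(aicc_of([v for v in local if v != c], fixed + [c]), c)
                  for c in local]
        score, candidate = min(trials)
        if score >= best:                           # no improvement: stop
            break
        best = score
        local.remove(candidate)
        fixed.append(candidate)
    return local, fixed, best

# Toy stand-in for model fitting: fixing "a" or "b" helps, fixing "c" hurts.
def toy_aicc(local, fixed):
    return 100 - 5 * sum(v in ("a", "b") for v in fixed) + 3 * ("c" in fixed)

print(local_to_global(["a", "b", "c"], toy_aicc))   # (['c'], ['a', 'b'], 90)
```

The reverse GtoL procedure would follow the same pattern, starting from an all-global model and freeing one coefficient at a time.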

OLS and variable selection
Combining the 25 independent variables derived from the census and computed for this study, several OLS regressions were run to find the subset of variables with the highest explanatory power and to avoid multicollinearity issues. The following variables were selected for further analysis:
• Percentage of family cores with children aged less than 15 (P_Nucle_F1);
• Percentage of graduated residents (P_Ind_Res5);
• Percentage of classic families (P_Famili);
• Percentage of owner-occupied houses (P_Res_Hab).
Table 2 presents the results of the OLS model chosen, i.e., the one with the smallest AICc and the highest Adjusted R² among the candidate models that passed the diagnostic analysis. Results show that these predictors do not exhibit multicollinearity (VIF < 7.5) and that they are all statistically significant, according to the robust t-test (p-value < 0.05). All the variables are positively associated with election turnout. The OLS model diagnostic values (Table 3) make evident the limited explanatory power of this model, with the four explanatory variables accounting for 16% of the variability of turnout. Nevertheless, the model has overall significance (Joint Wald statistic's p-value < 0.05). Diagnostic results also show that the residuals are normally distributed (Jarque-Bera test's p-value > 0.05). However, there is evidence that the residuals are heteroscedastic (Koenker test's p-value < 0.05), which indicates the presence of nonstationarity, and that they exhibit spatial autocorrelation (Global Moran's I test's p-value < 0.05). These results thus justify the need for a spatial regression model. Accordingly, the residuals map (Figure 3) shows clusters of negative errors in the north, especially in the eastern part, and in the south interior (eastern Alentejo). Clusters of positive errors are found in the center, on the south coast and in the Oporto metropolitan area.
The OLS model is unreliable in the presence of nonstationarity and spatial autocorrelation of the residuals. Fotheringham, Brunsdon, and Charlton (2002) argue that GWR models are a suitable alternative to deal with these problems, when compared with other spatial regression models.

GWR
The four GWR models that were estimated included the predictors of the OLS model, and three of them included distance-based variables (Table 4): • Distance to the district main city (Dsededist); • Distance to the cost (Distcost); • Distance to Lisbon or Oporto -the nearest (DLxPorto).
The models obtained using GWR have a higher explanatory power than the OLS model (Table 4), since all of them have higher values of Adjusted R 2 . There is also a reduction in the AICc value for all GWR models. In models GWR 1 and GWR 2, it is notable the improvement in the measures of goodness of fit. Contrary to the expected, the inclusion of a distance-related variable in the model does not result in an improvement in models' performance.
The values of local R 2 vary between 0.03 and 0.73 in all territory, being stronger in the northwest and weaker in the center. The condition number values are smaller than 30 for all the municipalities, thus there is no evidence of local multicollinearity among the explanatory variables. However, the global Moran's statistic of GWR models shows the presence of spatial autocorrelation. The residuals map of Model GWR 1 (Figure 4) confirms that residuals do not exhibit a random distribution pattern. Although the larger residuals are more dispersed over the study area, clusters of negative and positive errors can be found. Clusters of positive errors are still observed in the center, as well in municipalities near to Lisbon and Oporto. Also, clusters of negative values are still observed in the north and in the south interior. The summary statistics of varying (local) coefficients of the GWR1 model are available in Table 1 of Annex 1.
Because they can combine variables whose coefficients vary geographically with variables whose coefficients are constant over space, semiparametric models are more flexible and can reduce the complexity of local relationships (Nakaya 2015). This extension of GWR was used to further investigate the causal factors of voter turnout.
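The general form of such a semiparametric (mixed) model can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the original text: $(u_i, v_i)$ are the coordinates of municipality $i$, $x_{ik}$ are the predictors with locally varying coefficients, and $z_{il}$ are the predictors with global (fixed) coefficients.

```latex
y_i = \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \sum_{l} \gamma_l\, z_{il} + \varepsilon_i
```

Under this notation, a purely local GWR model corresponds to the case with no $z$ terms, and a global OLS model to the case in which every coefficient is a constant $\gamma_l$.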

SGWR
The explanatory power is higher for the models obtained through SGWR (Table 5) than for the previous models (Table 4), as the SGWR models exhibit higher values of adjusted R² and lower values of AICc. Here, the inclusion of a distance-related variable contributes very positively to the models' performance. Model SGWR 2 attains the lowest AICc, closely followed by model SGWR 4. There is evidence of significant spatial autocorrelation in the residuals from all the models (Global Moran's I test's p-value < 0.05), except for model SGWR 4. Hence, this model is considered the best one in this study. The summary statistics of the varying (local) coefficients of the SGWR 4 model are available in Table 2 of Annex 1. As expected, the residuals from the SGWR models are less autocorrelated (Global Moran's I statistic equal to 0.05 for the chosen model) than those from the OLS model (Global Moran's I statistic equal to 0.44) and from the GWR model (Global Moran's I statistic equal to 0.12). The reduction in the degree of spatial autocorrelation achieved through SGWR can be seen by comparing the maps of the standardized residuals from the three models (Figures 3, 4 and 5): the SGWR residuals map exhibits a more random pattern than the others.
The local R² values (Figure 6) highlight the geographical variability of some variables and suggest that the model is misspecified in the regions where those values are lower. The lowest values of local R² are observed in the north interior (Trás-os-Montes e Alto Douro), in the south center (Ribatejo, Alentejo Central and Alto Alentejo) and in some municipalities of the south coast (Alentejo Litoral). The maps of SGWR coefficients and standard errors (Figures 7-10) should be interpreted with caution. The maps of the SGWR standard errors provide important information for the conclusions drawn on the local relationships between the dependent variable and each predictor, because they indicate the areas where the local coefficient estimates are more or less reliable. Percentage of graduated residents (P_Ind_Res5), percentage of classic families (P_Famili) and distance to Lisbon or Oporto, whichever is nearest (DLxPorto), are global variables, i.e., the corresponding parameters do not vary in space. Table 6 shows that two of these variables, percentage of graduated residents (P_Ind_Res5) and percentage of classic families (P_Famili), are positively associated with turnout, while distance to Lisbon or Oporto (DLxPorto) is negatively associated with turnout.
Conversely, the coefficients of percentage of family cores with children aged less than 15 (P_Nucle_F1) and percentage of owner-occupied houses (P_Res_Hab) vary geographically. Figures 7 to 10 show the distribution of the coefficients and the distribution of the standard errors. Figure 7 shows that the percentage of family cores with children aged less than 15 (P_Nucle_F1) has a positive impact in the north, with the exception of some municipalities close to Oporto (Vila do Conde, Trofa, Maia, Matosinhos, Santo Tirso) and of most of the municipalities of Bragança district. Its effect is also positive in some municipalities of the Lisbon metropolitan area (those located on the north side of the Tejo river, with some exceptions: Lisbon city, Amadora, Oeiras, Odivelas) and in some municipalities of the West Region, belonging to Leiria district but close to Lisbon. In contrast, the impact of having children aged less than 15 is negative in the south, and stronger in its interior. This negative impact is also observed in the interior north, with the exception of the municipalities located in the north of Guarda district.
Regarding the distribution of the standard errors (Figure 8), the parameter estimates are more reliable in the southernmost regions (Algarve and Baixo Alentejo) and in the northern regions (Viana do Castelo, Vila Real and Bragança districts), with the exception of some interior municipalities. The parameter estimates are less reliable in a central area (Alto Alentejo and Ribatejo). Nonetheless, the standard errors are close to zero in all municipalities, so all coefficient estimates of this variable may be considered reliable.
According to Figure 9, the percentage of owner-occupied houses (P_Res_Hab) is positively associated with turnout in almost all of the north region. The effect is most pronounced in a central area consisting essentially of municipalities of Braga, Porto, Viseu and Guarda districts. In the rest of the country, this positive influence is barely noticeable, existing only in a coastal area that includes Lisbon district and other districts geographically close to Lisbon (Leiria and Santarém).
The effect of the owner-occupied houses variable is close to zero in almost all of the south region. There is a central area, consisting essentially of municipalities of Castelo Branco, Coimbra, Santarém, Évora and Portalegre districts, where this variable has a weak negative effect. For this variable, Figure 10 shows that the parameter estimates are more reliable in the southernmost regions (Algarve and Baixo Alentejo), although the standard errors are close to zero in all municipalities, as was the case for the percentage of family cores with children aged less than 15. The northeast of the country and the innermost municipalities in the north (from Castelo Branco, Guarda and Viseu districts) are the regions where the estimates are less reliable. The model also has low reliability in the Alto Alentejo and Ribatejo regions (Évora, Portalegre and Santarém districts).

Discussion
This study contributes to the identification of socioeconomic variables related to voter turnout in Portugal and examines how the relationship between those variables and voter turnout varies across the country. Additionally, this research helps to address the lack of studies in Portugal that apply quantitative methods, particularly GWR and SGWR, to this phenomenon.
Overall, the results provide evidence that turnout is a complex process that cannot be satisfactorily explained with a model that does not consider location. On the other hand, there is also a set of variables whose effects do not vary spatially, which highlights the importance of using SGWR, a method that captures spatial stationarity and nonstationarity in the same model (Nakaya 2015).
Using OLS regression, the percentages of family cores with children aged less than 15, graduated residents, classic families and owner-occupied houses are identified as possible significant predictors of voter turnout, all of them positively influencing the phenomenon. Earlier research finds that variables related to population stability (home ownership, residential mobility, residential stability) have an impact on turnout (Squire, Wolfinger, and Glass 1987; Denny and Doyle 2009; Highton and Wolfinger 2001; Plutzer 2002), which is consistent with the inclusion of the variable «percentage of owner-occupied houses» in the model. Given the existence of studies concluding that educational attainment is associated with voting rates (Gallego 2009; Lyons and Alexander 2000; Blais et al. 2004; Huckfeldt and Sprague 1992), the inclusion of a variable related to education, in this case «percentage of graduated residents», is as expected. Likewise, the presence of the variable «percentage of family cores with children aged less than 15» is in line with previous studies that analyze the impact of having children (Denny and Doyle 2009; Plutzer 2002). Finally, «percentage of classic families» is seen as a measure of population concentration, which is widely explored in the literature (Cancela and Geys 2016; Mansley and Demšar 2015; Fornos, Power, and Garand 2004; Kostadinova and Power 2007). Although OLS regression exhibits a positive relationship between each of the explanatory variables and turnout, the literature reports both positive and negative relationships.
The explained variability increases when moving from the global to the local models, as verified in other studies that take this methodological approach (Kavanagh et al. 2006; Taiwo and Ahmed 2015; Mansley and Demšar 2015; Cho and Gimpel 2009; Pattie et al. 2015). However, unlike those authors, who obtained their best model using GWR, in this study the best model is found only when a semiparametric approach is taken. To the best of our knowledge, semiparametric models have not previously been used to study turnout, because studies usually stop at the application of GWR. Usually, the problems associated with the use of OLS in the presence of nonstationarity are no longer an issue once GWR is applied.
The SGWR model contains three global variables (percentage of graduated residents, percentage of classic families and distance to Lisbon or Oporto, whichever is nearest) and two local variables (percentage of family cores with children aged less than 15 and percentage of owner-occupied houses). Percentage of graduated residents (P_Ind_Res5) and percentage of classic families (P_Famili) have a positive relationship with turnout, as also suggested by the OLS regression, while distance to Lisbon or Oporto (DLxPorto) is negatively associated with turnout. For percentage of graduated residents, this positive relationship is confirmed by most studies. Higher educational attainment is related to a higher propensity to vote, and college attendance helps to boost initial turnout (Plutzer 2002). Considering abstention, voting rates are low among citizens with low levels of educational attainment (Lyons and Alexander 2000; Blais et al. 2004; Gallego 2009; Martins 2010), and leaving school leads to a decrease in turnout (Huckfeldt and Sprague 1992). Conversely, there are studies, one of them regarding the Portuguese context, that reveal a negative association between lower turnout and third-level education (Kavanagh et al. 2006). The positive impact of having a degree can be related to the access to information and the ability to interpret it (Fornos, Power, and Garand 2004). In fact, educational attainment is used as a measure of the ability to interpret information (Jalali 2003), but it should not be considered in isolation from other measures (Denny and Doyle 2008).
Overall, studies including a measure of population concentration find no evidence that this variable influences turnout (Cancela and Geys 2016; Fornos, Power, and Garand 2004; Kostadinova and Power 2007). Only Mansley and Demšar (2015) find a positive relationship between population density and turnout. It is important to note that their study concerns a city election, not a national election.
Distance to Lisbon or Oporto, whichever is nearest (DLxPorto), is negatively associated with turnout, suggesting that turnout is lower in the interior of the country, which lies farther from the two main cities, both located on the coast. This variable can be thought of as a measure of urbanization, as can the variable «percentage of classic families». In turn, urbanization is associated with greater exposure to information and greater mobility, two factors that increase participation (Fornos, Power, and Garand 2004). This would mean that turnout is higher for citizens living in the Lisbon or Oporto metropolitan areas. This is not a surprising result, essentially because it refers to a legislative election. Different results might be found in municipal elections, where specific contextual factors and political preferences would probably influence turnout in different ways. The role of peri-urbanization in urban patterns (Rivière et al. 2012; Davezies et al. 2013) should be addressed in future research. As the study by Martins (2010) shows, the impact of the economy on voting differs between national and local government elections, which suggests that the same could happen with other predictors.
Percentage of family cores with children aged less than 15 (P_Nucle_F1) and percentage of owner-occupied houses (P_Res_Hab) are local variables and influence turnout in different ways across the country. With some exceptions, the percentage of family cores with children aged less than 15 has a positive influence in the coastal north, where associativism is stronger, so the sense of belonging and involvement in the local community is also stronger. Religious traditions, which end up increasing involvement in the community, are also more present in the north of the country.
We can also find a positive influence of this variable in the Lisbon metropolitan area and in the West Region, close to Lisbon. Interestingly, some municipalities belonging to the Oporto or Lisbon metropolitan areas constitute exceptions, including the municipality of Lisbon. In the south, the influence of the percentage of family cores with children aged less than 15 is negative, as it is in the interior north. These results are reliable, since the standard errors of this variable are all close to zero.
The effect of the percentage of owner-occupied houses is stronger in the northern region and barely noticeable in the rest of the country. Traditionally, the purchase of a home was a sign of financial and employment stability, conferring a certain social status. Perhaps this is still valued more by voters in the more conservative municipalities of the north than in other regions.
The negative impact found for having children is in accordance with the results reached by Denny and Doyle (2009) when estimating turnout in people's first election. Plutzer (2002) also finds a negative influence when predicting turnout growth, but does not find a significant effect on initial turnout. No positive effect of the presence of children is described in the studies analyzed. For a better understanding of this phenomenon, it could be important to create measures that distinguish between children's ages. The exhaustion and time demands associated with raising young children can influence turnout differently from having school-age children, who expand the networks that would result in greater political knowledge (Plutzer 2002). The influence of this factor on turnout, together with the level of education, should be clarified in future research using field interviews.
The percentage of owner-occupied houses is positively associated with turnout in almost all of the north region and in a coastal area close to Lisbon, including the city. The effect of this variable is close to zero in almost all of the south region, and there is a central area of the country where it is negative, but weak. As with the previous variable, these results can be considered reliable, since the standard errors of this variable are again all close to zero. The studies analyzed are in line with the results obtained for the north region and the coastal area close to Lisbon. The percentage of owner-occupied houses is a measure of population stability, which is, in general, positively related to voting (Geys 2006; Cancela and Geys 2016). Accordingly, Squire, Wolfinger, and Glass (1987) state that home ownership has a strong positive impact on turnout, which can be related to greater community attachment (Plutzer 2002). Some authors analyze another measure of population stability, residential mobility, showing that it depresses turnout (Squire, Wolfinger, and Glass 1987; Kavanagh et al. 2006; Plutzer 2002), which reinforces the previous results. Likewise, Highton and Wolfinger (2001) conclude that residential stability positively influences turnout. In the south, where the percentage of owner-occupied houses does not influence turnout, house prices are more affordable than in urban municipalities. Buying a house in an urban municipality requires a greater effort than buying one in the south. Since it is not affordable to everyone, this economic situation creates a greater social distinction, which, in turn, can influence turnout. It may be the case that economic heterogeneity within geographical areas increases the probability of voting (Bartle, Birch, and Skirmuntt 2017).
It may also be the case that politically motivated students renting apartments or living on university campus in larger cities (e.g., Lisbon or Porto) are more prone to vote than homeowners in distant localities.
Analyses based on municipalities may hide economic and sociodemographic heterogeneity within smaller spatial units, such as parishes, and thus fail to account for their political heterogeneity. This limitation relates to the well-known modifiable areal unit problem (MAUP), whereby aggregating data in different ways may lead to different conclusions. The MAUP has attracted the attention of political geography researchers (Lee and Rogers 2019; Zingher and Moore 2019), but there is no generally accepted solution (Xiao 2021). Because of limited data availability, a sensitivity analysis of the scale effect could not be carried out in this study, but it is recommended for future studies.
Future research could also analyze the results of the 2015 and 2019 elections (if data for these years become available to compute the predictors), to understand whether the identified relationships vary over time. Other variables should be added to the models in an attempt to increase the explained variance. Socioeconomic indicators not available in the Census can be used to cover the topics presented in the literature, namely economic factors and political knowledge and interest. Even for some of the variables included here, different or additional measures can be used, for example concerning population stability (Geys 2006). Homeownership, the variable that measures population stability, might not be the most appropriate in the current Portuguese context, particularly for younger adults, who are affected by soaring house prices in urban areas.
Distance to polling places, which is linked to population density, housing density and the distribution of polling centers, as well as the role of transport from home to the polling place, should also be included in future analyses. The limited mobility of the older population, considering the location of nursing home residents, may also have a negative effect on voter turnout if no special arrangements are put in place.
Another aspect that might be worth investigating is the influence of the day of the week on turnout. Some countries (e.g., Portugal, France) vote on Sundays, when most people are not busy with work, while others vote on weekdays (e.g., Tuesday in the US).
To understand what influences the lower levels of voting in Portugal, it is important to continue exploring the individual characteristics of citizens, since earlier studies show that political and institutional variables add little information in explaining turnout. Moreover, it might be important to extend the research to other indicators of civic participation besides turnout, as explored by Tavares and Carr (2013), to understand whether the variables that influence turnout also influence other forms of civic participation. Given the lack of interest in political issues (Canas 2004), it might be important to understand whether it is reflected only in turnout or also in other forms of participation. An extensive analysis of civic participation might be an indicator of the state of democracy in Portugal.

Conclusions
This study investigates which socioeconomic variables affect voter turnout in continental Portugal and how the relationship between turnout and those variables varies over geographical space. The SGWR model enabled the investigation of local variations in turnout values, while allowing the relationship with some variables to vary over space. Results show that turnout is a complex process, influenced by a set of sociodemographic variables. While some variables affect turnout differently across the country (percentage of family cores with children aged less than 15, and percentage of owner-occupied houses), others affect it uniformly (percentage of graduated residents, percentage of classic families, and distance to Lisbon or Oporto, whichever is nearest). This highlights the value of a semiparametric approach, together with additional variables, for better understanding turnout and for further research on voting apathy.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding
This work was supported by national funds through FCT (Fundação para a Ciência e a Tecnologia) under the project UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC).