The Origins of the Italian Regional Divide: Evidence from Real Wages, 1861–1913

The origins of the Italian North-South divide have always been controversial. We contribute to this debate by estimating a new dataset of real wages (Allen 2001; Allen et al. 2011) from Unification (1861) to WWI. Italy was very poor throughout the period, with a modest improvement from the late nineteenth century. This improvement started in the industrializing regions of the Northwest, while real wages in the other macro-areas remained stagnant. The gap between the Northwest and the South widened until the end of the period. Focusing on the drivers of regional trends, we find that human capital formation exerted a strong positive effect on the growth of real wages.

an improvement of the Italian level in comparison with less developed countries in other continents. Within Italy, we find that the North-South divide was already sizeable at the time of Unification but that an acceleration in the growth of real wages in the Northwest took place only after 1900.
In the second part of the article, taking advantage of our new series, we run a simple conditional growth regression model explaining the rise in real wages by province from the early 1870s to the eve of WWI using as predictors human capital, domestic market potential, endowment of natural resources and of infrastructure, and social capital at the beginning of the period. We find evidence of conditional convergence, with human capital, as proxied by initial literacy rates, playing a major role in explaining changes in WRs.

LITERATURE REVIEW
The debate on the causes of the North-South gap, the so-called "questione meridionale," is almost as old as Italy as a unified state (Felice 2007). 1 The debate has certainly been rich in ideas and conjectures, as testified, for example, by the opposing views proposed by Luciano Cafagna (1962, 1971) and Edmondo Capecelatro and Antonio Carlo (1972). The former argued that the Northwest industrialized because it had much greater development potential than any other region, with minimal economic interactions with the South, while the latter denied the existence of a North-South gap at the time of the Unification. The Cafagna interpretation implies that the gap was already large and possibly centuries old, while for Capecelatro and Carlo (1972) it was created by the policies of the newly unified Italy. However, until recently, the debate remained unsettled for lack of data. Early estimates of regional GDP (Eckhaus 1961; Zamagni 1978; Esposto 1997) were very tentative and were mostly ignored by the literature, which relied almost exclusively on anecdotal evidence.
The debate on the causes of the gap has been reinvestigated in recent years. Brian A'Hearn and Anthony Venables (2013), in a broad new economic geography framework, have suggested that the North was richer than other regions at the time of the Unification thanks to its geographical advantages. It had more water, it was better suited to the production of silk, Italy's main staple product, and, after 1890, it benefitted from protectionist policies and increasing market access. In the same year, Emanuele Felice (2013) argued for a major role played by institutions along the lines of the Daron Acemoglu and James Robinson (2012) dichotomy of "inclusive" versus "extractive" institutions. The Southern élites resisted any change that could jeopardize their political power, most notably investment in education and health (Felice and Vasta 2015). In contrast, Carlo Ciccarelli and Stefano Fenoaltea (2013) tentatively suggested that market-integrating policies, namely the abolition of borders and railway construction, could explain industrial growth at least in some provinces of the South after the Unification.
These conjectures have been subject to econometric testing using three different measures of performance: GDP per capita by region (Brunetti, Felice and Vecchi 2011), the share of industrial occupation from population censuses (Ciccarelli and Missiaia 2013), and labor productivity in manufacturing by province (Ciccarelli and Fenoaltea 2013). Felice (2012) uses the first (GDP per capita) to argue that the growing North-South divergence before WWI depended on differences in the endowment of human capital. This hypothesis is also supported by the results presented by Gabriele Cappelli (2016) and Cappelli and Michelangelo Vasta (2017) on the positive effect on school enrollment in the South of the Daneo-Credaro Law (1911), which shifted the organization and the funding of primary schools from local authorities, often very poor, to the state. Other works have stressed the role of geographical factors and especially of market potential. Anna Missiaia (2016) finds some evidence of a positive role of domestic market access in the development of the North that was, however, offset by better access to the world market for the South. Vittorio Daniele, Paolo Malanima, and Nicola Ostuni (2018) confirm the positive effect of market access on the share of industrial occupation by province for benchmark years from 1911 to the present, but in their estimates the North had greater domestic and foreign market potential than the South. Roberto Basile and Ciccarelli (2018) also find a positive effect of market potential and literacy on the provincial location of different industrial sectors.
All studies on the causes of productivity growth in manufacturing single out human capital as a key driver, but they disagree on the role of other determinants. Cappelli (2017) and Alessandro Nuvolari and Vasta (2017) find evidence of a positive role of innovative activity, measured by patent data, while Ciccarelli and Stefano Fachin (2017) identify social capital as a major cause. However, this latter result is not confirmed by Cappelli (2017) using different measures of social capital. In all these contributions, other variables, such as infrastructures, water resources, and market access, tend not to be significant.
These quantitative articles have undoubtedly increased our understanding of the origins and causes of the Italian regional divide, but, unfortunately, all these contributions are still based on somewhat fragile data. The regional GDP data refer only to benchmark years and, more importantly, they do not cover the key period after Unification; moreover, they are quite controversial as shown by the recent debate between Felice (2014) and Malanima (2014a, 2014b). The estimates of labor productivity not only rely on some specific assumptions, but also industry accounted only for one-fifth of Italian GDP in 1891 and 1911 (Rey 2002, Tabs. 2 and 3).
When data on economic performance are missing or dubious, proxies have been used. Looking at the results of recent works, we find the North well ahead of the South on social indicators such as heights (A'Hearn and Vecchi 2017), life expectancy, and the Human Development Index (HDI) (Felice and Vasta 2015), and also on social capital (Felice 2012; Cappelli 2017). 2 Moreover, the strikingly large and persistent differences in literacy rates are particularly relevant given the key role of human capital for economic growth. All these data would support the traditional thesis about the causes of the North-South divide, but they are surely not conclusive evidence, as the correlation between economic performance indicators and social indicators is far from perfect.
In this context of weak data, to provide insights on the origins of the North-South divide, one can follow the stream of international literature that has used real wages as a proxy for economic performance. Until now, there were no series of real wages by region for the period from Unification to WWI. There were several national series, from the pioneering work by Alberto Geisser and Effren Magrini (1904), to the more recent works by Vera Zamagni (1984, 1989) for industrial workers from 1890 to 1913, Giovanni Federico (1994, p. 574) for female silk reelers from 1861 to 1913, and Fenoaltea (1985, 2002) for construction workers. There is one set of provincial wages, for construction workers, but it covers only the period 1862-1878 (Daniele and Malanima 2017). 3 Consistent with their overall view of the North-South divide, the authors find no evidence of a North-South gap immediately after the Unification. If anything, wages of construction workers were higher in the South (including islands) than in the North-Centre, although the difference was within the margin of error of the estimates. Daniele and Malanima find that wages in the North-Centre rose relative to Southern wages from the late 1860s.
Against this background, the present article is the first systematic attempt to deal with trends in provincial wages for the whole first phase of Italian industrialization from the Unification to WWI.

SOURCES AND METHODS
We estimate Italian provincial real wages according to the original Allen (2001) method. This implies calculating the annual income of a worker, taking into account wages and working days, and dividing it by the cost of a bare bones basket. Allen (2001) defines the welfare ratio (WR) as:

$$WR = \frac{W \times N}{D \times \sum_{j} P_j Q_j}$$

where $W$ is the daily wage of a male worker, $N$ is the number of days worked, $D$ is the number of members of the household in consumption units, $P_j$ is the price of the j-th good, and $Q_j$ is the fixed quantity of the j-th good.
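As a minimal sketch of the definition above (not the authors' code), the WR can be computed as follows. The function names, the three-good basket, and all prices and wages are hypothetical and chosen only for illustration; the value of 3.15 for D follows the Allen convention of three consumption units plus a 5 percent markup for rent.

```python
from typing import Mapping


def basket_cost(prices: Mapping[str, float], quantities: Mapping[str, float]) -> float:
    """Annual cost of one basket: the sum over goods j of P_j * Q_j."""
    return sum(prices[good] * qty for good, qty in quantities.items())


def welfare_ratio(
    daily_wage: float,
    days_worked: float,
    consumption_units: float,
    prices: Mapping[str, float],
    quantities: Mapping[str, float],
) -> float:
    """WR = (W * N) / (D * sum_j P_j * Q_j)."""
    annual_income = daily_wage * days_worked
    household_cost = consumption_units * basket_cost(prices, quantities)
    return annual_income / household_cost


# Hypothetical basket and prices (NOT the article's data).
quantities = {"maize": 200.0, "beans": 20.0, "oil": 5.0}  # units per year
prices = {"maize": 0.40, "beans": 0.50, "oil": 2.00}      # lire per unit

wr = welfare_ratio(
    daily_wage=1.5,          # lire per day (hypothetical)
    days_worked=250,         # Allen's standard working year
    consumption_units=3.15,  # 3 consumption units plus 5 percent for rent
    prices=prices,
    quantities=quantities,
)
print(round(wr, 2))
```

With these invented inputs the basket costs 100 lire per consumption unit, so the household needs 315 lire against an annual income of 375 lire.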
If WR = 1, the male breadwinner's wage is exactly sufficient to sustain the household. Allen (2001) suggested using two sets of WRs, corresponding respectively to mere subsistence (the "bare bones" basket) and to a slightly better standard of living (the "respectable" basket). The former is designed to give each consumption unit the minimum amount of food for work, at the lowest possible cost, plus the barest minimum for lodging, clothing, and fuel. Allen suggested a minimum of 1,940 calories and, lacking information, assumed 250 days of work (5 days for 50 weeks) and an average household of four members (the male breadwinner, his wife, and two children), for a total of three consumption units. He then added rent as a markup of 5 percent to the cost of the basket, yielding a total of 3.15 baskets per household. This has become an international standard. 4

We estimate separate series for 69 provinces (for a detailed list, see Table A1 in Online Appendix A), which we aggregate by region and macro-area (Northwest, Northeast, Centre, South, Islands) as:

$$WR_{area} = \sum_{i} s_i \, WR_i$$

where $s_i$ is the share of the i-th province in the total population of the relevant area (region or macro-area) according to population censuses (MAIC 1864-1865, 1874-1876, 1885, 1901-1904, 1914-1916), linearly interpolated. All our parameters, except the number of members of households (D), are, in principle, province specific. 5 In particular, we use the information on the number of days worked in each province as reported in an official enquiry of the early 1870s (MAIC-DGA 1876-1879). 6 The provincial data range from a minimum of 192 working days (Cagliari) to more than 300 in a few provinces. The national average, both simple (251.3) and weighted by population (253.3), is actually very close to Allen's standard of 250 days of work per year, which we use for the international comparisons.
7 However, these differences would not matter substantially for annual income, since the correlation between the two versions of annual income (with the Allen standard and with province-specific data on working days) is 0.925. Our bare bones baskets differ across provinces in order to take into account traditional differences in dietary habits (Betri 1998; Teti 1998). 8 Northerners used butter rather than oil and ate much more polenta (a kind of maize gruel) than Southerners, as shown by the composition of gross output of cereals (Federico 1992, 2000). Therefore, we use different bare bones baskets for: (i) Northern regions that were "regular" consumers of maize; (ii) Northern regions that were "intensive" consumers of maize; (iii) Central regions whose diet included some maize; and (iv) Southern and Central regions where maize was not part of the diet (Table 1). Furthermore, we take into account differences in climate across regions by assigning a larger quantity of firewood to the cooler areas in the North. For international comparisons only, we have constructed a simplified bare bones basket as in Allen et al. (2011), which assumes that the calories come mainly from the cheapest cereal, with the minimum amount of other food items to provide the physiological requirements of fats and proteins for survival. Thus, in this particular exercise, we substitute all calories from wine and eggs, and half of those from butter and olive oil, with the cheapest cereal available in the different macro-areas, maintaining the total amount of 1,940 calories.

5 As we already mentioned, we have decided to keep the number of members of the household (D) equal to 4, corresponding to 3.15 consumption units, in order to allow an international comparative perspective. This value is in line with the data on the number of Italian household members, which are available only from the 1911 Census (MAIC 1914-1916). The average was 4.58 and the median was 4.65. The provincial figures ranged from the minimum of 3.75 of Porto Maurizio (Liguria) to the maxima of some Venetian provinces such as Treviso (6.84), Padua (6.25), and Rovigo (5.80). As a rule, provinces with the largest families, such as the Venetian ones, were characterized by a large number of agricultural households with more than one adult working man.
6 The provincial number of working days was obtained by taking simple averages of the number of working days for the different locations reported by MAIC-DGA (1876-1879).
7 For a recent study on the impact of working days on real wage trends in the long run in England, see Humphries and Weisdorf (2017).
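The population-weighted aggregation of provincial WRs described earlier in this section, with census populations linearly interpolated between benchmark years, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the helper names are hypothetical, and the two provinces, their populations, and their WRs are invented.

```python
import numpy as np


def interpolate_population(census_years, census_pop, years):
    """Linearly interpolate each province's population between census years.

    census_pop has shape (n_censuses, n_provinces); the result has shape
    (len(years), n_provinces).
    """
    census_pop = np.asarray(census_pop, dtype=float)
    columns = [
        np.interp(years, census_years, census_pop[:, k])
        for k in range(census_pop.shape[1])
    ]
    return np.column_stack(columns)


def aggregate_wr(wr, pop):
    """Area WR per year: sum_i s_i * WR_i, with s_i = pop_i / total population."""
    wr = np.asarray(wr, dtype=float)
    pop = np.asarray(pop, dtype=float)
    shares = pop / pop.sum(axis=1, keepdims=True)
    return (shares * wr).sum(axis=1)


# Hypothetical area with two provinces and two census benchmarks.
census_years = [1861, 1871]
census_pop = [[100.0, 300.0],   # populations in 1861 (invented)
              [200.0, 200.0]]   # populations in 1871 (invented)
years = [1861, 1866, 1871]

pop = interpolate_population(census_years, census_pop, years)
wr = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]  # provincial WRs held fixed
area_wr = aggregate_wr(wr, pop)
print(area_wr)
```

In this example the provincial WRs never change, yet the area WR falls over time because the population share of the low-WR province rises; changes in weights alone can move an aggregate series.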
We estimate daily wages ($w_i$) and prices ($p_{ij}$) from a variety of (mostly official) sources, with details reported in Online Appendix B. We use two main sources for nominal wages of unskilled workers: an enquiry on wages paid by the state for public works (MAIC-DGS n.d.) and the monthly Bollettino dell'Ufficio del Lavoro (MAIC ad annum). From the former, we obtained yearly averages of daily wages for navvies (terraiolo) in all Italian provinces (except Parma) from 1862 to 1878. In contrast, we have had to estimate the yearly income of unskilled agricultural workers by using monthly wage data for a large number of specific agricultural tasks, from 1905 onwards, from the Bollettino dell'Ufficio del Lavoro, combining these data with information about the composition of agricultural gross output and the number of days worked for each product given the prevailing technology. 9 Although our series refer to two different sectors, we are confident that the market for unskilled workers was integrated enough for this exercise. Our estimates for the period 1879-1904 ought to be considered more tentative. We were able to find suitable wage data only for 27 provinces (5 in the Northwest, 2 in the Northeast, 1 in the Centre, 12 in the South, and 7 in the Islands). We test the size of the potential bias by computing a new wage series for these 27 provinces in 1862-1878 and 1905-1913 and comparing it with our baseline series (with all 69 provinces). The series match in the South and Islands (coefficients of correlation 0.997 and 0.994, respectively), are very similar in the Northwest (0.988) and in the Northeast (0.966), and show a lower but still strong correlation in the Centre (0.893).
We perform two additional robustness checks on the level of wages, relying on two other official publications: the already quoted enquiry on agricultural wages in the early 1870s (MAIC-DGA 1876-1879) and an enquiry on wages of construction workers in 1906 (MAIC 1907). 10 We compute the ratios to our wages, weighting the provincial data by population (Table 2). Results are satisfactory: the nationwide gap is rather small and constant over time, and the regional ratios also do not differ much from 1, with the exceptions of the Islands in 1870 and of the Northwest in 1906. This suggests a fairly high degree of integration in the local labor markets.

Our main sources for prices are MAIC-DGS (1886) for 1862-1873, the weekly Bollettino settimanale dei prezzi (MAIC-DGS ad annum) for 1874-1896, and MAIC (1914) for 1897-1913. MAIC-DGS (1886) reports wholesale prices for wheat, wine, olive oil, and corn and the retail price of meat from 1862 to 1885 for a varying number of provinces, up to 23 for wheat. 11 The Bollettino covers all provinces and reports retail prices of bread and meat and wholesale prices of wine, corn, olive oil, and firewood from 1880. MAIC (1914) reports the prices paid for bread, wine, olive oil, meat, butter, and eggs by the Convitti nazionali (boarding schools), which were probably somewhat lower than retail prices for ordinary consumers. When direct observations of bread prices are lacking, we convert wheat prices into bread prices by estimating a "bread equation" (Allen 2001), which regresses the price of bread in province i ($Pbread_i$) on the price of wheat in the same province ($Pwheat_i$), together with provincial ($province_i$) and yearly ($year_j$) dummies. We estimate the bread equation with data on bread and wheat prices for the period 1880-1896 (MAIC-DGS ad annum) with different specifications. Our preferred estimation yields a coefficient on the wheat price of b = 0.485 (Table B1 in Online Appendix B).
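To illustrate how a bread equation of this kind can be estimated, the sketch below runs ordinary least squares of bread prices on wheat prices with province and year dummies. It is not the authors' specification: the functional form (linear in levels rather than in logs) is an assumption, the function name is hypothetical, and the data are synthetic, generated without noise so that the slope is recovered exactly.

```python
import numpy as np


def fit_bread_equation(p_bread, p_wheat, province, year):
    """OLS of bread price on wheat price plus province and year dummies.

    Returns the estimated coefficient b on the wheat price.
    Assumption: a linear-in-levels specification.
    """
    p_bread = np.asarray(p_bread, dtype=float)
    p_wheat = np.asarray(p_wheat, dtype=float)
    province = np.asarray(province)
    year = np.asarray(year)

    # One dummy per province and per year; the first level of each is
    # dropped to avoid perfect collinearity with the intercept.
    prov_levels = np.unique(province)
    year_levels = np.unique(year)
    prov_dummies = (province[:, None] == prov_levels[None, 1:]).astype(float)
    year_dummies = (year[:, None] == year_levels[None, 1:]).astype(float)

    X = np.column_stack([np.ones_like(p_wheat), p_wheat, prov_dummies, year_dummies])
    coef, *_ = np.linalg.lstsq(X, p_bread, rcond=None)
    return float(coef[1])


# Synthetic panel: 3 provinces observed in 4 years (all values invented).
rng = np.random.default_rng(0)
provinces = np.repeat(np.arange(3), 4)
years = np.tile(np.arange(1880, 1884), 3)
wheat = rng.uniform(0.20, 0.35, size=12)
province_effect = np.array([0.00, 0.02, -0.01])[provinces]
year_effect = np.array([0.00, 0.01, 0.02, 0.00])[years - 1880]
bread = 0.10 + 0.485 * wheat + province_effect + year_effect

b_hat = fit_bread_equation(bread, wheat, provinces, years)
print(round(b_hat, 3))
```

Once the coefficients are estimated, a missing bread price can be predicted from the observed wheat price and the relevant province and year effects.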
We estimate regional prices for fava beans by applying the difference in levels in the 1850s (Bandettini 1957; Felloni 1957; Delogu 1959) to the nationwide series from ISTAT (1958). Unfortunately, we have not been able to find regional prices for lamp oil, candles, soap, and cotton cloth. We use the series from ISTAT (1958) for the first three items, while, for cotton cloth, we adjust the price of cotton yarn from Ernesto Cianci (1933) for 1870-1913 and then extrapolate the resulting series back to 1862 using the price of raw cotton in the United Kingdom from Brian Mitchell (1988). Using a single nationwide series for these items of the bare bones basket in all provinces might result in some spurious reduction of the variance of our price index. However, these goods accounted for a very small share of total expenditure and, thus, the distortion is likely to be very small. Finally, following Allen (2001), we add 5 percent to the cost of the basket for rent.
Summing up, we have fairly detailed and reliable data on prices for the whole period, and, in particular, for the period 1874-1896. In contrast, the data on nominal wages are complete for the initial (1862-1878) and final (1905-1913) periods, but for the intervening period they are the result of the collation of somewhat heterogeneous sources, and, for this reason, as we have already noted, are more tentative. Furthermore, by construction, the Allen method rules out substitution among goods when relative prices change. Therefore, yearly series are bound to fluctuate widely when individual prices of major items in the basket are characterized by high volatility.

TRENDS IN REAL WAGES IN COMPARATIVE PERSPECTIVE
In Figures 1A and 1B, we provide an international comparative perspective on Italian living standards based on 250 working days and the simplified bare bones basket described in the previous section. 12 Figure 1A shows a large gap in WR between Italy and the most developed European countries, here represented by Allen's estimates for three large cities. Up to the early 1880s, the Italian WR remained around 1.2; that is, an unskilled laborer working full time could earn a little more than what was necessary to support his family at the minimum subsistence level. The WR reached 1.5 for the first time in 1888 and fluctuated around 1.6-1.7 until the early 1900s. Thereafter it started to grow, with some acceleration, but, on the eve of WWI, the WR was still less than 2. As a result, the gap with the most advanced countries, where the WRs increased considerably, had further widened. The WR for Milan shows that the Northern city moved somewhat above the nationwide averages. 13

Italy was quite poor even when compared with other European peripheral countries, such as Austria, and with less developed countries on other continents (Figure 1B). The Italian WR remained at the same level as most of these countries until the 1880s. Its growth since the 1890s brought it within reach of, and in some cases above, the urban wages in Japan and China. On the eve of WWI, Italy was slightly above the level of all less developed countries in the sample. This result may seem surprising, but this low level of real wages, as well as its upward trend, is consistent with the estimates by Daniele and Malanima (2017). 14 Furthermore, our findings are consistent with the available evidence on heights (Federico 2003; Peracchi 2008; A'Hearn and Vecchi 2017; for an international comparison, Baten and Blum 2014).

FIGURE 1A
Sources: Authors' own elaborations on data kindly provided by Allen, previously presented in Allen (2001) and in Allen et al. (2011).

FIGURE 1B WR IN COMPARATIVE PERSPECTIVE: ITALY VERSUS LESS DEVELOPED COUNTRIES
Notes: All data assume 250 working days per year. We double the WR for Vienna, because the data refer to a "respectable" basket, which should be worth twice the bare bones basket according to the Allen method. Actually, the basket for Vienna is richer than Allen's "respectable" one, and, indeed, Cvrcek (2013, footnote 17) warns that his estimates might be slightly undervalued.
Sources: Authors' own elaborations on data from Allen et al. (2011), Cvrcek (2013), and Cha (2015).
The gap between Italy and the advanced countries was much smaller in GDP per capita than in WRs, and Italy's GDP per capita was significantly higher than the Japanese and, above all, the Chinese figures. 15 This difference in Italy's relative position in terms of GDP and WR can be accounted for by several factors, such as a lower labor income share, a higher labor supply per capita, significant differences in workforce composition in terms of skills, or differences in the ratio between the prices of wage goods and the implicit deflator of GDP (Angeles 2008). Regardless, in spite of the modest improvements in the pre-WWI period, Italian workers remained very poor throughout the entire period from 1861 to 1913.

13 It is worth noting that our estimates seem in line with previous contributions. For example, Malanima (2013b) shows that in 1913 wages of building workers in Northern and Central Italy were about 30 percent of the corresponding English wages. As for skilled workers, according to Zamagni (1989, p. 119), Italian industrial real wages in 1913 were about 47 percent of the corresponding British wages and 70 percent of the German ones.
14 Daniele and Malanima (2017, tab. 8) estimate a daily average of 1.23 baskets per person for navvies in 1862-1878. They assume a basket delivering 3,044 calories, that is, 56 percent higher than ours. Thus, their estimate is based on a basket that, in terms of calories, corresponds to 1.91 of our basket and, therefore, to a WR of 0.60. This figure might appear implausibly low, but one has to remember that Daniele and Malanima's basket features wheat bread rather than the cheaper polenta (maize) and, thus, is not strictly speaking a bare bones basket. Furthermore, rather than estimating a bread equation, they convert wheat prices into bread prices using an area-specific fixed coefficient, which they assume to be higher in the North-Centre (1.7) than in the South (1.4).
15 According to the data of the Maddison project release 2013 (Bolt and van Zanden 2014), the Italian GDP per capita in 1913 was 46 percent of the British one, 57 percent of the Dutch one, 63 percent of the German one, 67 percent of the Austrian one (at 1995 boundaries), and 77 percent of the Chilean one, while it exceeded the GDP per capita of Japan by 66 percent and was about four times higher than the Chinese one.

REAL WAGES AND THE ITALIAN REGIONAL DIVIDE
In Figure 2, we present our main estimates of the WRs for the Italian macro-areas. The gap between North (both West and East) and the Continental South was already sizeable at the time of the Unification and real wages remained more or less flat in the following 20 years. Thus, our results seem to be more in line with the conventional wisdom than with the revisionist approach endorsed by Malanima (2007, 2017). In particular, we do not find any support for the notion of a sudden and drastic impoverishment of the South due to the Unification (Capecelatro and Carlo 1972).
Real wages in the Northwest started to grow from the 1880s, most likely as a consequence of the early industrialization of the "industrial triangle." The trend accelerated at the turn of the century, peaking in 1910-1911 around 1.75. 16 In contrast, in other macro-areas, real wages fluctuated without any clear trend until the first years of the twentieth century. From 1905 to 1913, WR in the Islands increased considerably (59.9 percent), fully recovering after a previous collapse; WR also increased substantially in the Northeast (15.6 percent), the Centre (17.2 percent), and the South (10.3 percent).
The cases of the Islands and Centre deserve some additional comments. The very low WRs in the Centre are consistent with the evidence on incomes for sharecroppers, who accounted for a large majority of the agricultural workforce, in the early twentieth century. 17 Sharecroppers received incomes in kind, such as lodging, and they had an implicit right to be subsidized in case of distress. Furthermore, market wages were reduced by the supply of labor from members of sharecropping households moonlighting for casual work. A conceptually similar argument can account for the relatively high level of WR in the Islands, which is higher than that prevailing in the continental South. In this case, as also suggested by Daniele and Malanima (2017), the pattern of settlement of the agricultural workforce in very large agglomerations ("agro-towns"), typical of extensive cultivation, prevented women from seeking agricultural employment. 18 These low employment rates for women lead to higher wages for men in order to guarantee the survival of the household. 19 Indeed, the gender ratio of agricultural workers (females over males) for the Islands according to population censuses was 0.13 in 1871 and declined to 0.11 in 1911, while in the rest of the country it increased from 0.63 to 0.73 (MAIC 1874-1876; Vitali 1968).

16 The later decline in 1912-1913 reflects a sharp rise in wine prices.
17 The average yearly income per adult male laborer unit was 251 lire in a sample of 52 Tuscan farms for 1891-1900 (Linari 1902), 485 in Valdelsa in the province of Siena in 1896, and 489 in Valdarno and 396 in Pistoia in 1895 in the province of Firenze (Guicciardini 1907). We estimate a yearly wage of 392 lire in Tuscany in the 1890s.
18 As late as 1911, the share of population living in municipalities with more than 10,000 inhabitants was 72.9 percent in Sicily, 42.4 in the Continental South, and 38.9 in the Kingdom of Italy without Sicily (MAIC 1914-1916, Relazione finale, table VII*). The concentration in agro-towns increased the distance between fields and houses so that workers had to cover long distances and spend several days in a row in the fields.
Furthermore, the series for the Islands exhibits a sharp rise in the early 1870s, exceeding the level of the Northwest in 1872. We interpret this peak as an outcome of the substantial investment in public works, and especially in railways. During the 1860s railway lines were opened for the first time in Sardinia and the Sicilian network was greatly expanded, so that the ratio of new lines to population in the Islands was the highest in Italy in 1868-1873 (Table 3). This resulted in a very tight market for construction workers (Daniele and Malanima 2017). The figure in Table 3 corresponds to about one kilometer of new railway built for every 9,200 inhabitants in the Islands, versus one kilometer for every 22,600 inhabitants in the whole country (and 133,000 in the Northeast). This can also account for the 17 percent gap between construction and agricultural wages prevailing in 1870 in the Islands (Table 2).
FIGURE 2 WR FOR UNSKILLED WORKERS
Sources: Authors' own elaborations (see text and Online Appendix A).

The discussion so far has focused on macro-areas, but, as strongly stressed by several authors (Salvemini 1984; Pezzino 1987, 1990), there were some dynamic areas within the South, while in the Northwest there were agricultural areas hardly touched by industrialization. We explore differences within macro-areas by mapping WR by province in 1871 (the first year in which Italy had the 1911 borders) and in 1911 (Figure 3). 20 The provincial maps (Figure 3) show that divergences within regions were still quite significant in 1871. 21 Most Northwest provinces show comparatively high WRs, but several other provinces, scattered all over the country, also have relatively high levels of WR. By 1911, there is a clear North-South gradient and the provinces with (relatively) high WRs are found exclusively in the North. The data show sizeable differences within macro-areas, as is evident from the yearly series of WR by region (Figure A1 in Online Appendix A). For instance, the increase in WR from 1905 to 1913 was much more impressive in Sicily (+69 percent) than in Sardinia (+17 percent), while the overall modest growth in the Continental South was determined by wide and largely uncorrelated fluctuations in the underlying provincial WRs. In the Northwest, WRs grew fairly steadily in Piedmont and Lombardy, while in Liguria they remained broadly constant, at a rather high level by Italian standards, in the 1860s and 1870s, and boomed in the pre-war years. In the long run, the coefficient of variation of regional WRs declined by a couple of points, from 0.212 in 1870-1878 to 0.194 in 1905-1913. Interestingly, the coefficient of variation was stable and also very similar in Austria-Hungary (0.195 in 1870-1878 and 0.198 in 1905-1910) (Cvrcek 2013).
In Table 4, we explore this decline by reporting the population-weighted coefficients of variation by macro-area. The coefficient of variation dropped from 0.302 in 1863 to 0.257 in 1911 for Italy (sigma-convergence). 22 Similar trends are evident in the Centre, in the Northwest, and in the Northeast, while there are mixed trends in the South and in the Islands.
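A population-weighted coefficient of variation of the kind reported in Table 4 can be sketched as follows. The exact weighting formula used in the article is not given in the text, so the version below (weighted standard deviation over weighted mean) is an assumption, and the three provinces with their WRs and populations are invented for illustration.

```python
import numpy as np


def weighted_cv(values, weights):
    """Coefficient of variation: weighted standard deviation over weighted mean."""
    values = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalize to population shares
    mean = np.sum(w * values)
    std = np.sqrt(np.sum(w * (values - mean) ** 2))
    return std / mean


# Hypothetical WRs for three provinces in two benchmark years.
population = [1.0, 1.0, 2.0]              # invented weights
wr_early = [1.00, 1.00, 2.00]
wr_late = [1.25, 1.25, 1.75]

cv_early = weighted_cv(wr_early, population)
cv_late = weighted_cv(wr_late, population)

# Sigma-convergence means the dispersion measure declines over time.
sigma_convergence = cv_late < cv_early
print(round(cv_early, 3), round(cv_late, 3), sigma_convergence)
```

In this invented example the weighted mean is unchanged between the two years, so the fall in the coefficient of variation reflects only the narrowing of provincial differences.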
So far we have focused on the WR as a proxy for the standard of living of laborers, but real wages are also often used as a proxy for GDP. Figures 4A and 4B show regional GDP per capita and real wages. GDP and real wages can differ for a number of factors relating to income distribution, characteristics of labor supply, and relative prices. A comprehensive analysis of the relative contribution of these factors is beyond the scope of this article. However, we can provide a rough glimpse of the role of income distribution by comparing our estimates with the estimates of Gini coefficients for the Centre-North and the South by Nicola Amendola, Andrea Brandolini, and Giovanni Vecchi (2011, fig. 7.9). The relative position of our WR estimates for the Centre-North and the South (circles in Figures 4A and 4B) and their movements over time are broadly consistent with these Gini coefficients. Indeed, in 1871, income distribution was more unequal in the North-Centre than in the South. Subsequently, by 1911, inequality had declined in the North-Centre and increased in the South. Needless to say, all inferences for 1871 are speculative given the underlying fragility of the GDP estimate in that year.

20 We have chosen these years because we have coverage for all provinces (see Online Appendix A). Likewise, the regional series for Liguria, Marches, Umbria, Latium, Basilicata, and Sardinia feature gaps in 1879-1904 because we have been unable to find wage series for any province in those regions.
21 We are using a different set of thresholds because otherwise the 1911 figure would appear too uniform.
22 Sigma convergence (divergence) is defined as a decline (increase) in the coefficient of variation (the ratio of the standard deviation to the mean in a given year). See, for example, Young, Higgins, and Levy (2008).
Finally, in Figures 5A and 5B (constructed using histograms with Italy = 100), we compare our WRs for 1871 and 1911 with HDI, as a broader measure of living standards (Felice and Vasta 2015), and literacy rates, including also GDP per capita for completeness. Overall, both in 1871 and 1911, there is a broad correlation between HDI and literacy on one side and WRs on the other, with the two previously discussed exceptions: the high wages in the Islands in 1871 and the relatively low wages in the Centre in 1911. Additionally, it is worth noting what is perhaps the most important feature of Figure 5A, namely the extremely high literacy rate already achieved in the Northwest in 1871. 23

CHANGES IN WR: PROXIMATE AND ULTIMATE CAUSES
We start our analysis of growth in the WR by decomposing the overall change into changes in wages and changes in prices. In Table 5, we report the average annual growth rate of the WR as the difference between the annual growth of nominal wages and the annual growth of nominal prices. 24 Figures in bold indicate the prevailing determinant in each macro-area for three different sub-periods. The results highlight a substantial difference between periods. Over the whole period, and especially during the Giolittian boom (1895-1913), the WR increased because wages grew despite rising prices. With the notable exception of the Northeast, the rise in wages also accounted for the very modest increase in the period 1862-1880. In contrast, in 1880-1895, the WR rose mostly because of the decline in world cereal prices, which cut the cost of the bare bones basket in spite of protection on wheat.
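As a rough illustration of this decomposition, the sketch below uses invented index values rather than the Table 5 series. It assumes continuously compounded (log) growth rates, for which the identity "WR growth = wage growth minus price growth" holds exactly; the rule used to label the prevailing determinant (the component with the larger absolute growth rate) is also an assumption, since the text does not spell it out.

```python
import math

def annual_log_growth(start, end, years):
    """Average annual continuously compounded growth rate between two observations."""
    return math.log(end / start) / years

# Hypothetical nominal wage and basket-cost indices at the start and end of a sub-period.
wage_start, wage_end = 100.0, 150.0
price_start, price_end = 100.0, 120.0
years = 18

g_wage = annual_log_growth(wage_start, wage_end, years)
g_price = annual_log_growth(price_start, price_end, years)

# Growth of the welfare ratio as the difference of the two components.
g_wr = g_wage - g_price

# Cross-check: growth computed directly on the ratio wage / price.
g_wr_direct = annual_log_growth(wage_start / price_start, wage_end / price_end, years)

# Assumed labeling rule for the "prevailing determinant".
driver = "wages" if abs(g_wage) > abs(g_price) else "prices"
```

With simple (non-log) annual rates the identity holds only approximately, which is one reason a log specification is convenient for this kind of table.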
The traditional historiography has interpreted the patterns of Table 5 as driven by two proximate causes: industrialization in the Northwest and emigration in the rest of the country, the latter accounting in particular for the rise of wages in the South and in the Islands in the last sub-period (Taylor and Williamson 1997; Hatton and Williamson 1998). Alan Taylor and Jeffrey Williamson (1997) argue that emigration was a key factor mitigating Italian GDP divergence relative to the European core in the pre-WWI period.
However, the literature surveyed earlier highlighted five major possible "ultimate" drivers of the growth of real wages: human capital, resource endowments, market potential, infrastructure, and social capital. Here, we provide an appraisal of the relative strength of these factors by using a simple growth regression framework, which allows us also to assess possible trends of convergence as a result of growing integration of labor markets. Ideally, one should have adopted a panel dynamic approach in order to have more precise estimates and to limit possible concerns about endogeneity (Durlauf, Johnson, and Temple 2005). Unfortunately, we are constrained to use a long-run specification with co-variates for the initial year (1871), because we lack a full provincial coverage of real wage observations from 1879 to 1904. Accordingly, we estimate the following equation using a provincial cross-section:

$$\Delta WR_{1871\text{-}1911} = a + b \cdot \ln(WR_{1871}) + c \cdot \ln(Literacy_{1871}) + d \cdot \ln(SocCap_{1871}) + e \cdot \ln(MktPot_{1871}) + f \cdot \ln(Rail_{1871}) + g \cdot \ln(Water) + \varepsilon,$$

where $\Delta WR_{1871\text{-}1911}$ is the average compounded growth rate of WR. We have the following co-variates: i. Literacy 1871 is our measure of human capital and it has been retrieved from MAIC (1874); ii. SocCap 1871 is the index of "cooperative norms" constructed by Cappelli (2017), which is computed as the average of donations per capita to Opere Pie (charities) and of the number of mutual aid societies per capita, both relative to the Italian average; iii. MktPot 1871 is the measure of domestic market potential at provincial level constructed by Basile and Ciccarelli (2018); iv. Rail 1871 is the kilometers of railway per square kilometer from Ciccarelli and Peter Groote (2017); v. Water is our measure of the natural resource endowment. In this context, the literature has mostly emphasized the role of water resources that provided some geographical areas with an enhanced "attractiveness" for industrial activities (A'Hearn and Venables 2013). Following Nuvolari and Vasta (2017), we proxy this factor using the (yearly) flow of rivers, canals, and streams in the province (measured in m³/s). The source of this variable is the website: www.acq.isprambiente.it/pluter/ (see Nuvolari and Vasta (2017) for further details).
24 We minimize the risk of spurious results arising from the selection of specific years for defining the time-intervals of Table 5 by using Hodrick-Prescott filtered series of wages and prices and computing the corresponding WR.
We expect these covariates to exert a positive impact on the growth of WR. Consistent with standard growth regression exercises, we also include in our specification the initial value of real wages (WR 1871 ). The sign of this variable is undetermined a priori: a negative sign will indicate a convergence trend, while a positive sign will indicate divergence. Table 6 reports the results of our regressions. Columns (1)-(3) refer to the entire national sample, Columns (4)-(6) to the Southern provinces, and Columns (7)-(9) to the North and the Centre.
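A cross-sectional growth regression of this kind can be sketched as follows. Everything in the block is synthetic: the number of provinces, the "true" coefficients, and the random draws are invented for illustration and do not reproduce Table 6. The sketch uses plain ordinary least squares through NumPy's `lstsq`, and the helper `compound_growth` is a hypothetical name for the endpoint-based growth rate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 69  # assumed number of provinces, purely illustrative

# Synthetic log covariates, in the order: initial WR, then the five drivers.
names = ["WR", "Literacy", "SocCap", "MktPot", "Rail", "Water"]
true_coef = np.array([-0.02, 0.008, 0.0, 0.0, 0.0, 0.0])  # invented values
log_x = rng.normal(0.0, 0.4, size=(n, len(names)))

# Synthetic annual growth rates generated from the assumed model plus noise.
growth = 0.01 + log_x @ true_coef + rng.normal(0.0, 0.001, size=n)

# Build start and end levels so growth can be recomputed from endpoints only.
wr_1871 = np.exp(log_x[:, 0])
wr_1911 = wr_1871 * (1.0 + growth) ** 40

def compound_growth(start, end, years):
    """Average compounded annual growth rate between two observations."""
    return (end / start) ** (1.0 / years) - 1.0

y = compound_growth(wr_1871, wr_1911, 40)

# OLS with a constant: y = a + b*ln(WR) + c*ln(Literacy) + ... + error.
X = np.column_stack([np.ones(n), log_x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
estimates = dict(zip(["const"] + names, coef))

# A negative coefficient on initial WR indicates conditional convergence.
converging = estimates["WR"] < 0
```

The sketch omits standard errors and the sub-sample splits; a fuller replication would add robust inference and run the same specification separately for the South and the North-Centre.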
For the whole period we find that the only consistently significant driver is human capital, as proxied by the literacy rate. In particular, this turns out to be significant in the national and in the North-Centre subsample, but, interestingly enough, not in the South. The coefficient for literacy (0.00807) in the most complete national specification (Column 3) implies that moving in 1871 from the province with the lowest literacy rate (Caltanissetta = 8.3 percent) to the province with the highest literacy rate (Turin = 57.7 percent) would raise the annual growth rate of real wages by about 1.6 percentage points. 25 This is a relatively strong impact, since the average yearly growth rate of real wages across all provinces in the period 1871-1911 was 0.9 percent. The order of magnitude (2 percent per year) of this impact is similar to the effect of literacy on industrial productivity growth estimated by Cappelli (2017, p. 353) using the same thought-experiment. None of the other variables is significant. However, literacy, market potential, water resources, the railway endowment, and social capital were already higher in the Northwest in 1871 and, thus, they may account for the gap in WRs at the time. Our results suggest that, amongst these variables, only literacy fostered subsequent growth in real wages in the Northwest, offsetting a possible disadvantage of high wages in these regions. 26 Overall, these effects are consistent with some of the most recent contributions presented earlier. We also find a negative and significant coefficient on the initial level of wages across all specifications. The implied yearly β-convergence rates (Table 6, bottom row) are in line with those found for unskilled urban workers (from 1.8 to 2.8 percent) and somewhat lower than those for agricultural workers (from 3.4 to 4 percent) in Spain in the same period (Rosés and Sánchez-Alonso 2004).
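The thought-experiment on literacy can be checked with a few lines of arithmetic. The sketch assumes that literacy enters the regression in logs, so that the effect of moving between two provinces is the coefficient times the log of the ratio of their literacy rates; the three numbers are those quoted in the text.

```python
import math

coef_literacy = 0.00807   # literacy coefficient, Column 3 of Table 6
lit_low = 8.3             # Caltanissetta, percent literate in 1871
lit_high = 57.7           # Turin, percent literate in 1871

# Change in the annual growth rate implied by the move, under a log specification.
effect = coef_literacy * math.log(lit_high / lit_low)
effect_pct_points = 100.0 * effect
```

Under this assumption the implied effect is roughly 1.56 percentage points per year, in line with the figure of about 1.6 reported above.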
The results by sub-sample indicate stronger convergence, in the conditional convergence regressions, in the North-Centre than in the South. This difference is also highlighted by Figures 6A (North-Centre sample) and 6B (South sample), which show partial regression plots of growth rates against initial real wages, conditional on the literacy variable (Columns 2, 5, and 8 of Table 6). Overall, these different patterns of convergence within macro-areas result in divergence between North-Centre and South, as already shown in the maps of Figure 3 and in the circles in Figures 4A and 4B.
We interpret this convergence profile as an outcome of short-range domestic migration and we buttress this view with the data shown in Table 7. In Table 7, we compute total migration according to two different hypotheses: (i) a "high emigration hypothesis," which refers to gross migration for the period 1876-1911 (Table 7, Panel A), and (ii) a "low emigration hypothesis," which includes only net (gross less returns for the 1905-1911 period only) migrations (Table 7, Panel B). The former is clearly an upper bound, as it includes emigrants who returned home before 1911 (possibly with repeated rounds of migration) or who died abroad, while the latter is a lower bound since data on net migration are available only for a limited period of time. Mass migration started in the 1880s, but the data before 1904 refer only to gross flows, even though a substantial number of Italians returned home (and were, thus, counted in the 1911 census). 27
In particular, we compute the percentage of Italians living outside the province of birth in 1911 as the sum of people living in another province of the same macro-area (short-range migrations, Table 7, Column 1) and in another macro-area (medium- and long-range migrations, Column 2), as registered by the 1911 Census (MAIC 1914-1916). Thus, Column 3, the sum of Columns 1 and 2, is the percentage of people who migrated within Italy. In contrast, people who emigrated abroad are presented in Column 4. Finally, we obtained Column 5, the percentage of people who, in 1911, lived in their province of birth, as a residual. Table 7 highlights two main points: first, short-range migrations were considerably greater within the Northwest than within any other macro-area, and, second, many more Southerners moved abroad than to other provinces. The census data for 1911 reveal two other important features of the migration process, not reported here because of space constraints. First, about two-thirds of the people born in the Northeast and living in another macro-area were actually residents in the Northwest. Second, in 1911, about 15 percent of the inhabitants of the Northwest were born outside the province of residence. In a nutshell, we observe a dual movement: the early development of an Italian sub-national labor market in the industrializing Northwest (Federico 1985), with some attraction also from the neighboring macro-areas, and a massive integration of the South (but also of the Northeast) with the transatlantic labor market (Hatton and Williamson 1998; Gomellini and Ó Gráda 2013). 28 Overall, this interpretation is consistent with the existence of sigma-convergence in the North and in the Centre and of (modest) sigma-divergence in the South and Islands (possibly due to different rates of foreign migrations across provinces in the South) as shown in Table 4.
26 We cannot compute the annual rate of growth as a time trend of time series because, as mentioned previously, we do not have complete yearly series for the period 1879-1904. Hence, we have computed the growth rate using the initial and final observations. As a robustness check, in a non-reported exercise, we have run the same regressions using the average WRs of the years 1869-1873 and 1909-1913 as initial and final observations, rather than the yearly values of 1871 and 1911, obtaining substantially similar results. We have also estimated spatial autoregressive and spatial error specifications (using both a "neighboring provinces" and a distance-based version of the spatial weight matrix), finding results very similar to those of Table 6 in terms of size and significance of the coefficients. These results are available on request.
27 The ratio net/gross migration in 1905-1911 ranged from 94 percent in the Northeast to 65 percent in the South, with a nationwide average of 79 percent.
28 In a recent paper, Spitzer and Zimran (2018), using height data, have shown that Italian migration to the United States was negatively selected at the national level, but positively selected at the provincial level. This is broadly consistent with our data showing relatively lower WRs in provinces/regions with large migration flows to the United States.
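The construction of the columns of Table 7 described above can be sketched with made-up shares. The numbers below are hypothetical and are not taken from Table 7; the sketch also assumes that all columns are expressed as percentages of the same reference population, which the text does not state explicitly.

```python
# Hypothetical shares for one macro-area, in percent of the reference population.
short_range = 6.0   # Column 1: living in another province of the same macro-area
long_range = 3.0    # Column 2: living in another macro-area
abroad = 12.0       # Column 4: emigrated abroad (gross or net, depending on the panel)

# Column 3: total migration within Italy.
within_italy = short_range + long_range

# Column 5: residual share still living in the province of birth in 1911.
stayers = 100.0 - within_italy - abroad
```

Running the same calculation with gross emigration (Panel A) and net emigration (Panel B) in Column 4 would give the lower and upper bounds for the share of stayers, respectively.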

CONCLUSIONS
In this article, we estimate real wages in Italy at provincial level from the Unification to WWI using the internationally comparable method of Allen and associates (Allen 2001; Allen et al. 2011). We can sum up our results in the following points.
First, in the Liberal Age Italy was quite poor from an international comparative perspective. The modest growth of real wages since the 1880s was sufficient for Italy to converge to and then overtake other less developed countries, while the gap with Northwestern Europe continued to widen until WWI. According to the simplified bare bones basket, the peak in the WR for the whole period, reached just before the war (1.97 for Italy or 2.32 for Milan), corresponds to only 25 percent of London real wages in the same period.
Second, at the time of the Unification, the continental South was poorer than the North, and the gap with the industrializing Northwest went on growing until the beginning of the twentieth century.
Third, the long-run increase in WR reflected mainly the growth of nominal wages, which was dampened by the growth of prices in the 1860s and 1870s and again from the mid-1890s. In contrast, the decline in world prices (mostly of cereals) accounted for most of the small improvements in the 1880s and early 1890s.
Finally, we explore the drivers of the regional trends in WR using a growth regression framework. We find a general convergence trend that is particularly strong in the North and in the Centre and is probably related to domestic migrations. Human capital formation, measured by the literacy rate, had a strong positive effect on the growth of real wages and, thus, arguably, on the process of economic growth.