Urban Agglomeration Effects in India: Evidence from Town-Level Data

Combining multiple data sets for India, we estimate the elasticity of wages with respect to town population and density between 1% and 2%, which is smaller than estimates in the literature based on district-level analysis. We also find that the employment share of firms with 10 or more workers—which typically describes firms that operate in the formal sector—is positively associated with city population and negatively associated with city density. Town characteristics such as infrastructure availability, geographic location, educational services, and industrial structure also play a role in explaining city productivity and the presence of relatively large firms. Overall, we interpret our results to suggest that there is scope to realize more fully urbanization's potential by addressing issues related to urban planning, infrastructure, and public service delivery, as has been emphasized previously by observers of Indian urbanization.


I. Introduction
Like other developing economies, India is urbanizing. According to census data, the share of India's population residing in urban areas increased from 20% in 1971 to 31% in 2011. Expectations are that the process of urbanization in India will continue, if not accelerate. With cities widely believed to be "engines of growth," urbanization represents an important source of prosperity for India. This is especially true given the scope for further urbanization.
However, the extent to which India's urbanization will play the positive role that is expected of it is unclear. Ahluwalia, Kanbur, and Mohanty (2014) note that in comparison to other fast-growing economies in Asia, urbanization in India has been relatively slow, largely unplanned, and characterized by underinvestment in urban infrastructure and public service delivery. By way of comparison, the McKinsey Global Institute (2010) reports that India's annual capital spending in urban areas is only $17 per capita, which compares unfavorably with annual capital spending in the People's Republic of China of $116 per capita. Generally, and common to the situation in many developing economies, there are various aspects of urbanization in India that may act as a brake on economic growth. For example, Duranton (2015) notes that cities in developing economies tend to be much less functionally specialized than cities in advanced economies. Large cities in developing economies tend to be characterized by many ancillary activities that could be undertaken in smaller cities. This adds to urban congestion and detracts from the benefits that urban agglomeration is expected to bring.
In this paper, we use data from various sources, including India's Economic Census of 2005 (EC 2005) and the Labor Force Survey (LFS) of 2004-2005, to shed light on whether Indian cities are functioning as engines of growth. We do this by examining the effects of urban agglomeration on proxies for worker and firm productivity at the town level. The focus on towns as the unit of analysis distinguishes our work from previous literature on India such as Chauvin et al. (2017) and Lall, Shalizi, and Deichmann (2004), who examine urban agglomeration issues at the district level, which is a higher level of geographic aggregation.
We analyze how measures of city-level average (nominal) wages and the share of employment in formal or modern firms (defined as firms with 10 or more employees) are related to measures of urban agglomeration such as population and density. In addition to instrumenting for urban agglomeration, we control for a variety of city characteristics that capture natural amenities, infrastructure availability, geographic location, and educational services in our agglomeration regressions. We also examine how city wages and the prevalence of formal firms relate to the industrial structure of cities; for instance, whether cities with a more diversified industrial structure or a larger share of employment generated through manufacturing activities have higher wages and more formal firms.
While our use of nominal wages is typical in the empirical literature on the effects of urban agglomeration (nominal wages are reasonably good indicators of labor productivity, whereas real wages better capture standards of living), the specific measure of city-level wages is less typical. Our city-level wages are derived from information on the sectoral composition of employment across towns and average wages by industry and district. Thus, as a robustness check on our results, we use an approach first employed by Ciccone and Hall (1996) in which district-level wages are related to an index of district-level density that is a nonlinear function of the density of constituent cities.
With regard to our analysis of urban agglomeration based on the share of employment in larger firms, firms with 10 or more employees typically (i) must comply with rules governing industrial labor and workplace safety, (ii) comprise the formal sector, and (iii) pay better wages and are more productive (Asian Development Bank 2009). If economies of agglomeration are important, one would expect to see a greater share of total employment in such firms.
Our analysis indicates that the elasticity of wages with respect to both city population and density is between 1% and 2%, which is smaller than existing estimates as well as estimates from our replication exercise based on analysis at the district level. Our estimates for India are further supported by the results derived from an application of the Ciccone and Hall (1996) approach. We find that a larger employment share in formal firms is positively associated with city population but negatively associated with city density, possibly reflecting the congestion-related effects of higher density. Town characteristics such as natural amenities, infrastructure availability, geographic location, educational services, and industrial structure also play a role in explaining city productivity and the presence of formal firms. Overall, we interpret our results to suggest that urbanization does hold promise for promoting growth and good jobs in India. However, there is scope to further realize urbanization's potential by addressing issues related to urban planning, infrastructure, and public service delivery as emphasized by Ahluwalia, Kanbur, and Mohanty (2014).

II. Literature Review
Rosenthal and Strange (2004) and Combes and Gobillon (2015) provide comprehensive surveys of studies that estimate agglomeration effects in various economies and regions. Generally, the elasticity of productivity, whether measured by wages or total factor productivity, falls between 1% and 10% with respect to city population or density. According to Combes and Gobillon (2015), studies using city-level productivity measures yield higher elasticity estimates (4%-7%) than those using individual data, which typically reach about 2%. Melo, Graham, and Noland (2009) undertake a meta-analysis of the empirical estimation of agglomeration effects and conclude that the results are highly context specific, depending on factors such as the economy and industries studied and the controls used for unobserved heterogeneity.
Among the large number of empirical studies on the topic, only a small share look at developing economies. Thus, evidence from the developing world is lacking (Duranton 2015). Lall, Shalizi, and Deichmann (2004) use plant-level data to estimate the effect of district urban density on firm productivity in India. Among the nine manufacturing industries examined, urban population density has a significantly positive effect only on the manufacturing of cotton textiles, with a point estimate of about 9%. The effects on other industries are not statistically significant and some are even negative.
More recently, Chauvin et al. (2017) estimate the elasticity of individual income (of prime-age males) to urban population and density at the district level, and instrument population and density with historical population and density in 1980 and 1951, respectively. When nominal income is examined, most estimates of elasticity are around 7%-8%, with some instrumental variable (IV) estimates of the population effect being fairly large. In the case of real income, the estimates are essentially the same at around 6%. Again, urban density demonstrates robustness, while IV estimation generates some unreasonably large effects for population.
Most studies on urban economies in India treat districts, the second administrative level after states, as the urban unit (see, for example, Lall, Shalizi, and Deichmann 2004; Ghani, Kanbur, and O'Connell 2013; Chauvin et al. 2017). This is because most available data from labor and enterprise surveys contain geo-information only down to the district level. However, Indian districts often cover large rural areas and many contain multiple geographically independent urban areas. A more appropriate definition of a city in India is the town, which is the administrative division below the district level and whose rural counterpart is the village. Ciccone and Hall (1996) show that state-level average density in the United States has no effect on state-level labor productivity once heterogeneity in density within states is accounted for. Therefore, we consider examining agglomeration economies at the town level in India to be one of the main contributions of this paper.

III. Data
This paper combines cross-sectional, establishment-level data from India's 2005 Economic Census; town-level data from the Town Directory of the 2001 Population Census; and individual-level survey data from the 2004-2005 Employment-Unemployment Survey to explore agglomeration effects across cities in India.
The economic census, which was conducted by the Central Statistics Office of the Ministry of Statistics and Programme Implementation, is a countrywide census of establishments engaged in all economic activities except crop production and plantations. The key purpose of the economic census is to provide a sampling frame for follow-up sample surveys intended to collect more detailed sector-specific information on the nonagricultural economy. In this study, we employ the recently released fifth edition of the Economic Census covering the year 2005. The data allow for the geographic location of establishments to be identified at the town level, which is an administrative level below the state and district levels and is equivalent to a city in the Indian context. Establishment-level information in the EC 2005 includes number of employees, major activity in terms of a four-digit industrial classification, type of fuel used, registration status with government authorities, and type of ownership. Approximately 17 million establishments surveyed were recorded under the four-digit 2004 National Industrial Classification. The 304 unique four-digit categories can be simplified into 59 two-digit aggregates. The analysis put forth in this paper includes all industries after reclassifying them further into 13 major industrial categories. Table A.1 presents the reclassification of the two-digit National Industrial Classification into 13 categories. Table A.2 reports the distribution of all firms as well as firms with 10 or more employees among these categories.
We also employ the Town Directory of the 2001 Population Census to introduce our town-level agglomeration measures, that is, population and density. This survey contains rich information on geography and climatic amenities, demography, infrastructure provision, government revenues and expenditures, and social and educational services available at the town level. This is the only source we are aware of that has town-level population and land area data that is close to 2005. In addition, we use the town-level characteristics as explanatory variables in our agglomeration regressions to examine how urban productivity varies with city amenities, infrastructure, and access, among other characteristics.
Our final data set allows us to work with about 2,800 Indian cities, each with a population of at least 10,000 in 2001, from around 560 districts. Table 1 presents the distribution of our cities by classification and population. As defined by the Census of India, towns are classified as either census towns or statutory towns, while an urban agglomeration is a construct where contiguous urban areas are included as part of a town for urban planning purposes. Urban agglomerations must consist of at least a statutory town and its total population (all constituents combined) should not be less than 20,000. Both towns and urban agglomerations are treated equally in our analysis and referred to as cities.
Footnote 4: The Appendix describes our efforts to match towns in the Town Directory of the 2001 Population Census with the EC 2005, and defines our town- and district-level variables.
Footnote 5: Census towns are administrative units satisfying the following three criteria simultaneously: (i) minimum population of 5,000; (ii) 75% or more of the male working-age population engaged in nonagricultural pursuits; and (iii) population density of at least 400 persons per square kilometer. Statutory towns are defined as urban-like areas with a municipal corporation, municipality, cantonment board, notified town area committee, or town council.
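The census definitions above amount to simple threshold rules. The following is a minimal sketch of those rules as stated in the text; the function and argument names are hypothetical and are not part of any census data set or of the paper's own code.

```python
def is_census_town(population, male_nonfarm_share, density_per_sq_km):
    """Check the three census-town criteria, which must hold simultaneously.

    male_nonfarm_share is the fraction (0 to 1) of the male working-age
    population engaged in nonagricultural pursuits.
    """
    return (
        population >= 5_000
        and male_nonfarm_share >= 0.75
        and density_per_sq_km >= 400
    )


def meets_urban_agglomeration_minimum(has_statutory_town, total_population):
    """Check the two stated requirements for an urban agglomeration."""
    return has_statutory_town and total_population >= 20_000


# Illustrative (made-up) settlements.
print(is_census_town(6_200, 0.81, 950))                 # all three criteria met
print(is_census_town(6_200, 0.60, 950))                 # fails criterion (ii)
print(meets_urban_agglomeration_minimum(True, 18_000))  # below the 20,000 floor
```

Statutory towns are defined administratively rather than by thresholds, so they are not modeled here.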
The majority (86.1%) of cities in India had a population of less than 100,000 in 2001. There were only 73 cities in India, or 2.6% of the total, at the other end of the spectrum with a population of 500,000 or more. Overall, urban agglomerations accounted for 13.5% of all cities in the data and are disproportionately more prevalent in cities with a population of 100,000 or more.
Table 2 reports the summary statistics for our sample. The average city had a population of about 96,000 and India's urban population amounted to 270 million in 2001. The figures suggest India was still at an early stage of urbanization in 2001 given its total population of 1.06 billion at the time. However, Indian cities were larger, denser, and more numerous in 2001 than in 1981 when the average city size was about 59,000 people and the average density was 3,206 people per square kilometer. These figures would increase by 64% and 57%, respectively, over the next 20 years. We also see considerable variation across cities in terms of both population and density. Table 2 also presents city-level measures of climate, infrastructure provision, and educational services. The diversity of these characteristics across Indian cities is noteworthy. For instance, the highest annual maximum temperature is three times that of the lowest, and the highest average rainfall is 300 times the lowest. There are towns without any electricity connections, paved roads, or educational institutions, while an average town has 19,494 electricity connections, 74 kilometers of paved roads, and 2 educational institutions.

IV. Methodology
As is common in the literature, we estimate agglomeration benefits using data on wages, a proxy for productivity, and test whether wages increase with a city's population and density. Ideally, we would like to estimate an agglomeration regression of individual wages on city size, individual characteristics, and city characteristics. In such a regression, $y_{ic}$ is the wage of person $i$ living in city $c$; $S_c$ is the population of the city, which captures the concept of scale, or some other measure of agglomeration such as density (population divided by the area of the city); $X_i$ is a vector of individual characteristics; $Z_c$ is a vector of city characteristics; and $\varepsilon_{ic}$ is the error term. The coefficient on city size, $\beta$, captures the elasticity of wages with respect to city population (or density), which is our main interest.
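To illustrate why the coefficient on city size is read as an elasticity, the sketch below simulates a log-log wage regression and recovers the slope by OLS. It assumes the simplest functional form (log wage linear in log city size) and omits the individual and city controls; the data are simulated and every name and parameter value is illustrative rather than taken from the paper.

```python
import numpy as np


def estimate_elasticity(log_wage, log_size):
    """OLS slope of log wage on log city size, with an intercept."""
    design = np.column_stack([np.ones_like(log_size), log_size])
    coefs, *_ = np.linalg.lstsq(design, log_wage, rcond=None)
    return coefs[1]


rng = np.random.default_rng(0)
n_workers = 5_000
true_beta = 0.02  # assumed elasticity used to generate the data

# Simulated log city population and log wages (no controls, for brevity).
log_size = rng.normal(11.0, 1.0, n_workers)
log_wage = 4.0 + true_beta * log_size + rng.normal(0.0, 0.05, n_workers)

beta_hat = estimate_elasticity(log_wage, log_size)
print(f"estimated elasticity: {beta_hat:.4f}")
```

Because both sides are in logs, the slope gives the percentage change in wages associated with a 1% change in city size.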
The main difficulty with estimating this agglomeration regression in the Indian context is data related. In particular, individual wage information from the LFS only identifies the district and state of the respondent, as well as an urban or rural location. It is possible to use the urban component of Indian districts as the unit of analysis to estimate agglomeration economies, as done recently by Chauvin et al. (2017), so that individual-level wages are regressed on district-level measures of agglomeration (district population or density). However, this is less than ideal as districts may not be well-defined economic units over which economies of agglomeration may be operating. A district in India can cover between 1 and 22 cities and towns, and these can be spread over an area covering thousands of square kilometers.
To work at a finer geographic unit, several options are available. One approach is to work along the lines of Ciccone and Hall (1996). This would entail working with average wages at the district level and regressing these on an index of the density of a district that is defined in terms of the density of its constituent towns. We consider this approach below. A second alternative is to examine if the available data allow one to construct a measure of city-level average wages and then use this measure to see how it varies with city characteristics.
How can the latter be done? While we can observe the scale and density of each town, as well as the industrial composition, thanks to the population and economic censuses, respectively, we do not have a direct measure of wages, the commonly used proxy for productivity in the agglomeration literature. On the other hand, we have data from the LFS carried out around the time of the EC 2005, which contains information on individual earnings and the industry and district of employment. We combine these two sources of information to construct a measure of wages at the town level in the manner described below.
Assume that a district $d$ consists of multiple towns $c = 1, 2, \ldots, C_d$. For an individual $i$ in industry $j$ and town $c$, the wage is given by equation (1), in which $\delta_c$ is a local effect for town $c$, which we do not observe directly, and $\varepsilon_i$ is an independent and identically distributed error term. Let $N_c \equiv \sum_j n_{jc}$ be the total number of firms in town $c$; that is, $N_c$ sums the number of firms in each industry $j$ in town $c$, $n_{jc}$, across all industries. The total number of firms in industry $j$ in district $d$ is $n_{jd} = \sum_c n_{jc}$. We can also define the share of district $d$'s industry-$j$ firms located in town $c$, $s_{jc} = n_{jc}/n_{jd}$. Given equation (1), the mean wage in industry $j$ of district $d$, $y_{jd}$, is given by equation (2). We define the average wage of town $c$ as
$$y_c \equiv \frac{\sum_j n_{jc}\, y_{jd}}{\sum_j n_{jc}} = \frac{\sum_j n_{jc}\, y_{jd}}{N_c}. \tag{3}$$
Plugging equation (2) into equation (3) and altering the notation a bit, we obtain equation (4), which relates $y_c$ to the local effects, so that the average wage of town $c$, computed using the district average wage, is a linear function of the local effects of both town $c$ and other towns in the same district.
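The town-level wage defined in equation (3) is a firm-count-weighted average of district-industry wages. A minimal sketch of that calculation follows; the industry names, firm counts, and wage levels are made up for illustration, and the function name is hypothetical.

```python
def town_average_wage(firm_counts, district_industry_wage):
    """Firm-count-weighted average of district-industry wages for one town.

    firm_counts maps industry -> number of firms in the town (n_jc).
    district_industry_wage maps industry -> district average wage (y_jd).
    """
    total_firms = sum(firm_counts.values())  # N_c
    if total_firms == 0:
        raise ValueError("town has no firms")
    weighted_sum = sum(
        n_firms * district_industry_wage[industry]
        for industry, n_firms in firm_counts.items()
    )
    return weighted_sum / total_firms


# Two illustrative towns in the same district, so they share y_jd.
district_wage = {"manufacturing": 100.0, "retail": 60.0}
town_a = {"manufacturing": 3, "retail": 1}
town_b = {"manufacturing": 1, "retail": 3}

wage_a = town_average_wage(town_a, district_wage)
wage_b = town_average_wage(town_b, district_wage)
print(wage_a, wage_b)
```

Towns in the same district differ in this measure only through their industry mix, which is why the derived wage reflects composition rather than directly observed town wages.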
Suppose the agglomeration regression we wanted to estimate is $\delta_c = \beta \log S_c + \mu_c$, where $S_c$ is either town population or density. However, $\delta_c$ is not observed directly. Instead, we can estimate a regression that corresponds to the transformation of $y_c$ in equation (4).
To implement the procedure described above, we first estimate a wage equation using the LFS data for 2004-2005. In this equation, $y$ is the daily wage; $e$, $X$, and $G$ are years of schooling, labor market experience, and gender, respectively; and $\gamma$ and $\varphi$ are industry and district fixed effects, respectively. The average (log) wage of district $d$ and industry $j$, $\ln y_{jd}$, is then computed using the estimated coefficients and fixed effects from the wage equation and the national average of individual characteristics. Plugging $y_{jd} = \exp(\ln y_{jd})$ into equation (3) gives us the town-level wage, $y_c$. Table 2 reports that the average $y_c$ is Rs83 per day with a standard deviation of Rs20 across cities.
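The prediction step can be sketched as follows. This assumes, purely for illustration, that the log wage equation is linear in schooling, experience, and a gender indicator plus additive industry and district fixed effects; the exact specification is not shown in the text, and all coefficient values and names below are hypothetical.

```python
import math


def predicted_district_industry_wage(coefs, national_means,
                                     industry_effect, district_effect):
    """Predict y_jd at national-average worker characteristics.

    coefs holds a constant and one slope per characteristic (assumed
    linear form); the two fixed effects enter additively in logs.
    """
    log_wage = coefs["const"] + industry_effect + district_effect
    for name, mean_value in national_means.items():
        log_wage += coefs[name] * mean_value
    return math.exp(log_wage)


# Hypothetical coefficient estimates and national averages.
coefs = {"const": 3.5, "schooling": 0.06, "experience": 0.01, "female": -0.3}
national_means = {"schooling": 6.0, "experience": 20.0, "female": 0.2}

wage_jd = predicted_district_industry_wage(
    coefs, national_means, industry_effect=0.10, district_effect=0.05
)
print(f"predicted daily wage: {wage_jd:.2f}")
```

Holding worker characteristics at their national averages means that differences in the predicted wage across district-industry cells come only from the fixed effects, not from differences in who works there.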
The agglomeration model we estimate is equation (6), in which $\beta$ measures the elasticity of average wages with respect to the population or density of a city. As $y_{jd}$ is estimated using the national average characteristics of workers, the estimation of equation (6) is unlikely to be biased by more productive workers selecting into large cities. However, an ordinary least squares (OLS) estimation of $\beta$ is still subject to endogeneity arising when higher wages attract more workers or in cases where city characteristics are missing (Combes, Duranton, and Gobillon 2011). We address these issues by augmenting the regression model with city characteristics such as infrastructure availability, distance to the state capital, education facilities, climate amenities, and district dummies; and by instrumenting city population and density, $\log S_c$, with historical population and density, respectively.
In addition to derived town-level wages, we also examine the effects of city scale on an alternative outcome that can be directly measured at the town level. The prevalence of modern or formal firms is one such outcome. Manufacturing firms with 10 or more workers (that use electrical power in the production process) are treated as formal sector firms in India as per the Factories Act, 1948. Generally, it seems safe to assume that India's more dynamic and productive firms tend to be larger. They certainly pay higher wages (see Hasan et al. 2017 for more information on the relationship between establishment size and wages in the Indian apparel industry). The EC 2005 data show that 75% of Indian firms have two or fewer employees and that 98% of Indian firms have 10 or fewer employees. Thus, a majority of enterprises seem to be self-employment ventures that serve as a means of subsistence for families.
In view of this, we consider the share of a city's total employment by firms with 10 or more employees as a proxy for the productivity of a city and investigate how this relates to city size and density. Table 2 shows that on average large firms account for 22% of city employment, with a minimum share of 0% and a maximum share of 97%.
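The employment share used as the alternative outcome can be computed directly from establishment sizes. A minimal sketch, assuming each town is represented by a list of employee counts per establishment (a hypothetical layout, not the EC 2005 file format):

```python
def formal_employment_share(firm_sizes, threshold=10):
    """Share of a town's employment in firms with at least `threshold` workers."""
    total_employment = sum(firm_sizes)
    if total_employment == 0:
        return 0.0  # convention chosen here for towns with no recorded employment
    formal_employment = sum(size for size in firm_sizes if size >= threshold)
    return formal_employment / total_employment


# Illustrative town: three microenterprises and one firm with 15 workers.
share = formal_employment_share([1, 2, 2, 15])
print(share)
```

Note that the measure weights firms by employment, so a town dominated numerically by microenterprises can still have a high share if a few large firms employ most workers.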

A. The Effects of Agglomeration on Wages: District-Level Analysis
In column (1) of Table 3, we report an exercise that replicates Chauvin et al. (2017) to estimate agglomeration effects at the district level. Chauvin et al. (2017) estimate regressions using individual wage data for male workers from districts with an urban population of 100,000 or more, controlling for worker age and level of education. We mimic their sample selection and model specification closely, extending the model to all workers and controlling for additional individual and district characteristics.
As columns (2) and (4) in Table 3 show, when we adopt the same model specification, our agglomeration effects obtained from OLS and IV estimations using 1981 data are qualitatively consistent with those in Chauvin et al. (2017) for both male workers and the sample of all workers. However, when we add control variables to the regressions such as individual variables (e.g., labor market experience), industry dummies, and district characteristics (e.g., climate and infrastructure availability) in columns (3) and (5), many of our estimates turn insignificant or even negative.
Footnote 9: The correlation between the employment share variable and the derived average town wages is 0.21 and is significant at the 1% level.
Footnote 10: We also worked with 1951 data to construct our instrumental variables. However, there are cities and towns with missing population data from 1951 that tend to be smaller (population of between 10,000 and less than 100,000). Since estimates of agglomeration effects could be sensitive to samples with relatively fewer small towns and cities, we dropped 1951 data from the IV construction.
Notes to Table 3: Columns (2) and (3) are samples restricted to male workers aged 25-55 from districts with an urban population above 100,000. Columns (4) and (5) are samples restricted to districts with an urban population above 100,000 and include both genders aged 15-65. Columns (2) and (4) control for state dummies, individual age, and educational attainment by categories.
We also estimate individual wage regressions at the district level separately for manufacturing and service sectors. This exercise allows us to check whether agglomeration effects are the same across economic sectors. The results presented in Table A.4 show that agglomeration effects are stronger for services than for manufacturing. However, similar to the aggregate estimates, both sector-specific estimates are sensitive to additional controls at the district level. In summary, we are able to replicate the main results found in the recent literature by using the same model specification, which suggests that there are significant positive agglomeration effects at the district level in India. Meanwhile, the estimated effects seem quite sensitive to the model specification, which may be partly due to the fact that districts are not well-defined economic units in India.

B. The Effects of Agglomeration on Wages: City-Level Analysis
Table 4 presents OLS estimates of the elasticities of wages with respect to city population and population density, applying the methodology described in section IV. Columns (1) and (3) show that, when town characteristics are not controlled for, the estimated elasticity is 3.3%-3.5%; that is, a 10% increase in town population or urban density is associated with a 0.33%-0.35% increase in a town's average wages. Both estimates are statistically significant at the 1% level. While the estimates fall in the broad range of agglomeration effects found in the literature, they are about half the size estimated by Chauvin et al. (2017) for India. One possible driver for the difference could be the different geographic scales examined: a town in our case and a district in that of Chauvin et al. (2017).
Columns (2) and (4) present models controlling for town characteristics. We use data on the annual minimum and maximum temperatures and average rainfall to measure climatic amenities; electricity connections and paved road length to measure infrastructure availability; distance to state headquarters as a proxy for market access; and the number of educational institutions as a proxy for opportunities to accumulate human capital. The estimated coefficients of log population and log density drop to 2.6%-2.7% when these town characteristics are included in the models, suggesting that there are city features that are correlated with productivity and attract more workers to a city. In addition to addressing the endogeneity concern about omitted city characteristics, the control variables themselves are of interest in understanding urban agglomeration. The results show that road length and number of educational institutions have positive and statistically significant effects on urban wages, and the farther away a town is located from the state headquarters, the lower the urban wage. To the extent that the availability of roads and distance to the state center proxy for a town's market access, the results are consistent with the idea that better market access increases firm productivity through higher demand for outputs and cheaper intermediate inputs. The results support the view that local educational resources play a positive role in raising citywide productivity. The results also show that city wages are positively associated with more desirable climates (e.g., higher annual minimum temperature, lower annual maximum temperature, and lower average rainfall). However, none of the correlations with climate are statistically significant.
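To make the magnitudes concrete, the sketch below converts an elasticity into the wage change implied by a given change in city size, using the constant-elasticity relation in which wages scale with city size raised to the power of the elasticity. The elasticities of 0.033 and 0.026 are the OLS values quoted in the text; the function name is hypothetical.

```python
def wage_change(elasticity, size_growth):
    """Proportional wage change implied by a proportional change in city size.

    Under a constant elasticity, wage is proportional to size ** elasticity,
    so growth g in size multiplies the wage by (1 + g) ** elasticity.
    """
    return (1.0 + size_growth) ** elasticity - 1.0


for elasticity in (0.033, 0.026):
    ten_percent = wage_change(elasticity, 0.10)
    doubling = wage_change(elasticity, 1.00)
    print(
        f"elasticity {elasticity:.3f}: "
        f"+10% size -> {100 * ten_percent:.2f}% wage, "
        f"doubling -> {100 * doubling:.2f}% wage"
    )
```

For small changes the result is close to the elasticity times the percentage change in size, which is the usual back-of-the-envelope reading.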
To further mitigate the endogeneity of city population and density, we instrument these variables with their corresponding values in 1981. Table 5 reports the estimated elasticities of wages with respect to city population and density using instruments based on 1981 data. The upper panel presents the first stage estimates and the lower panel presents the two-stage least squares estimates. Again, we show models with and without town characteristics.
First, the instrumental variables are powerful and robust in explaining contemporaneous city population or density. The IV coefficients are quite stable at around 0.72 regardless of whether it is population or density being instrumented, and whether town characteristics are included or not.
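The two-stage procedure can be sketched on simulated data as follows. The data-generating process is an assumption made for illustration (an unobserved factor raises both current city size and wages, and the first-stage slope is set to 0.72 only to echo the value reported above); none of the numbers are estimates from the paper, and standard errors are not computed.

```python
import numpy as np


def ols_slope(y, x):
    """Slope from an OLS regression of y on x with an intercept."""
    design = np.column_stack([np.ones_like(x), x])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1]


def two_stage_least_squares(y, x, z):
    """Return (first-stage slope, 2SLS slope) with a single instrument z."""
    instruments = np.column_stack([np.ones_like(z), z])
    first_stage, *_ = np.linalg.lstsq(instruments, x, rcond=None)
    x_fitted = instruments @ first_stage  # part of x explained by z
    return first_stage[1], ols_slope(y, x_fitted)


rng = np.random.default_rng(1)
n_towns = 5_000
true_beta = 0.02

log_size_1981 = rng.normal(10.5, 1.0, n_towns)  # instrument
unobserved = rng.normal(0.0, 1.0, n_towns)      # omitted town factor
log_size_2001 = (3.0 + 0.72 * log_size_1981 + 0.5 * unobserved
                 + rng.normal(0.0, 0.1, n_towns))
log_wage = (4.0 + true_beta * log_size_2001 + 0.1 * unobserved
            + rng.normal(0.0, 0.05, n_towns))

ols_beta = ols_slope(log_wage, log_size_2001)
first_stage_coef, iv_beta = two_stage_least_squares(
    log_wage, log_size_2001, log_size_1981
)
print(f"first stage: {first_stage_coef:.3f}, "
      f"OLS: {ols_beta:.3f}, 2SLS: {iv_beta:.3f}")
```

In this simulation OLS overstates the elasticity because the omitted factor moves city size and wages together, while the instrumented estimate is close to the value used to generate the data. The exercise relies on the historical size affecting wages only through current size, which is the usual exclusion assumption.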
The two-stage least squares estimates of the agglomeration effects are not statistically significant as the estimated standard errors are double those of the OLS estimation. Considering coefficient estimates only, the elasticity of wages to population or density drops by about 1 percentage point to around 2% without controlling for town characteristics. When city characteristics that may be correlated with both city scale and productivity are taken into account, the point estimates of the wage elasticity decrease further to 1.4% for population and about 1% for density. In contrast to the results in Chauvin et al. (2017), our IV estimates suggest that the agglomeration benefits in India seem to be quite small and not statistically distinguishable from zero. As far as the estimates of city characteristics are concerned, the coefficients of paved road length, distance to the state headquarters, and number of educational institutions remain qualitatively the same in the IV estimated models as in the OLS estimation. More paved roads and educational institutions are associated with higher wages, and being located farther from state headquarters is associated with lower wages.
Footnote 11: We also added (log) town area to the models and found (i) area has a positive but smaller effect than that of density by one order of magnitude, and (ii) adding area has little impact on the coefficient estimates of density.
Footnote 12: Using 1951 population or density as instrumental variables yields zero or negative elasticity of wages to city scale. (These results are available upon request.) This may be due to the fact that one-third of towns in the Town Directory of the 2001 Population Census have 1951 population missing. This is less of an issue for 1981 since the number of missing values is much smaller. Hence, we focus on results with instruments using data from 1981.
We also include town-level measures of industrial diversity and specialization, share of manufacturing employment, and share of employment from firms with 10 or more employees as control variables and present the results in Table 6. Using OLS, the elasticities of wages to city population and density are estimated at 2.4% and 2.8%, respectively. The IV estimate of wage elasticity to city population is slightly lower at 1.8% and statistically significant at the 5% level; the wage elasticity to city density is around 2%, but is not statistically significant.
All four city-level industry-related variables added to the models are positively correlated with average town wages and are statistically significant, except the share of manufacturing. In other words, town-level wages are higher when towns are both more diversified and specialized, have larger manufacturing sectors, and host relatively larger firms. The estimated coefficients of city characteristics do not change qualitatively although they are often smaller in magnitude and weaker in terms of statistical significance. This may be because the added town-level industrial variables pick up the effects of these city characteristics.

C. Alternative Approach
We have argued that urban agglomeration effects should be assessed at the city level because districts are not the proper unit for defining cities in India. Since we do not observe wages at the firm or city level in the EC 2005 data, we construct the city-level wage variable based on the average district-industry wages and the distribution of industries across towns and sectors. Our estimation shows that Indian cities have generated limited agglomeration effects that are smaller than the estimates obtained at the district level.
Ciccone and Hall (1996) offer another approach to address the problem that a measure of productivity such as wages may be missing at the city level. They are interested in examining the extent to which agglomeration economies at the county level in the United States may be driving state-level gross domestic product per worker. Essentially, they construct a density index at the state level as a nonlinear function of the density of constituent counties. Recognizing that the state-county relationship is similar to our district-city relationship, we follow their approach and regress an aggregate productivity measure on the education of the labor force and a density index. In this regression, $y_d$ is an aggregate productivity measure (the average district wage in our case), $e_d$ is the aggregate education of the labor force, and $D_d(\theta)$ is the density index, a function of $N_d$, the employment in district $d$; $a_c$, the area of constituent city $c$; and $D_c$, the density of city $c$. The key parameter to be estimated is $\theta$, with $\theta > 1$ implying a positive agglomeration effect. Note that the function $D_d(\theta)$ is distinct from the product of $\theta$ and the log of the linear average of city density, which is commonly used in district-level agglomeration models. We apply the Ciccone and Hall (1996) approach and obtain $\theta = 0.94$ with a standard error of 0.031. This suggests that there are unlikely to be appreciable agglomeration effects among Indian cities. We consider this result as being supportive of our results reported in section V.
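The exact formula for the density index is not reproduced in the text. The sketch below assumes it takes the form used in Ciccone and Hall (1996), in which each constituent unit contributes its area times its density raised to the power θ, and the sum is divided by aggregate employment. That functional form, the function name, and the numbers are assumptions made for illustration only.

```python
def density_index(areas, densities, district_employment, theta):
    """Assumed Ciccone-Hall-style index: sum_c a_c * D_c**theta / N_d."""
    if district_employment <= 0:
        raise ValueError("district employment must be positive")
    total = sum(a * d ** theta for a, d in zip(areas, densities))
    return total / district_employment


# Illustrative district with one dense small city and one sparse large town.
areas = [10.0, 40.0]          # square kilometers
densities = [5000.0, 1000.0]  # workers per square kilometer
employment = sum(a * d for a, d in zip(areas, densities))  # N_d = 90,000

for theta in (0.94, 1.0, 1.05):
    value = density_index(areas, densities, employment, theta)
    print(f"theta = {theta}: index = {value:.3f}")
```

Under this form the index equals 1 when θ = 1, exceeds 1 when θ > 1, and gives more weight to dense cities as θ rises, which is why it differs from simply scaling the log of average density by θ.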

D. Employment Shares of Formal Firms
As noted in the introduction and in section IV, we also estimate how employment shares of firms above a certain scale vary in response to city population and density. The main rationale is that these firms are more productive than microenterprises and hence the share of employment accounted for by these firms is a proxy for the average productivity of the city. As noted earlier, we choose 10 employees as the cutoff point to define scale, in line with various regulations governing the industrial workplace and trade union activity, and calculate the share of total employment accounted for by firms with 10 or more employees for each town. Table 7 presents OLS estimates of the coefficients of log city population and density on the employment share of firms with 10 or more employees. First, city population and density show distinct effects on the dependent variable. When town characteristics are not controlled for, urban population has a significantly positive impact on the employment share, while urban density has a small positive effect that is not distinguishable from zero. When city size doubles, the employment share of firms with 10 or more employees increases by 2.9 percentage points, or by about 13% given that the average share is 22%. The impact of city size falls to about 1% and is statistically insignificant when town characteristics are controlled for in the model. On the other hand, density has a negative and statistically significant effect on the presence of the relatively large firms when town characteristics are accounted for. When urban density doubles, the employment share of the relatively large firms decreases by nearly 1 percentage point.
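The dependent variable and the percentage-point arithmetic above can be illustrated with a short sketch. It is not the authors' code: the function names and the five-firm town are hypothetical, and only the 2.9 percentage point and 22% figures come from the text.

```python
from typing import Iterable


def formal_employment_share(firm_sizes: Iterable[int], cutoff: int = 10) -> float:
    """Share of a town's total employment in firms with at least `cutoff` workers."""
    sizes = list(firm_sizes)
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total employment must be positive")
    formal = sum(s for s in sizes if s >= cutoff)
    return formal / total


def relative_change(point_change: float, baseline_share: float) -> float:
    """Convert a percentage-point change into a percent change relative to a baseline."""
    return 100.0 * point_change / baseline_share


# Hypothetical town with five firms (worker counts are illustrative only).
share = formal_employment_share([1, 2, 3, 10, 20])

# Figures quoted in the text: a 2.9 percentage point rise on a 22% average share.
pct = relative_change(2.9, 22.0)
```

In the hypothetical town, 30 of 36 workers are in firms at or above the cutoff. The second calculation reproduces the step from 2.9 percentage points to roughly 13% of the average share.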
One plausible interpretation of the results is that firms benefit from better input-output linkages, thicker local labor markets, and improved learning in larger cities. However, instead of using the family house or backyard, formal firms need relatively large tracts of land in appropriately zoned areas to operate on. When city density increases, land and property prices may be expected to rise and outweigh the agglomeration benefits of city size, and thus bid some firms out of the market. Indeed, the relatively rapid growth of manufacturing in peri-urban and rural areas documented by Ghani, Goswami, and Kerr (2012) lends support to this possibility. While this "spread" of manufacturing is in some ways equalizing and natural, it may well lead to suboptimal economic outcomes if it undercuts economies of agglomeration and leads to locational decisions that discount proximity to input suppliers and markets, features that are widely believed to be key drivers of sustained growth of manufacturing and associated service industries, and thus the creation of modern jobs.[14] Similar effects may also occur in the service sector more generally.
[14] As Henderson (2014, 9) notes, "We have painted decentralization here as a positive development. However, there may be premature decentralization in India and a loss of agglomeration benefits. Firms are driven from cities because of poor environments: poorly allocated infrastructure investments, a lack of public utilities, and inappropriate land market regulations. Bertaud and Brueckner (2005) discuss how land market regulations (limiting floor-area ratios) in Mumbai have led to sprawl and inefficiently low densities near the city center. This may result in a costly lessening of agglomeration benefits."
The coefficients on the city characteristics show that there is more employment in formal firms if the city has better transport infrastructure, as measured by paved road length, and more educational institutions. These estimates are qualitatively consistent with the models with derived average city wages as the dependent variable, implying that the two dependent variables measure something in common. Table 8 reports IV estimates for the models of employment shares of relatively large firms. The results are essentially the same as the OLS estimates. City population has a significant positive effect on the shares of these firms, with a coefficient estimated at 0.03, when 1981 population is used as the instrumental variable. When city characteristics are included, the estimated population effect of 1.4% is slightly higher than the OLS estimate and is significant at the 10% level. When city density replaces population and town characteristics are taken into account, the IV estimates are similar to the OLS estimates, confirming that higher city density leads to fewer large firms. This is consistent with the possibility that higher land and property prices resulting from increased urban density can outweigh the benefits of urban agglomeration and drive some productive enterprises out of the local market. This does not necessarily contradict the finding that higher density is associated with higher local wages, because the firms driven out of dense cities are likely to be the less productive of the formal firms.
The estimated coefficients of city variables are largely consistent with their OLS counterparts. In particular, the paved road length and number of educational institutions show robust positive effects on the presence of formal firms. Notably, the results suggest that larger firms are more likely to be located in cities closer to the state headquarters.
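The logic of instrumenting current city population with its historical value can be sketched with a manual two-stage least squares on synthetic data. This is a minimal illustration under stated assumptions, not the specification behind Table 8. The data-generating process, the variable names, and the "true" effect of 0.03 are all hypothetical, and the sketch produces point estimates only, since standard errors from a manual second stage are not valid.

```python
import numpy as np


def ols(y: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y: np.ndarray,
                            x_endog: np.ndarray,
                            z_instr: np.ndarray) -> np.ndarray:
    """Manual 2SLS with one endogenous regressor, one instrument, and intercepts."""
    const = np.ones_like(x_endog)
    Z = np.column_stack([const, z_instr])
    x_hat = Z @ ols(x_endog, Z)              # first stage: fitted values
    X_hat = np.column_stack([const, x_hat])
    return ols(y, X_hat)                     # second stage: point estimates only


# Synthetic data (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 5000
log_pop_1981 = rng.normal(size=n)            # instrument: historical population
unobserved = rng.normal(size=n)              # confounder affecting both variables
log_pop_now = 0.8 * log_pop_1981 + unobserved + rng.normal(size=n)
true_effect = 0.03
formal_share = true_effect * log_pop_now + 0.5 * unobserved + rng.normal(size=n)

X = np.column_stack([np.ones(n), log_pop_now])
beta_ols = ols(formal_share, X)[1]           # biased upward by the confounder
beta_iv = two_stage_least_squares(formal_share, log_pop_now, log_pop_1981)[1]
```

In this synthetic setup the confounder pushes the OLS slope above the assumed effect, while the IV slope stays close to it. That holds only because the simulated instrument is unrelated to the confounder by construction.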
From Table 9, we see that the above results (e.g., the positive effects of city population and negative effects of city density on the employment shares of larger and formal firms) hold when a city's diversity, specialization, and employment share of manufacturing are added to the models. Not surprisingly, the specialization index and diversity measure are positively and negatively correlated with the dependent variables, respectively. However, our estimates show that the greater the share of manufacturing employment in a city, the smaller the share of employment from formal firms, though the relationship is not statistically significant.

VI. Conclusions
To the best of our knowledge, this paper provides the first evidence of agglomeration effects in India at the city or town level. We examine two outcome variables: (i) a measure of average town wages and (ii) employment shares of formal firms (firms with 10 or more employees). The former is derived from individual earnings information provided in the LFS and the sectoral distribution of employment at the city level, while the latter is computed with micro data from the EC 2005. We address potential endogeneity issues by including relevant city characteristics and instrumenting the contemporaneous city population and density with historical values.
We find a positive though small elasticity of wages with respect to town population and density. The estimates we find most credible suggest that the average wage will increase by 1%-2% if the city size or density doubles. This agglomeration effect is smaller than those from the recent literature (e.g., Chauvin et al. 2017) and from our replication exercise, which estimates the effect at 7%-8%. We think the different unit of analysis (cities in our case and districts in the literature) drives some of the difference in the estimates. We also find that city size has a positive effect and density has a negative effect on the presence of formal, or relatively large, firms. This suggests that agglomeration benefits for some firms may be offset by rising land and property prices as cities become denser.
Our results on city characteristics generally conform to the literature. Higher city productivity is associated with better infrastructure and geographic location, as well as opportunities for accumulating human capital through schools. We also find that industrial diversity and specialization and the employment share of manufacturing in a city are all positively associated with city wages. From a policy perspective, our results serve to provide a note of caution in line with the arguments of Ahluwalia, Kanbur, and Mohanty (2014) and Henderson (2010). While cities are widely believed to be engines of growth and good jobs, the mere fact of an expansion in urban agglomerations does not necessarily mean that agglomeration benefits will flow automatically. Perhaps cities require a certain level and type of planning and infrastructure investments to play the beneficial role many economists and policy makers have come to expect of them. Without such planning and infrastructure investments, Indian cities may well fail to capture agglomeration benefits.
In closing, the use of Economic Census data for research on policy issues is relatively new. With its universal coverage and the location of firms at the town level, the data allow us to study urban economic topics in a developing country context in considerable depth.
industry $j$ in city $c$'s total employment, and $s_j$ is industry $j$'s share in national total employment. Calculations for these indexes are based on a 59-industry breakdown of the two-digit National Industrial Classification 2004.