Evolution of the Size and Industrial Structure of Cities in Japan between 1980 and 2010: Constant Churning and Persistent Regularity

This paper investigates the evolution of the Japanese economy between 1980 and 2010 with regard to the population and industrial structure of cities. With the rural-to-urban transformation settling by the 1970s, Japan experienced the second stage of urbanization through the integration of nearby cities. This led, on average, to a disproportionately high population growth rate of 24% for a set of core cities during the review period. At the same time, cities experienced substantial changes to their industrial composition: on average, 35% of the manufacturing industries (at the 3-digit level) present in a city in 1980 had left by 2010, while 30% of manufacturing industries located in a city in 2010 had not been present in the same city in 1980. Remarkably, this substantial relocation of populations and industries among cities took place while a simple yet rigid relationship between the size and industrial composition of cities was preserved, characterized by the roughly constant elasticity between the number and average size of cities in which an industry was present. This paper discusses the policy implications of this persistent regularity and the possible underlying mechanisms.


I. Introduction
The 30-year period between 1980 and 2010 studied in this paper coincides with a period of major economic upheaval in Japan. Recovering from the economic turmoil caused by two oil shocks and the end of the fixed exchange rate system in the 1970s, Japan experienced moderate growth triggered by major financial reform policies and public infrastructure investment during the first half of the 1980s. This was followed by the forming of an asset price bubble, its bursting in 1991, and the long economic stagnation that has come to be known as "the lost 20 years." The rise and fall of individual cities, however, do not reflect these alternating developments at the national level. 1 Rather, their relative locational advantages, as determined by the expanded highway and high-speed railway networks established during the growth period of the 1970s and 1980s, appear to have a long-lasting impact on the fate of individual cities and regions. As will be shown in section II, more than 30% of the variation in population growth among individual cities between 1980 and 2010 is related to transport development. While the development of the national transport network enhanced the accessibility of all locations in Japan, those cities that were associated with major transport hubs, intersections, and terminals gained even more, which in turn attracted industries and migrants to these cities, resulting in their disproportionate growth. These select cities experienced an average population growth rate of 24% during the period by swallowing surrounding cities. Their growth was accompanied by a decrease in the number of Japanese cities during the review period from 309 to 221.
Cities do not result solely from population agglomerations, but are usually associated with industrial agglomerations. As will be discussed in section IV.B, it is natural to characterize the industrial structure of a city in terms of the absolute, rather than the relative, presence of each industry; that is, in terms of agglomeration rather than specialization. In this study, the significant presence of each of the 110 manufacturing industries (3-digit level) of the Japanese Standard Industrial Classification is identified by using the agglomeration-detection approach developed by Mori and Smith (2014), so that the industrial composition of a city can be defined by the set of industries whose agglomerations overlap with this city. 2 In this context, a larger city naturally houses a larger number of industries. However, the list of industries present in a city appears to change drastically over the 30-year review period even if the number of industries changes little. More specifically, the cities included in our data experienced substantial churning in their industrial composition: among the cities that remained throughout the 30-year period, an average of 35% of the 3-digit manufacturing industries present in a city in 1980 had left by 2010, while an average of 30% of the industries present in a city in 2010 had not been located in the same city in 1980. 3 Even among rapidly growing cities, industrial diversity did not simply expand during the review period, but rather a sizable portion of the industrial composition of each city was replaced.
1 The definition of cities is provided in section II.
2 As industrial classifications vary from year to year, only the set of industries that appear throughout the review period are included to make comparisons between different years meaningful.
3 Our interest here is the range of industries in a given city. Thus, it is natural to look at the presence of industries. Alternatively, Duranton (2007) proposes a measure of industrial churning based on employment distribution across industries. Then, the evaluated "churning level" reflects the variation of labor intensity and production scales across industries. For instance, the disappearance of labor-intensive industries will be more pronounced than that of capital-intensive ones. Also, while a large number of employees for a given firm usually implies a larger output value for that firm within a given industry, this is not often true when firms in different industries are compared. Thus, his measure is somewhat misleading as a measure of industry churning. To gauge the range of industries in a city, this paper adopts the count of industries that have a statistically significant presence (agglomeration) in a city.
Despite the substantial relocations of both populations and industries among cities, the size distribution and the relative industrial composition of cities remained remarkably similar over the period studied. As for the former, the upper tail of the city-size distribution exhibits the persistence of the power law coefficient. As for the latter, the hierarchy principle held to a large extent; that is, the set of industries present in a smaller city is a subset of that in a larger city. More specifically, it is possible to identify, on average, roughly 70% of the industrial composition of a given city based on this principle despite the frequent churning of the industrial composition of cities. When taken together, these two regularities imply a persistent log-linear relationship between the number and average size of cities in which a given industry is present, which is designated as the number-average size rule (Mori, Nishikimi, and Smith 2008; Mori and Smith 2011).
Economic policies at the city and regional levels are often targeted at population growth and promotion of industries. However, in the presence of persistent regularities between size and the industrial structure of cities, such policies may have little influence. This is because the relative size of a city is largely dictated by the persistent power law, and the relative position of a city in the city-size distribution, in turn, dictates the industrial composition of the city via the hierarchy principle.
The rest of the paper proceeds as follows. Section II describes the evolution of the sizes and locations of cities in Japan between 1980 and 2010, and demonstrates how the nationwide development of a high-speed transport network in the 1970s and 1980s had a substantial impact on the growth patterns of cities. Section III lays out the evidence for the churning of the industrial composition of cities with a focus on manufacturing industries. Section IV shows evidence of the persistent regularities in city-size distribution and the relationship between the size and industrial composition of cities. Section V discusses the implications of this study for theoretical modeling and regional economic policies. Section VI concludes the paper.

II. Growth and Decline of Cities
The spatial structure of the economy in a given country can perhaps most precisely be described in terms of the city system. Cities here refer to metropolitan areas that are combinations of working and residential places connected by commuting ties, rather than municipalities as delineated by jurisdictional boundaries. Metropolitan areas are usually identified as a bundle of municipalities for which population and commuting data are available. In this study, we adopt the definition of an Urban Employment Area as proposed by Kanemoto and Tokuoka (2001), which is the most popular definition of a metropolitan area in Japan. 4 Hereafter, we will use cities and metropolitan areas interchangeably. For simplicity of analysis, this study covers only the cities on islands that are connected to either Honshu (Japan's main island) or Hokkaido by road. Thus, isolated small islands are omitted. 5 Cities are defined endogenously, and their boundaries vary over time in response to changes in commuting patterns. Hence, not only can the boundaries of existing cities expand or contract, but new cities may also form as old cities disappear.

Figure 1. Population Growth Rates of Cities, 1980-2010
Source: Author's calculations based on University of Tokyo, Center for Spatial Information Science. Urban Employment Area. http://www.csis.u-tokyo.ac.jp/UEA/index_e.htm

By 1980, the rural-to-urban transformation in Japan had almost reached completion and the country's inhabited lands were saturated with cities that accounted for 87% of the total population. There has not been much change in the urbanization rate since then. Nonetheless, cities exhibited substantial volatility in population between 1980 and 2010 as indicated by the distribution of population growth rates for this period in Figure 1. Of the 309 cities that existed in 1980, 114 were either absorbed into other cities or simply disappeared during the review period, while 26 new cities were formed, leaving 221 cities in 2010. The cities that were present in both 1980 and 2010 experienced, on average, population growth of 24% during the review period (with a standard deviation of 47%).

Figure 2. Development of High-Speed Railway Networks
Sources: For municipal boundaries, Takashi Kirimura. Municipality Map Maker. http://www.tkirimura.com/mmm/; for high-speed railway networks, National Land Numerical Information Service of Japan. http://nlftp.mlit.go.jp/ksj-e/index.html

As mentioned above, the 1970s and 1980s saw the extensive development of highways and high-speed railway networks throughout Japan. Figure 2 depicts the development of the high-speed railway network, with the cells in the background representing cities in 2010. The development began in the section linking Tokyo and Osaka in 1964 (the red segment in the figure), which was triggered by the Tokyo Olympics in the same year. The network first expanded westward to Fukuoka in 1975, and then eastward to Morioka, and northward to Niigata in 1982. The rest of the extensions came mostly after 2000. Thus, the network structure in the 1980s seemed to have the strongest influence on the growth of cities in the 1990s and 2000s.
The development of highway networks followed that of high-speed railways in the 1970s and 1980s as shown in Figure 3. From the 1990s, the network expanded to cover the more rural parts of the country.
The development of these transport networks through the 1980s seemed to play a key role in the drastic rise and fall of cities. Figure […] the 9% growth rate of the national population over this period. The larger cities labeled in bold, italic, and normal fonts are located at the site of major high-speed rail stations, highways, and 24-hour airports, respectively.
Source: Author's calculations based on University of Tokyo, Center for Spatial Information Science. Urban Employment Area. http://www.csis.u-tokyo.ac.jp/UEA/index_e.htm
For all types of transport networks, Tokyo emerges as a major terminal. To travel from the west of Tokyo to either the east or north of Tokyo by high-speed railway, passengers must change trains in Tokyo, which contributes to the locational advantage of the city. As shown in Figure 3, highways expand radially from Tokyo, indicating that transport network developments in Japan are highly biased to improve accessibility to and from Tokyo.
There are other obvious examples of high growth rates at the major terminals and intersections of the networks. Shikoku island, one of Japan's four major islands, was connected to the main island of Honshu for the first time in 1988 via Okayama. Thus, Okayama became a gateway for the Shikoku region, which had the effect of doubling the population of the Okayama metropolitan area from 750,188 to 1,532,146 between 1980 and 2010. 6 A similar advantage boosted the population size of Sumoto (as indicated in Figure 3). Kitagami has grown as a key intersection of the highway and high-speed railway networks. Fukuoka has been a terminal of the high-speed railway since 1975, and together with its 24-hour international airport, has emerged as a gateway to Kyushu island (as indicated in Figure 2). 7 Utsunomiya and Takasaki-Maebashi are located at the origin of the eastward and northward high-speed railway lines from Tokyo, respectively. Toyama, located on the northern coast of Honshu, experienced substantial population growth from 504,353 to 1,093,247 between 1980 and 2010 (as indicated in Figure 4). This can be explained by the completion of a highway route linking Tokyo and Osaka via Toyama in 1990. The locations of major airports have also boosted city populations. In particular, the airport at Chitose boosted the population of nearby Sapporo.
While the expansion of transport networks improves the accessibility of all cities along the network, the effects are not uniform. Typically, locations near major hubs tend to lose relative locational advantage to the hubs, and hence lose population and industries. This phenomenon is often called the "straw effect" (see, for example, Behrens et al. 2009), meaning that the growth potential of a location is overwhelmed by the nearby location with a better advantage. 8 Typical examples of declining cities owing to straw effects are Kure and Kitakyushu located along the high-speed railway line leading to Fukuoka (Figure 2). Both were major industrial cities, with the latter being the eighth largest city in Japan in 1980. However, their transport accessibility declined substantially compared with Fukuoka at the west end of the network. The port city of Kamaishi also lost its locational advantage as it was not located on either the highway or the high-speed railway network.
The discussion above on the impacts of transport development on city size can be confirmed by regressing the (log of the) population growth rate of a city between 1980 and 2010 on a range of dummy variables, each of which represents the presence of a transport network node, and on the growth rate of the number of local railway lines connected to the city, as specified in equation (1). In this equation, $\ln POP_c$ on the left-hand side represents the log of the population growth rate of city $c$ (between 1980 and 2010), where $M \equiv \{\text{a high-speed railway station}, \text{a high-speed railway terminal}, \text{a location along the highway}, \text{a node of the Shikoku link}\}$ represents the set of transport advantages and $Y \equiv \{1970, 1980, 1990, 2000\}$ represents the set of time points, so that $\delta^{m}_{c,t} = 1$ if city $c$ is at a node of type $m$ that is active in year $t$, and zero otherwise. $AIR(24hrs)_c$ [$AIR_c$] equals 1 if there is an airport that is [is not] in 24-hour operation within 50 kilometers (km) of city $c$ in 2010 but not in 1980, and zero otherwise. 9 $\ln RAIL_c$ is the log of the growth rate of the number of local railway lines that have stations in the city, and $\ln POP_{c,1980}$ is the log of the population size of city $c$ in 1980. Finally, $\varepsilon_c$ is an error term. Table 1 summarizes the result of the regression using an ordinary least squares (OLS) estimation. Cities at terminal locations of the high-speed railway in 2000 (Fukuoka and Tokyo) and at the gateway to Shikoku island in 1990 (Okayama) experienced particularly high growth. Although less spectacular, expansion of the high-speed railway network into the eastern and northern regions of Honshu in the 1980s contributed to the growth of cities in these areas, which is reflected in the significant coefficient of the 1990 dummy for high-speed railways. The effects of new transport links do not necessarily show up immediately after their completion.
The significant terminal effects at Fukuoka and Tokyo in 2000 may reflect the fact that the network was so extensive by 2000 given the nationwide expansion of high-speed railways in the 1980s and 1990s.
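Equation (1) itself is not reproduced in the text above. One specification consistent with the variable definitions given there could be written as follows. The intercept $\alpha$ and the coefficient symbols $\beta_{m,t}$ and $\gamma_1, \dots, \gamma_4$ are notation assumed here for illustration; only the variables themselves come from the text.

```latex
\ln POP_c = \alpha
  + \sum_{m \in M} \sum_{t \in Y} \beta_{m,t}\, \delta^{m}_{c,t}
  + \gamma_1\, AIR(24hrs)_c
  + \gamma_2\, AIR_c
  + \gamma_3 \ln RAIL_c
  + \gamma_4 \ln POP_{c,1980}
  + \varepsilon_c
```

Under this reading, each $\beta_{m,t}$ measures the growth premium of cities located at a node of type $m$ active in year $t$, and $\gamma_4$ is the coefficient on initial population size.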
An interesting contrast between the impact of the high-speed railway network and that of the highway network is that while a highway connection leads to significant population growth in a city at every point in time, this is not the case for high-speed railways. Unlike highways, which are less closely associated with mass transportation, in the case of a high-speed railway network, the frequency and connectivity of transport services can differ greatly between major intersections and terminals and local stations, and hence the realized gains in transport accessibility can differ accordingly. The differential improvement in accessibility along the high-speed railway network resulted in nonuniform population growth rates across different cities. An example is the comparison between closely located Fukuoka and Kitakyushu (refer to Figure 2). Kitakyushu is an older industrial city that flourished during the high-growth period of the 1960s. However, when the high-speed railway line terminated at Fukuoka rather than at Kitakyushu, it appeared to have a permanent impact on the growth paths of these two cities whose population sizes were comparable in 1980.
Fukuoka was already the sixth largest city in Japan in 1980 with a population of 1,773,129. Its population expanded by 41% over the next 3 decades to 2,495,552, making it Japan's fifth largest city in 2010. Meanwhile, Kitakyushu's population fell from 1,524,747 to 1,370,169. These nonuniform impacts are partly responsible for the insignificant coefficients of high-speed railway dummies. Furthermore, Tokyo is a unique terminal connecting high-speed railway lines in virtually all directions, which should partly explain the disproportionately large population of Tokyo today.
The estimated coefficient of the lagged population size in equation (1) is negative and significant (−0.078), meaning that it is more difficult for a larger city to attain the same growth rate under a given transport development. In fact, transport developments account for more than 30% of the variation in the growth rates of cities after removing the effect of initial population. Regressing $\ln POP_c + 0.078 \ln POP_{c,1980}$ on the rest of the variables yields $R^2 = 0.40$ (adj. $R^2 = 0.34$). Of course, the quantification of the underlying causal effects should involve appropriate instruments for transport developments as in Faber (2014), which is beyond the scope of the present paper. 10
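The two-step calculation described above (estimate the full regression, then remove the estimated effect of initial population and regress the remainder on the transport variables alone) can be sketched as follows. This is a minimal illustration on synthetic data, not the author's code: the function names, the three hypothetical dummy columns, and all numerical values in the example are assumptions.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and R^2 for y = X b + e (X must include a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = np.sum((y - y.mean()) ** 2)
    return beta, 1.0 - np.sum(resid ** 2) / tss

def transport_share(dummies, ln_pop_1980, growth):
    """Share of growth variation related to transport after removing the size effect.

    Step 1: regress growth on a constant, the transport dummies, and ln(initial population).
    Step 2: subtract the estimated size effect and regress the result on the
            constant and the transport dummies only.
    """
    const = np.ones((len(growth), 1))
    X_full = np.hstack([const, dummies, ln_pop_1980[:, None]])
    beta, _ = ols(X_full, growth)
    size_coef = beta[-1]                        # coefficient on ln(initial population)
    adjusted = growth - size_coef * ln_pop_1980
    _, r2 = ols(np.hstack([const, dummies]), adjusted)
    return size_coef, r2

# Synthetic illustration (all values hypothetical).
rng = np.random.default_rng(0)
n_cities = 200
dummies = rng.integers(0, 2, size=(n_cities, 3)).astype(float)
ln_pop_1980 = rng.normal(12.0, 1.0, size=n_cities)
growth = (0.1 + dummies @ np.array([0.3, 0.1, 0.2])
          - 0.078 * ln_pop_1980 + rng.normal(0.0, 0.1, size=n_cities))
size_coef, r2 = transport_share(dummies, ln_pop_1980, growth)
print(f"size coefficient: {size_coef:.3f}, R^2 of transport variables: {r2:.2f}")
```

Because the size effect is subtracted before the second step, the second-step $R^2$ reflects only the variation associated with the transport variables, which is the quantity the text reports.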

III. Churning of Industrial Composition of Cities
It was not only populations but also industries that were substantially shuffled during the review period. To show this, we adopt the cluster-detection method developed by Mori and Smith (2014) to identify the set of industries that have agglomerations in each city. This method, based on a simple probabilistic model of establishment location behavior, identifies the set of agglomerations (the significant spatial clusters of establishments) for each industry in terms of a partition of the set of basic regions. Each basic region is a 10 km × 10 km cell and the entire set of basic regions consists of 3,735 cells covering the locations in Japan included in this study. 11 The key features of this approach include (i) filtering out insignificant clusterings of establishments, (ii) determining the spatial extent of each individual agglomeration, and (iii) jointly identifying the set of all agglomerations of a given industry in a statistically consistent manner. 12 To highlight the churning phenomenon, we restrict the set of industries to those that appear throughout the period being studied; industrial categories that do not appear in all years are excluded from the analysis. For instance, the 2-digit category of "electronic parts, devices, and electronic circuits" was introduced in 2006 and therefore the industries in this category, by definition, did not exist in earlier years. This leaves us with a set of 110 3-digit manufacturing industries between 1980 and 2010. 13 Even with these "new" industries excluded, and similarly the "old" industries that are only present in earlier years, the churning of industries across cities is remarkable. The spatial distribution of establishments and that of identified agglomerations for two sample industries, livestock products and leather gloves and mittens, in 2010 are shown in Figures 5 and 6, respectively.
The former is a relatively more ubiquitous industry and the latter is a relatively more concentrated industry. […]
In this context, the industrial composition of city $c$ can be represented by the set of industries present in the city, designated by $I_c$, and the industrial diversity of city $c$ can be represented by the number of industries present in the city, $|I_c|$. The industrial diversity varies significantly across cities as depicted in Figure 8. While the mean diversity is 45.51 (46.68), the range is almost full; that is, from 0 to 110 (0 to 106) with a standard deviation of 27.27 (27.1) in 2010 (1980).
10 While there are several recent attempts at measuring the impacts of transport development on regional growth (see, for example, Duranton and Turner 2012 and Baum-Snow et al. 2016, 2017), few of them are successful in incorporating the nonuniform effects of transport development such as straw effects. Faber (2014) is a remarkable exception. Thus, there are more issues than endogeneity, and this literature is still subject to further refinements.
11 The basic regions in 1980 comprise 3,363 municipalities by 2001, rather than 10 km-by-10 km cells, due to data availability.
12 Except for the regional divisions adopted, the rest of the setup for agglomeration detection follows that in Mori and Smith (2014).
13 Industry categories containing miscellaneous 3-digit industries for which no specific 3-digit categories could be assigned are also excluded.
Not surprisingly, the industrial diversity of each city is highly persistent: the correlation between industrial diversity in any two different points in time is greater than 0.9; in particular, the correlation between industrial diversity in 1980 and 2010 […]
Alternatively, the churning of industries across cities can be quantified in terms of the Jaccard index, $J_c(s, t)$, for city $c$ between years $s$ and $t$, computed as the ratio of the size of the intersection to the size of the union of the sets of industries for city $c$ in years $s$ and $t$. Figure 11 shows the distribution of $J_c(1980, 2010)$, where the average of the index value is 0.5 with a standard deviation of 0.19. Consistent with the entry and exit shares above, if a city has the same industrial diversity in 1980 and in 2010, then an average of one-third of the industrial composition has been replaced during the review period. It is worth noting that the churning is substantial even for shorter time periods. That is, average values of $J_c(1980, 2000)$ and $J_c(2000, 2010)$ are 0.53 and 0.6, respectively. 15
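The link between a Jaccard index of 0.5 and a one-third replacement can be checked directly. If a city has $n$ industries in both years and $k$ of them are common, then $J = k / (2n - k)$, so $J = 0.5$ gives $k = 2n/3$. The sketch below computes the index together with exit and entry shares for a hypothetical city; the function names, the industry codes, and the convention that two empty sets have an index of 1 are assumptions made for illustration.

```python
def jaccard(industries_s, industries_t):
    """Jaccard index: size of the intersection over size of the union."""
    a, b = set(industries_s), set(industries_t)
    union = a | b
    if not union:
        return 1.0  # assumed convention when the city has no industries in either year
    return len(a & b) / len(union)

def exit_share(industries_s, industries_t):
    """Share of the industries present in year s that are absent in year t."""
    a, b = set(industries_s), set(industries_t)
    return len(a - b) / len(a) if a else 0.0

def entry_share(industries_s, industries_t):
    """Share of the industries present in year t that were absent in year s."""
    a, b = set(industries_s), set(industries_t)
    return len(b - a) / len(b) if b else 0.0

# Hypothetical city with 45 industries in both years, 30 of them in common.
city_1980 = range(0, 45)
city_2010 = range(15, 60)
print(jaccard(city_1980, city_2010))      # 30 / 60
print(exit_share(city_1980, city_2010))   # 15 / 45
print(entry_share(city_1980, city_2010))  # 15 / 45
```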

IV. Persistent Regularity in Size and Relative Industrial Composition of Cities
Despite the substantial churning of populations and industries across cities during the review period, there are persistent structural regularities related to the population distribution and relative industrial composition of cities. Below, each of these regularities is discussed in detail.

A. Power Law for City-Size Distribution
City-size distributions are known to be well approximated by power laws across a wide range of countries (see, for example, Gabaix and Ioannides 2004). If each city size, $s$, is treated as a realization of a random variable $S$ with distribution $P$, then $S$ is said to satisfy an (asymptotic) power law with exponent $\kappa$ if and only if, for some positive constant $a$,
$$\lim_{s \to \infty} s^{\kappa} P(S > s) = a, \qquad (2)$$
which can alternatively be expressed as
$$P(S > s) \approx a s^{-\kappa}, \quad s \to \infty. \qquad (3)$$
In this study, this relationship is referred to as a power law. If a given set of $n$ cities is postulated to satisfy such a power law, their sizes are distributed as shown in equation (3), and if these city sizes are ranked as $s_1 \ge s_2 \ge \cdots \ge s_n$, such that the rank $r_c$ of city $c$ is given by $r_c = c$, then it follows that a natural estimate of $P(S > s_c)$ is given by the ratio $c/n \equiv r_c/n$. Thus, using equation (3), we obtain the following approximation:
$$\ln s_c \approx b - \frac{1}{\kappa} \ln r_c, \qquad (4)$$
where $b \equiv \ln(an)/\kappa$. This explains the standard log regression method for estimating $\kappa$ in terms of the rank-size data, $[\ln(r_c), \ln(s_c)]$ for $c = 1, \ldots, n$. Figure 12 plots $\ln s_c$ versus $\ln r_c$ for all cities, $c$, in the years 1980, 1990, 2000, and 2010. As discussed in section II, cities with a relative locational advantage have grown by absorbing surrounding cities. The number of cities fell from 309 in 1980 to 283, 261, and 221 in 1990, 2000, and 2010, respectively, while the average population size of cities increased from 330,075 in 1980 to 391,038, 443,237, and 545,745. However, the approximate log-linearity of this rank-size data in each year shows that Japanese cities appear to exhibit a power law with exponent $\kappa$, given roughly by the reciprocal of the slope of each curve.
However, it is also clear that if one estimates this slope with OLS, the downward bend in this curve for small cities will tend to produce a slope estimate that is too steep, implying that the estimate $\hat{\kappa}$ of the exponent $\kappa$ will be too small. One of the simplest methods for correcting this bias, as proposed by Gabaix and Ibragimov (2011), is to reduce the rank scale by 0.5, which yields the following modified regression:
$$\ln s_c = b - \theta \ln(r_c - 0.5) + \varepsilon_c, \qquad (5)$$
with $\theta = 1/\kappa$. The estimated power law coefficients $\hat{\theta}$ are 1.07, 1.08, 1.09, and 1.13, in the years 1980, 1990, 2000, and 2010, respectively. Compared with the sizes of individual cities, these values are far less volatile. The modest increase in the power coefficient is consistent with the growth of large cities through the absorption of small ones.
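The rank-shifted regression described above can be sketched in a few lines. This is a minimal illustration, assuming city sizes are available as a plain array; the function name `rank_size_theta` and the synthetic sizes are hypothetical, and the fit uses ordinary least squares via `numpy.polyfit` without the standard-error correction that a full application would need.

```python
import numpy as np

def rank_size_theta(sizes, shift=0.5):
    """Estimate theta in ln(s_c) = b - theta * ln(r_c - shift) by OLS.

    sizes: city populations in any order; ranks are assigned from the largest (rank 1).
    Returns (theta, b).
    """
    s = np.sort(np.asarray(sizes, dtype=float))[::-1]  # descending order, rank 1 first
    ranks = np.arange(1, len(s) + 1)
    slope, intercept = np.polyfit(np.log(ranks - shift), np.log(s), 1)
    return -slope, intercept

# Synthetic sizes that follow the shifted rank-size relation exactly with theta = 1.1.
ranks = np.arange(1, 101)
sizes = 1.0e6 * (ranks - 0.5) ** -1.1
theta, b = rank_size_theta(sizes)
print(round(theta, 6))
```

Setting `shift=0` recovers the unmodified rank-size regression, which makes it easy to compare the two estimates on the same data.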

B. Hierarchy Principle and the Number-Average Size Rule
There is a strong relationship between the size and industrial composition of cities. Following Mori, Nishikimi, and Smith (2008) and Mori and Smith (2011), if the cities in which industry $i$ is present are designated as choice cities of industry $i$, then the number of choice cities of industry $i$, $n_i$, and the average population size, $\bar{s}_i$, of these cities have a log-linear relationship, which is known as the number-average size (NAS) rule, as indicated by the "+" plots in Figure 13 for all 110 3-digit manufacturing industries in 2010.
The figure shows the NAS plot as well as two curves: the upper-average and lower-average curves. The former is the average population size of the largest cities and the latter is that of the smallest cities for the number of cities given by each point on the horizontal axis. Thus, these curves define the upper and lower bounds for the NAS plots. The NAS plots almost hit their upper bound, meaning that the choice cities for each industry roughly consist of the largest cities. This, in turn, implies that the industrial composition of cities exhibits a strong hierarchical relationship such that the set of industries present in a smaller city is roughly a subset of that in a larger city. When this hierarchical relationship holds, the industrial compositions of cities are said to satisfy the hierarchy principle (Christaller 1933). Furthermore, the approximate log-linearity of the upper-average curve reflects the asymptotic power law for city-size distribution. 16 The log-linear slope is fitted to the NAS plots for each year depicted in Figure 14 using an OLS estimation as follows: […]
As stated above, on average, one-third of the industrial composition of a city was replaced between 1980 and 2010. It is remarkable that this churning of industries took place while maintaining a common power law.
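The NAS points and the two bounding curves described above can be computed as in the following sketch. It assumes that city sizes and industrial compositions are held in plain dictionaries; the function names and the three-city example are hypothetical. In the example the hierarchy principle holds perfectly, so every NAS point lies exactly on the upper-average curve.

```python
def nas_points(city_sizes, composition):
    """Map each industry to (number of choice cities, average size of those cities)."""
    sizes_by_industry = {}
    for city, industries in composition.items():
        for industry in industries:
            sizes_by_industry.setdefault(industry, []).append(city_sizes[city])
    return {i: (len(v), sum(v) / len(v)) for i, v in sizes_by_industry.items()}

def upper_average(city_sizes, n):
    """Average size of the n largest cities (upper bound for an industry with n choice cities)."""
    top = sorted(city_sizes.values(), reverse=True)[:n]
    return sum(top) / len(top)

def lower_average(city_sizes, n):
    """Average size of the n smallest cities (lower bound for an industry with n choice cities)."""
    bottom = sorted(city_sizes.values())[:n]
    return sum(bottom) / len(bottom)

# Hypothetical three-city system with a perfect hierarchy.
city_sizes = {"A": 100, "B": 50, "C": 10}
composition = {"A": {1, 2, 3}, "B": {1, 2}, "C": {1}}
for industry, (n, avg) in sorted(nas_points(city_sizes, composition).items()):
    print(industry, n, avg, upper_average(city_sizes, n), lower_average(city_sizes, n))
```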
In the context of central place theory, the agglomeration of population and that of industries reinforce each other. Figure 15(a) shows evidence supporting this implication by plotting the log of the population growth rate versus the increase in the industrial diversity of each city i, |I i |, for cities that experienced increases in industrial diversity. The correlation is 0.32 and is significant. Thus, an increase in the number of industries coagglomerating in a city accompanies an increase in the city's population. Figure 15(b) shows the similar plots for cities that experienced decreases in industrial diversity. The correlation between the change in population size and that in industrial diversity for these cities is insignificant. This may be owing to the inertia of population agglomeration, such that shrinking industrial diversity does not immediately translate into reduced population agglomeration. The extra workers are likely to be absorbed in the nonmanufacturing sectors. The period studied coincides with the period of deindustrialization. The share of manufacturing in the total establishment counts decreased by 33% from 13.5%; while wholesale, retail, and services increased by 22% from 49.4%; financial services increased by 20% from 1.3%; and transport and information increased by 55% from 2.5%. A similar trend can be observed in employment.
Finally, we quantify how well the hierarchy principle holds. The more completely the hierarchy principle holds, the fewer are the degrees of freedom for influencing the industrial composition of a given city by independent place-based policies, since the industrial composition of this city is more closely linked to that of the rest of the cities.
Let $I_c$ be the set of industries present in city $c$ and the hierarchy share between cities $c$ and $d$ be defined by
$$H(c, d) \equiv \frac{|I_c \cap I_d|}{|I_c|}.$$
Then, the hierarchy principle implies that $H(c, d) = 1$ for cities $c$ and $d$, such that $s_c \le s_d$. If the set of hierarchy pairs of cities in the set $C$ of all cities is defined as $\mathcal{H} \equiv \{(c, d) : c, d \in C,\ s_c \le s_d\}$, then the degree to which the hierarchy principle holds can be quantified by
$$\bar{H} \equiv \frac{1}{|\mathcal{H}|} \sum_{(c, d) \in \mathcal{H}} H(c, d).$$
The hierarchy share is, on average, 0.7 (0.24), 0.73 (0.2), and 0.67 (0.24) in 1980, 2000, and 2010, respectively, where the numbers in parentheses are standard deviations. Thus, it can be said that roughly 70% of the realized industrial location patterns are consistent with the hierarchy principle.
To test the significance of these values of H, we construct the counterfactual industrial composition of each city as follows. First, the industrial diversity of each city $c$ is fixed at the actual value $|I_c|$. Then, for each city $c$, the counterfactual industrial composition is chosen by selecting $|I_c|$ industries without replacement from the set of all industries, $I$, with the choice probability of each industry $i \in I$ being $n_i / \sum_{j \in I} n_j$. By controlling for both the industrial diversity of cities as well as the locational diversity of industries, we generate 1,000 random counterfactual location patterns of industries. Figure 16 depicts the distribution of average hierarchy shares under random counterfactual location patterns of industries (the gray histogram) together with the average hierarchy share under the actual location pattern of industries in 2010. The p-value for the one-sided test under the null hypothesis (that the actual location pattern of industries is an instance of the random counterfactual location patterns) is virtually zero. Thus, the actual location patterns exhibit strong consistency with the hierarchy principle. The same is true for all other years during the review period.
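The counterfactual test described above can be sketched as follows. This is an illustration under stated assumptions rather than the author's implementation: the hierarchy share is taken to be the fraction of the smaller city's industries that are also present in the larger city, pairs of a city with itself and cities with no industries are skipped, and the function names and three-city data are hypothetical. Sampling uses NumPy's `Generator.choice` with `replace=False` and probabilities proportional to the number of choice cities of each industry.

```python
import numpy as np

def mean_hierarchy_share(sizes, composition):
    """Average share of a smaller city's industries also found in a larger city."""
    cities = [c for c in composition if composition[c]]  # assumption: skip empty cities
    shares = [
        len(composition[c] & composition[d]) / len(composition[c])
        for c in cities for d in cities
        if c != d and sizes[c] <= sizes[d]               # assumption: exclude (c, c)
    ]
    return float(np.mean(shares))

def counterfactual(composition, rng):
    """Redraw each city's industries, keeping its diversity |I_c| fixed."""
    industries = sorted({i for s in composition.values() for i in s})
    counts = np.array([sum(i in s for s in composition.values()) for i in industries],
                      dtype=float)
    p = counts / counts.sum()  # choice probability n_i / sum_j n_j
    drawn = {}
    for city, s in composition.items():
        if not s:
            drawn[city] = set()
            continue
        picks = rng.choice(len(industries), size=len(s), replace=False, p=p)
        drawn[city] = {industries[k] for k in picks}
    return drawn

def one_sided_p_value(sizes, composition, n_draws=1000, seed=0):
    """Share of counterfactual patterns whose mean hierarchy share reaches the actual one."""
    rng = np.random.default_rng(seed)
    actual = mean_hierarchy_share(sizes, composition)
    sims = np.array([mean_hierarchy_share(sizes, counterfactual(composition, rng))
                     for _ in range(n_draws)])
    return actual, float(np.mean(sims >= actual))

# Hypothetical three-city system with a perfect hierarchy.
sizes = {"A": 100, "B": 50, "C": 10}
composition = {"A": {1, 2, 3}, "B": {1, 2}, "C": {1}}
actual, p_value = one_sided_p_value(sizes, composition, n_draws=200)
print(actual, p_value)
```

With only three cities and three industries the test has little power; the example is meant to show the mechanics, not to reproduce the reported result.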

V. Implications for Regional Economic Policies and Theoretical Modeling
The stringent spatial coordination of industries and population has been observed at each point in time during the review period (1980, 1990, 2000, and 2010), despite the substantial spatial churning of population and industries. The presence of these regularities has strong policy implications in that they act as a constraint on policies aimed at giving an economic boost to an individual city or industry. While there are at this point no compelling theories that account for the formation of the observed regularities, the potential mechanisms underlying them must be discussed before any policy relevance can be derived. The next subsection reviews some relevant theoretical developments before their policy implications are discussed.

A. Theoretical Foundation of the Persistent Regularities
There are two key regularities that are related: the power law for city-size distribution and the hierarchy principle for the industrial composition of cities. Under the perfect hierarchy principle, it is not simply that a larger city has more industrial diversity than a smaller city, but also that larger cities have the entire range of industries that are present in any smaller city. Since smaller cities are more ubiquitous (under the power law), any industry found in a larger city but not in a smaller city tends to be more localized (a smaller number of choice cities) than any industry found also in the smaller city.
Among current theories, one approach to account for the hierarchy principle is in terms of central place theory (Christaller 1933; Fujita, Krugman, and Mori 1999; Tabuchi and Thisse 2011; Hsu 2012). The central tenets of this theory assert that the heterogeneity of industries and the positive (demand) externalities across industries, together with the spatial extent of markets, give rise to the hierarchies of cities in terms of their industrial composition, and thus to a diversity of city sizes. In the spatial competition model of Hsu (2012), industries differ only in their scale economies, as measured by the size of fixed costs. He shows that if the distribution of the size of fixed costs can be represented by a regularly varying function, then there is a locally stable equilibrium in which the hierarchy principle holds, city-size distribution exhibits a power law, and the NAS rule holds. An alternative approach instead adopts a multi-industry, multilocation extension of the new economic geography model of Fujita, Krugman, and Mori (1999) and Pflüger (2004). In this model, each industry produces a continuum of differentiated consumption goods (Dixit and Stiglitz 1977), subject to a given substitution elasticity of goods, while there are a large number of such industries that differ in the elasticity of substitution. To determine the distribution of substitution elasticities across industries, Akamatsu, Mori, and Takayama (2017) took a sample from the substitution elasticities estimated by Broda and Weinstein (2006) for more than 13,000 products imported into the United States. They generated a large number of bootstrapped samples of stable equilibria of the model economy and showed that the hierarchy principle, the power law for city-size distribution, and the NAS rule are generic properties of stable equilibria.
As for the power laws for city-size distribution, the most popular theoretical derivation postulates that growth rates of individual cities are independently and identically distributed random variables (Gibrat 1931). 17 However, these models do not simultaneously account for the hierarchy principle, or at least it is not explicitly obtained. The key difference between these random growth models and the central place models presented above is that space is absent from the former and present in the latter. The heterogeneity of industries and the demand externalities across industries do not automatically result in the hierarchy principle. 18 The power law for city-size distribution has been suggested to depend critically on the shape of the distribution of substitution elasticities. Although the observed distribution of substitution elasticities (or more generally that of scale economies) based on the Broda and Weinstein (2006) data is consistent with the power law, the underlying mechanism is still an open question.

B. Policy Implications
It has been shown that the size and industrial structure of cities in Japan have maintained tight regularities despite the substantial churning of population and industries across cities over time. 19

Persistent Power Law for City-Size Distributions
It has been shown that the size distribution of cities exhibits a persistent power law over the 30-year review period (section IV.A) despite substantial reorganizations of the city structure through the integration of nearby cities as well as the redistribution of populations among cities (section II).
Evidence presented in section II suggests that the latter redistribution of population is partly accounted for by the uneven improvement of transport accessibility brought about by the nationwide expansion of high-speed railway and highway networks. The objective of this public infrastructure investment was to correct regional disparities and carry out balanced development under the Government of Japan's Comprehensive National Development Plan. Although larger population sizes do not necessarily imply higher welfare, this is indeed often the case (see, for example, Bettencourt and West 2010; Bettencourt 2013).
However, the persistence of the power law in the city-size distribution implies that regional disparity is unlikely to disappear in the wake of such policies: cities can grow only at the expense of other cities, so the distribution of relative city sizes is preserved even though individual cities may move up or down in the size ranking. Better accessibility between the core and peripheral regions does not necessarily induce growth in the peripheral regions, although this was the original intention of Japan's development policy. In fact, postwar transport network development in Japan has always favored Tokyo, as the network was essentially designed to improve accessibility to Tokyo; this in turn has made Tokyo even more disproportionately large rather than helping smaller cities catch up. In practice, the feasible way to achieve equality among regions may be through interregional transfers.

Hierarchy Principle
When industrial location is considered within the city system, the computation of hierarchy shares in section IV.B indicates that roughly 70% of the location patterns of individual industries are consistent with the hierarchy principle at each point in time. Given the substantial churning of industries among cities (an average of 30%), this percentage is quite high, which in turn suggests that coordination takes place relatively quickly. Judging from the observed NAS rule and the theoretical models discussed in section V.A, the spatial coordination of industries and the prevailing heterogeneity of scale economies among industries may be responsible for the observed persistence of the power law for city-size distribution. If the hierarchy principle is an outcome of the spatial coordination among many industries via cross-industry positive externalities, as in the central place models, then there seems to be little room for any policy by an individual city or region to have a large influence on the location behavior of a specific industry.
At the same time, it is also true that as many as 30% of the realized industrial locations deviate from the hierarchy principle. Since the location patterns of these industries are relatively independent, other things being equal, place-based policies targeting these industries should be more effective. Figure 17 shows the distribution of the counts of industries that are present in a given city but absent from more than 70% of the cities larger than it. The figure shows only the cities that have at least one such industry deviating from the hierarchy principle, where the darker colors represent the presence of a larger number of such industries. The names of cities, together with the names of deviating industries in parentheses, are indicated for selected cities. Place-based policies targeting these industries in a given city may be less constrained by the spatial coordination with other industries, and they may contribute to improving the relative advantage of this city.
Not surprisingly, the industries that deviate from the hierarchy principle tend to reflect strong natural advantages of location. Concentrations of musical instruments manufacturing, for instance, are often tied to the availability of wood resources and dry weather. Similarly, clay refractories manufacturing is tied to ceramics- and pottery-producing districts. Some industries such as aircraft, watches and clocks, and leather- and fur-related manufacturing may also have historical ties to specific locations.
It is also worth investigating the set of industries that exhibit a high degree of consistency with the hierarchy principle. Under the hierarchy principle, each city has the entire set of industries that are present in any smaller city. Thus, if industrial location patterns are highly consistent with the hierarchy principle, then the industries that are present in the majority of smaller cities would also likely be found in a given city. Hence, other things being equal, such industries may be attracted with less effort. This is rather intuitive since industries that are present in smaller cities are the more ubiquitous ones that seek proximity to consumers. They are also likely to be present in larger cities associated with a larger consumer population. Figure 18 shows the distribution of the counts of industries that are absent in a given city but are present in more than 70% of cities smaller than this given city, where the darker colors represent the presence of a larger number of such industries. The names of cities together with the names of absent industries in parentheses are indicated for selected cities.
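The two screening rules behind Figures 17 and 18 can be sketched as simple set operations. The sketch below is an illustration only: it assumes that "larger" and "smaller" are strict comparisons of population size and that the 70% threshold is applied as a strict inequality, and the cities and industries are hypothetical toy values.

```python
def deviating_industries(city, sizes, industries, threshold=0.7):
    """Industries present in `city` but absent from more than `threshold`
    of the strictly larger cities (the rule described for Figure 17)."""
    larger = [d for d in sizes if sizes[d] > sizes[city]]
    if not larger:
        return set()
    return {
        i
        for i in industries[city]
        if sum(i not in industries[d] for d in larger) / len(larger) > threshold
    }


def candidate_industries(city, sizes, industries, threshold=0.7):
    """Industries absent from `city` but present in more than `threshold`
    of the strictly smaller cities (the rule described for Figure 18)."""
    smaller = [d for d in sizes if sizes[d] < sizes[city]]
    if not smaller:
        return set()
    all_industries = set().union(*industries.values())
    return {
        i
        for i in all_industries - industries[city]
        if sum(i in industries[d] for d in smaller) / len(smaller) > threshold
    }


# Hypothetical toy city system.
sizes = {"A": 100, "B": 200, "C": 300, "D": 400, "E": 500}
industries = {
    "A": {"food", "instruments"},
    "B": {"food"},
    "C": {"food", "textiles"},
    "D": {"food", "textiles"},
    "E": {"textiles", "machinery"},
}

print(deviating_industries("A", sizes, industries))  # {'instruments'}
print(candidate_industries("E", sizes, industries))  # {'food'}
```

Such counts identify candidates only; they do not control for location determinants such as natural advantages.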
Although it is beyond the scope of this paper, identifying in practice the set of industries that a given city could potentially attract requires controlling for the location determinants of those industries. For instance, while there are a large number of cities that specialize in seafood products manufacturing, these cities are typically located along the coast. Although seafood products manufacturing is one of the most ubiquitous industries and is likely to be found in many sufficiently large cities with larger markets, an inland city is unlikely to attract this industry.

VI. Concluding Remarks
This paper investigates the evolution of the city system in Japan between 1980 and 2010 with regard to population size and industrial structure. With regard to city-size distributions, the findings of this study are not particularly new. Power law properties at the country level are already widely recognized, together with city-size volatility (see, for example, Batty 2006). In the case of Japan, the development of a nationwide transport infrastructure appears to have had a certain influence, with the growth and decline of cities reflecting differences in the relative advantage in transport access among cities.
A novelty of this study is that it shows the persistent correspondence between the population size and industrial structure of each individual city in the form of the hierarchy principle under the constant churning of industries across cities. This, in turn, implies that the spatial coordination of agglomerations among industries may be playing a key role in the prevailing diversity of city sizes and their power law properties. While the frequent churning of industries among cities and of their establishments has been reported in different contexts (see, for example, Dumais, Ellison, and Glaeser 2002; Duranton 2007), this study is unique in that both the churning and coordination of industries are expressed in terms of the presence of agglomerations of individual industries in each city. Hence, these phenomena can both be considered to be the properties of industrial agglomerations. 20 While the fact that the majority of industries follow the hierarchy principle is an important constraint in designing place-based industrial policies, it can also help identify the most effective industries to promote in each city.
As for the power laws for city-size distributions, which hold together with the hierarchy principle, the central place models (Hsu 2012; Akamatsu, Mori, and Takayama 2016) place the responsibility on the underlying distribution of scale economies across industries. In fact, the distribution of substitution elasticities of imports in the United States is consistent with the power law for city-size distributions (Broda and Weinstein 2006), although the underlying mechanism that results in the observed distribution of substitution elasticities is an open question. 21 The mechanism behind the churning of industries across cities is another open question. The change in the number of agglomerations is significantly influenced by the sensitivity to transport costs, in keeping with the existing theories of agglomeration. Consistent with these theories, Mori, Mun, and Sakaguchi (2017) have shown that industries disperse more (the number of agglomerations increases) if they become more sensitive to transport costs, where the sensitivity to transport costs for each industry is measured by the shipment cost per unit distance per unit product value of this industry. They show that between 1995 and 2010, the log of sensitivity to transport costs ranges from −2 to 3, with an average of 0.01 and a standard deviation of 0.79. Thus, while no simple tendency is observed for the importance of transport costs for industries, their finding is consistent with the churning of industries across cities observed in this study. However, the sources of the variation in the sensitivity to transport costs are diverse and include changes in shipment technologies; the increasing dominance of internet communications; and changes in production technologies, exchange rates, and product cycles. Finally, the investigation of the causes of industry churning and of the distribution of the prevailing scale economies is left for future research.