Spatial Price Differences and Inequality in the People's Republic of China: Housing Market Evidence

The large literature on regional inequality in the People's Republic of China (PRC) is hampered by incomplete evidence on price dispersion across space, making it hard to distinguish real and nominal inequality. The two main methods used to calculate spatial deflators have been to price a national basket of goods and services across different regions in the country or else to estimate a food Engel curve and define the deflator as that needed for nominally similar households to have the same food budget shares in all regions. Neither approach is convincing with the data available. Moreover, a focus on tradable goods such as food may be misplaced because of the emerging literature on the rapid convergence of traded goods prices within the PRC that contrasts with earlier claims of fragmented internal markets. In a setting where traded goods prices converge rapidly, the main source of price dispersion across space should come from nontraded items, and especially from housing given the fixity of land. In this paper we use newly available data on dwelling sales in urban PRC to develop spatially-disaggregated indices of house prices which are then used as spatial deflators for both provinces and core urban districts. These new deflators complement existing approaches that have relied more on traded goods prices and are used to re-examine the evidence on the level of regional inequality. Around one-quarter of the apparent spatial inequality disappears once account is taken of cost-of-living differences.


I. Introduction
The large literature on regional inequality in the People's Republic of China (PRC) is hampered by the limited evidence on price dispersion across space, which makes it difficult to distinguish real inequality from nominal inequality. Like statistical agencies in most countries, the PRC's National Bureau of Statistics (NBS) does not publish a spatial price index that allows cost-of-living comparisons over space. Instead, the focus is on the temporal consumer price index (CPI), which is reported at both the national and the provincial level. There are also separate CPIs for rural and urban areas at both national and provincial levels. These indices allow rates of change in the consumer price level to be compared across different locations but do not allow comparisons of absolute price levels or of the cost of living between locations.
However, there are good reasons to suspect that price levels and the cost of living vary over space. A higher price level is expected in more productive, richer economies (Balassa 1964, Samuelson 1964). The same pattern likely holds within countries because productivity growth is typically stronger in the traded sector than in the nontraded sector. If wages in the traded sector rise with productivity while nontraded sector wages are pegged to those in the traded sector (both sectors compete for workers in the same labor market), then prices of nontraded items will grow faster than productivity and will rise in real terms. The overall price level is an average of traded and nontraded prices so that in the context of regions of the PRC, one can expect a higher overall price level in export-oriented, coastal provinces in which nominal income is higher, such as Guangdong, than in poorer, inland provinces such as Yunnan.
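A stylized one-factor version of this argument can be written out as follows. The notation is assumed here purely for illustration and is not taken from the paper: $a_T$ and $a_N$ are labor productivities in the traded and nontraded sectors, $w$ is the common wage, $\theta$ is the nontraded share of expenditure, and the overall price level is taken to be a geometric average.

```latex
% Stylized Balassa-Samuelson sketch (illustrative notation, not from the paper)
\begin{align*}
p_T &= \frac{w}{a_T}
  && \text{(traded price, equalized across regions by trade)} \\
w &= p_T\, a_T
  && \text{(common wage rises with traded-sector productivity)} \\
p_N &= \frac{w}{a_N} = p_T\,\frac{a_T}{a_N}
  && \text{(nontraded price)} \\
P &= p_T^{\,1-\theta}\, p_N^{\,\theta} = p_T \left(\frac{a_T}{a_N}\right)^{\theta}
  && \text{(overall price level, assumed geometric average)}
\end{align*}
```

Under these assumptions, two regions facing the same $p_T$ differ in $P$ only through $a_T/a_N$, so the region with the more productive traded sector has the higher overall price level.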
The implications of this pattern are worth emphasizing in the PRC where there is substantial debate about the impacts of economic reform on inequality. A common claim in the literature is that spatial inequality rose in the reform era, especially when policy neglected the rural sector (Fan, Kanbur, and Zhang 2011). This claim has fueled initiatives to help seemingly laggard regions catch up to seemingly advanced regions, including the West China Development Project (Lai 2002), the Northeast China Revitalization Campaign (Zhang 2008), and the Rise of Central China Plan (Lai 2007). Just a subset of these initiatives saw more than one trillion yuan ($180 billion) of state-led infrastructural investment directed to western regions of the country (Yao 2009). But without reliable measures of spatial price differences, it is not clear how much of the reported spatial inequality (and its claimed increase) is simply due to regional price variation and how much reflects differences in real incomes.
In this paper, we use newly available data on dwelling sales in urban PRC to develop spatially-disaggregated indices of house prices which are used as spatial deflators for provinces, urban prefectures, and urban core districts. Since we account for only one source of cost-of-living variation over space, the impacts on inequality that we find when using these deflators should be considered a conservative, lower bound. Our approach contrasts with the two main methods previously used to calculate spatial deflators in the PRC where either a national basket of goods and services has been priced in different regions or a food Engel curve has been estimated and a deflator derived as that which is needed for nominally similar households to have the same food budget shares in all regions. Neither approach is convincing with the data available in the PRC, as we explain below. Moreover, a focus on traded goods such as food may be misplaced because of the emerging literature on the rapid convergence of prices within the PRC that contrasts with earlier claims of fragmented internal markets.
It is increasingly reasonable to expect integrated goods markets in the PRC, and for goods prices to obey the law of one price (net of transport costs), but the same is not true of housing services. Because of the fixity of land supply, accounting for regional differences in housing service prices is fundamental to the calculation of spatial differences in the cost of living. While other services are also considered nontradable, the long-run supply of their dominant factor of production can spatially adjust to reduce interregional price differences. For example, if haircuts are relatively more expensive in urban areas of the Pearl River Delta, hairdressers might be expected to migrate to that region to increase the supply and reduce the regional price premium. There is no similar migration possibility for land: the presence of abundant land (relative to the population) in western regions, and consequently relatively low house prices there, can do nothing to moderate the high cost of housing in Beijing.
Our focus on housing costs as the main driver of spatial cost-of-living differences is supported by previous studies in other countries. According to Moulton (1995, p. 181): "the cost of shelter is the single most important component of inter-area differences in the cost-of-living." Similarly, Massari, Pittau, and Zelli (2010) find housing prices account for almost 70% of cost-of-living differences between northern and southern Italy. Our approach is perhaps most closely related to Jolliffe (2006) who examines how adjusting for cost-of-living differences between metropolitan and nonmetropolitan areas in the United States (US) causes a complete reversal of the poverty ranking of these areas. In order to measure poverty using spatially deflated data, Jolliffe (2006) uses the fair market rent (FMR) index which consists of just two components: housing expenses (with a weight of 0.44) and all other goods and services (weight of 0.56). This index assumes that cost-of-living variation over space reflects variation in housing prices only and that there is no variation over space in the prices of all other goods and services.¹
Although our results are most clearly relevant to scholars interested in the PRC, they also may have broader applicability. A growing international literature examines the impact of accounting for spatial price differences, especially those generated from urban housing markets, on apparent trends in nominal outcomes. For example, Moretti (2013) finds a more rapid rise in the cost of living experienced by college graduates compared to high school graduates accounts for one-quarter of the 1980-2000 increase in the nominal college premium in the US. This cost differential occurs because college graduates have increasingly congregated in urban areas with expensive housing (using monthly rent as a proxy for the user cost of housing).² Similarly, Albouy (2012) shows how accounting for the higher real cost of living in US urban areas with more expensive housing provides revealed-preference estimates of the quality of life that are more consistent with popular "livability" rankings and stated preferences. The same effect is present in the Russian Federation, where Berger, Blomquist, and Peter (2008) estimate housing value and wage equations to impute implicit prices for city amenities. In their study, house values and nominal incomes are correlated over space and the implied quality of life rankings generated by the housing and labor markets are consistent with observed internal migration flows. The importance of housing markets for measured inequality and quality of life is emphasized by this literature.
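To make the two-component structure of the FMR-style index used by Jolliffe (2006) concrete, the following minimal sketch combines a relative housing price with the stated weights of 0.44 and 0.56. The arithmetic-mean form, the function names, and the example numbers are assumptions made for illustration; they are not taken from Jolliffe (2006) or from this paper.

```python
# Two-component spatial deflator in the spirit of the FMR index:
# housing prices vary across areas, all other prices are assumed equal.
HOUSING_WEIGHT = 0.44  # weight on housing expenses, as reported in the text
OTHER_WEIGHT = 0.56    # weight on all other goods and services


def two_component_deflator(relative_housing_price: float) -> float:
    """Cost-of-living index for an area, with the reference area equal to 1.0.

    Assumes a weighted arithmetic mean (an illustrative choice).
    """
    return HOUSING_WEIGHT * relative_housing_price + OTHER_WEIGHT * 1.0


def real_income(nominal_income: float, relative_housing_price: float) -> float:
    """Deflate nominal income by the two-component index."""
    return nominal_income / two_component_deflator(relative_housing_price)


# Hypothetical example: housing is 50% dearer than in the reference area.
metro_deflator = two_component_deflator(1.5)
metro_real_income = real_income(50_000.0, 1.5)
```

Because only the housing component varies, a 50% housing premium translates into a cost-of-living premium of 0.44 times 50%, that is, 22%.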
The remainder of the paper is structured as follows. Section II reviews the literature in three areas that help to inform this study: spatial deflation studies, market integration studies, and housing market studies. Section III describes the data that we use to create housing-related spatial deflators for the PRC. One concern with using dwelling prices as an indicator of cost-of-living differences is that dwelling quality may vary systematically across space, so to address this issue we discuss, in section IV, the nature of real estate development in the PRC and provide some empirical evidence on the importance of location effects relative to dwelling characteristics in determining housing prices. Another concern is that dwelling prices may capture more than just the costs of shelter, hence we also contrast our approach with studies that rely on rental costs and describe recent trends in tenure patterns in urban areas of the PRC. The calculation of the deflators is described in section V and the results are contrasted with other spatial deflators for the PRC. The impact of using the deflators when measuring spatial inequality is discussed in section VI, while the conclusions are discussed in section VII.

II. Previous Literature
The approach we use here, of constructing spatially real income by deflating only for housing costs, relies on literature for the PRC that is drawn from three distinct areas: spatial deflation studies, market integration studies, and housing market studies. Our overall goal is to contribute to the literature on spatial inequality in the PRC by examining the impact of using various deflators on estimates of spatial inequality. We reviewed the spatial inequality literature in a recent study (Li and Gibson 2013), where the focus was on the misunderstanding that results from ignoring the fact that for most of the reform era, statistical authorities in the PRC denominated local GDP by the number of people with hukou household registration from each place rather than the number of people actually residing in each place (so that measured inequality mechanically increased as the number of non-hukou migrants rose). In the current study we use the adjustments to the population denominators created by Li and Gibson (2013) but otherwise do not address population issues and instead pay attention to the impact of adjusting for spatial cost-of-living differences.

A. Spatial Deflation Studies
The most widely used spatial deflators for the PRC appear to be those of Brandt and Holz (2006).³ The authors use provincial price data from 1990 to calculate the cost of national rural and urban expenditure baskets (containing 40-60 items) and a population-weighted combined basket. The prices had originally been collected by statistical authorities for the purpose of calculating a temporal index (the CPI) for each province, so that they do not necessarily refer to the same quality of items across provinces. Rural prices were not available for all products consumed in rural areas, so provincial capital city prices were instead used for items constituting just over 40% of the average rural household budget. Since there were no prices for nontraded services, average labor wages in township and village enterprises (TVE) were used as a proxy. Finally, the analysis lacked data on rent, land prices, or real estate prices, so construction costs per square meter of rural household buildings were used in their place, with the "quantity" of housing services in the basket set at 0.5625 square meters (m²), chosen to give an expenditure that was equivalent to nationwide per-capita rural household living expenditures on housing. Brandt and Holz (2006) then use the annual rate of change in the CPI for each province to extend the 1990 spatial deflators for each province back to 1984 and forward to 2004. This time series is also used by other researchers studying inequality (e.g., by Sicular et al. 2007, Li and Gibson 2013) since it allows easy updating by just using published data on the annual rate of change in each province's CPI. Despite this simplicity, using a temporal index to update a spatial index so as to create a panel of deflators can introduce bias.
An example of such bias comes from the Russian Federation: Gluschenko (2006) compares a spatial price index calculated for period $t$ using spatial prices for the same period, with an index for period $t$ that is extrapolated from a spatial price index for period $t_0$ using local CPIs to update prices from $t_0$ to $t$. The direct method gives a spatial price index for each province whose range is 44% of the national mean price level, but the indirect method gives a much wider range, of 72%.
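The indirect (extrapolation) method can be sketched as follows. This is a minimal illustration, assuming that the updated index is renormalized to an unweighted mean of one. The region labels and all numbers are invented and are not the data of Gluschenko (2006) or of Brandt and Holz (2006).

```python
def extrapolate_spatial_index(base_index, cpi_factor):
    """Update base-period spatial price levels with local CPI growth.

    base_index: region -> price level relative to the mean in the base period
    cpi_factor: region -> cumulative local CPI factor (1.30 means 30% inflation)
    Returns region -> extrapolated relative price level, renormalized so that
    the unweighted mean across regions is 1 (an assumed normalization).
    """
    raw = {r: base_index[r] * cpi_factor[r] for r in base_index}
    mean = sum(raw.values()) / len(raw)
    return {r: value / mean for r, value in raw.items()}


def index_range(index):
    """Gap between the most and least expensive region."""
    return max(index.values()) - min(index.values())


# Hypothetical base-period spatial index and cumulative local CPI factors.
base = {"A": 0.9, "B": 1.0, "C": 1.1}
cpi = {"A": 1.2, "B": 1.5, "C": 1.8}

extrapolated = extrapolate_spatial_index(base, cpi)
base_range = index_range(base)
extrapolated_range = index_range(extrapolated)
```

In this made-up case the extrapolated index has a wider range than the base-period index because measured inflation is highest where prices started highest. Whether that widening is genuine can only be checked against a directly computed spatial index for the later period, which is the comparison Gluschenko (2006) makes.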
The example from the Russian Federation shows that CPI-updated price levels may not adequately proxy for cross-spatial price levels. More generally, it may not be possible to construct panel price indexes that are unbiased across both space and time (Hill 2004). The problem is that bilateral index formulas, such as the Laspeyres index used by Brandt and Holz (2006), are unlikely to give transitive results when extended to a multilateral situation. For example, consider a price index calculated for three regions (Beijing, $P_B$; other urban areas, $P_U$; and rural areas, $P_R$) with base weights that differ in each region. A direct comparison between the rural price level in period $t_2$ and Beijing prices in period $t_0$ will not give the same result as constructing an indirect comparison via the third region in an intermediate time period, $t_1$. That is, $P_R^{t_2}/P_B^{t_0} \neq (P_U^{t_1}/P_B^{t_0}) \times (P_R^{t_2}/P_U^{t_1})$. Instead, transitivity requires use of a multilateral index method, such as the Geary-Khamis (GK) method that underlies the Penn World Table or EKS (Eltetö, Köves, and Szulc) type methods.⁴
Another issue with the deflator formed by Brandt and Holz is the use of a national basket rather than letting consumer responses to relative prices and other differences induce regional variation in the structure of consumption. While sensitivity to consumer responses is a claimed feature of the "no-price" Engel curve method described below (Gong and Meng 2008), it is not required that methods using disaggregated price data ignore variation in the structure of consumption. For example, Deaton and Dupriez (2011) use unit values from household surveys to calculate spatial price differences in two other large countries (Brazil and India) using multilateral Törnqvist indexes that are the geometric average of price relativities between each region and the base region, weighted by the arithmetic average of the budget shares for the two regions.
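Two of the index-number points in this passage can be illustrated with a short sketch: chained bilateral Laspeyres comparisons need not equal the direct comparison, and a Törnqvist index weights log price relatives by average budget shares. The three "regions" and their two-good prices and quantities are hypothetical, the time dimension is ignored for simplicity, and the Törnqvist function shown is the bilateral building block rather than the full multilateral procedure of Deaton and Dupriez (2011).

```python
import math


def laspeyres(p_base, q_base, p_other):
    """Price level of `other` relative to `base`, using base quantities."""
    cost_other = sum(po * q for po, q in zip(p_other, q_base))
    cost_base = sum(pb * q for pb, q in zip(p_base, q_base))
    return cost_other / cost_base


def budget_shares(p, q):
    """Expenditure share of each good."""
    total = sum(pi * qi for pi, qi in zip(p, q))
    return [pi * qi / total for pi, qi in zip(p, q)]


def tornqvist(p_base, q_base, p_other, q_other):
    """Geometric mean of price relatives, weighted by average budget shares."""
    s_base = budget_shares(p_base, q_base)
    s_other = budget_shares(p_other, q_other)
    log_index = sum(
        0.5 * (sb + so) * math.log(po / pb)
        for sb, so, pb, po in zip(s_base, s_other, p_base, p_other)
    )
    return math.exp(log_index)


# Hypothetical (prices, quantities) for two goods in three regions.
p_B, q_B = [4.0, 10.0], [5.0, 10.0]   # Beijing
p_U, q_U = [3.0, 6.0], [8.0, 6.0]     # other urban areas
p_R, q_R = [2.0, 3.0], [12.0, 3.0]    # rural areas

direct = laspeyres(p_B, q_B, p_R)                               # B -> R
chained = laspeyres(p_B, q_B, p_U) * laspeyres(p_U, q_U, p_R)   # B -> U -> R
tornqvist_B_to_R = tornqvist(p_B, q_B, p_R, q_R)
```

With these invented numbers `direct` and `chained` differ, which is the transitivity failure described in the text. The Törnqvist index uses the budget shares of both regions being compared, so regional differences in consumption patterns enter the comparison.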
Hence, variation in the structure of consumption, as captured in budget shares for each region, is accounted for by this type of spatial price index. The results for these two countries show a 20% range in average food prices between the cheapest and most expensive regions in India, while in Brazil there is almost no price gradient, reflecting the higher incomes in Brazil and hence greater importance of processed foods which likely have much smaller price margins between regions than do unprocessed foods.⁵
Gong and Meng (2008) use an Engel curve approach to estimate spatial price deflators for each province using data from the Urban Household Income and Expenditure Survey from 1986 to 2001.⁶ Their deflator is defined by what is needed for nominally similar households to have the same food budget shares in all regions, following an idea first proposed by Hamilton (2001) for measuring bias in a temporal CPI. These authors find implied regional cost-of-living differences from the Engel curve that are considerably larger than those calculated from pricing a fixed basket using either provincial average prices or household-level unit values. The difference from fixed basket results was most apparent during the mid- to late-1990s when social welfare reforms altered coverage and subsidies for public health, education, and housing. In terms of inequality, when no adjustment was made for spatial price differences, Gong and Meng (2008) find that the mid-1990s saw the most significant increase in regional income inequality, but after using the deflator derived from their Engel curve results, they find regional income inequality to actually increase the most in the late 1980s.
⁴ These methods compare each country (or region) with an artificially constructed average country (or region). Typically they use the Paasche price index formula to make each of these bilateral comparisons with the artificial country as the base and tend to suffer from substitution bias because the price vector of the base artificial country (region) is not equally representative of the prices faced by all of the countries (regions) in the comparison. EKS methods impose transitivity in the following way: first, they make bilateral comparisons between all possible pairs of countries and then take the nth root of the product of all possible Fisher indices between n countries. Deaton and Dupriez (2011, p. 4) note that multilateral price indexes required for spatial work are typically not consistent with the inflation rates in local CPIs and so need to be calculated regularly, rather than just once and then updated by the local CPIs.
⁵ Relatedly, supermarkets are more important in Brazil (and also in the PRC) than in India, and the growth in the importance of supermarkets assists with spatial convergence in food prices (Reardon et al. 2003).
⁶ In contrast to the later work of Almås and Johnsen (2012), Gong and Meng (2008) do not create a panel price index of time-space deflators, and instead the food Engel curves are estimated separately for each year.
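The logic of backing a deflator out of a food Engel curve can be sketched with synthetic data. The sketch assumes a simple linear specification in which the food share depends on log real income, so that the coefficient on a regional dummy equals minus the income slope times the log relative price level. This is an illustrative simplification and not the specification estimated by Gong and Meng (2008), and all names and numbers are invented.

```python
import numpy as np


def engel_deflators(food_share, log_income, region, base_region):
    """Back out regional price levels from a food Engel curve.

    Assumed model: share = a + b * (log_income - log P_r), so the coefficient
    on the dummy for region r equals -b * log(P_r / P_base).
    """
    region = np.asarray(region)
    others = [r for r in np.unique(region) if r != base_region]
    columns = [np.ones(len(region)), np.asarray(log_income, dtype=float)]
    columns += [(region == r).astype(float) for r in others]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, np.asarray(food_share, dtype=float), rcond=None)
    b = coef[1]  # slope of the food share with respect to log income
    deflators = {base_region: 1.0}
    for r, delta in zip(others, coef[2:]):
        deflators[str(r)] = float(np.exp(-delta / b))
    return deflators


# Synthetic, noise-free households: coastal prices are 30% above inland prices.
rng = np.random.default_rng(0)
true_price = {"inland": 1.0, "coastal": 1.3}
region = np.repeat(list(true_price), 200)
log_income = rng.normal(9.0, 0.5, size=region.size)
log_price = np.array([np.log(true_price[r]) for r in region])
food_share = 0.9 - 0.08 * (log_income - log_price)

estimated = engel_deflators(food_share, log_income, region, base_region="inland")
```

The recovery works here only because the synthetic food shares differ across regions for no reason other than prices. Any other regional difference that shifts food shares would be absorbed into the estimated deflator.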
Almås and Johnsen (2012) use a similar Engel curve approach with data from just 2 years (1995 and 2002) for rural areas in 19 (of 31) provinces and urban areas in 11 provinces. Rather than estimating a spatial cost-of-living index year by year, they attempt to make incomes comparable over both time and space using a single set of Engel curve estimates. Based on this procedure, these authors claim that the CPI understates price changes in rural areas and overstates them in urban areas: the deflator derived from the Engel curve suggests a 44% rise in the rural cost of living from 1995 to 2002 and zero change in the urban cost of living compared to CPI increases of 8% and 11%, respectively. The use of this Engel curve deflator closes the rural-urban gap in terms of price levels, with the rural cost of living rising from 60% of the urban level in 1995 to 87% of the urban level by 2002. Thus, the real income figures calculated with their deflator show a greater rise in inequality and a more modest fall in poverty than is implied by making no spatial adjustment and using the CPI for temporal deflation.
The studies that use a food Engel curve to back out regional differences in the cost of living (or more generally the bias in any spatial or temporal deflator) are one strand in a broad literature that relies on observable proxies for well-being to calculate implicit compensation for people living in different circumstances (such as family size and structure, or location). For example, Timmins (2006) uses internal migration data from Brazil under the logic that moves reveal preferences over locations that differ in terms of nominal incomes and the cost of living and can thereby reveal spatial differences in the cost of living. Lanjouw and Ravallion (1995) use child anthropometric indicators (stunting and wasting) in addition to food share to indicate well-being when anchoring their calculation of allowances for household size economies (effectively, the inverse of the compensation needed by people living in smaller households to be as well off as those in larger ones at the same per capita consumption). Subjective data on self-rated welfare can also be used. Krueger and Siskind (1998) and Gibson, Stillman, and Le (2008) use survey questions that compare feelings of being better-off in the present or the past to adjust for possible biases in the CPI, and the same method could be used to make spatial comparisons.
The problem with all of these approaches is that it is simply an assertion that the welfare indicator (whether food budget shares, anthropometrics, and so forth) does indeed identify people who are equally well off. At least since Nicholson (1976), a long literature has argued that food share is not a good indicator of well-being. Consider the example of using food share to calculate the exact amount of money needed for parents to maintain their consumption while providing for a child: since child consumption is concentrated more on food than is adult consumption, the food share would be higher even if exact compensation had been given, and this higher food share would wrongly indicate the need for further (over)compensation.
In the context of the food Engel curve estimates for the PRC, there is a substantial difference between provinces and between urban and rural areas in the proportion of household members who are children. The data from the latest wave of the China Health and Nutrition Survey (CHNS) show that children aged 0-15 years comprise just 3% of the average household in urban areas of Liaoning province but comprise 16% of the average rural household in Guangxi. Food shares will thus be higher in Guangxi even if there were no differences in the cost of living, but the Engel method will not necessarily recognize this.⁷ Consequently there are reasons to doubt the reliability of spatial deflators produced by this method.

B. Market Integration Studies
Many authors consider the PRC an example of a developing country with segmented markets and much less integration than developed countries (Gong and Meng 2008, Xu 2002). In the early reform period, this description may have been apt since economic interaction between provinces had been minimized during the planned economy era, making the PRC more like a cluster of independent economies rather than a large, spatially integrated economy.
But the surprising claim of some influential studies is that market integration declined even more during the reform period. According to Young (2000, p. 1128): "(T)wenty years of economic reform . . . resulted in a fragmented internal market with fiefdoms controlled by local officials whose economic and political ties to protected industry resemble those of the Latin American economies of past decades."
The claimed reason for the seemingly perverse fragmentation of the internal market while the PRC opened up internationally is that devolution of powers saw local government revenue linked to local industry protection, leading to interregional trade wars. Apparent confirmation comes from Poncet (2005) who examined "border effects" between provinces by comparing volumes of intraprovincial and interprovincial trade. The trade-reducing impact of provincial borders appeared to increase between 1992 and 1997, from which the study concluded that the domestic economy was fragmented and that "rather than a single market, (the PRC) appears as a collection of separate regional economies protected by barriers" (Poncet 2005, p. 426).
⁷ Adding demographic variables to the Engel curve regression may not help since there is no reason for these effects to operate as just intercept-shifters. The literature using food Engel curves to study bias in temporal deflators is more credible since it typically restricts attention to a particular household type (say, two adults with two children). The change in household structure over a decade or so is much less than the differences over space, yet all of the regional differences are rolled into a catch-all term that is assumed to be due to just cost-of-living differences.
A critical reappraisal shows that the evidence from Young (2000) is not robust and that the PRC is comparable to the US in terms of being a relatively integrated, large economy (Holz 2009). For example, Young showed a rise in the (natural logarithm of the) interprovincial standard deviation of (the natural log of) prices of various consumer and agricultural goods, which was taken as evidence of trade barriers segmenting markets. But this calculation was neither robust to inflation nor to the growth in product variety in the reform period. Once Holz (2009) accounts for these factors there is no trend in interprovincial price dispersion, and the range of variation matches that in intercity data for products in the US. Similarly, Young found a convergence in the output structure of each province during the reform period, taken as evidence of provinces duplicating each other's industries rather than allowing regional specialization. The degree of convergence in the composition of value-added across US states in the same period was approximately the same as for provinces in the PRC, but there were no claims of rising interstate trade barriers in the US at that time.
In keeping with the reappraisal by Holz (2009), a number of more recent studies find the PRC to be a relatively well integrated market economy. Fan and Wei (2006) apply panel unit root tests to data on monthly prices for a group of 93 industrial products, agricultural goods, other consumer goods, and services in 36 major PRC cities, finding that prices do converge to the law of one price. Similarly, Ma, Oxley, and Gibson (2009) use spot energy prices in 35 major cities to test for convergence, with their panel unit root tests indicating that the energy market is integrated in the PRC. Huang, Rozelle, and Chang (2004) examine prices for rice, maize, and soybeans from almost 50 locations in 15 provinces on the eve of the PRC's accession to the World Trade Organization. These authors find most market pairs to be integrated (and this integration to extend down to village level) and market integration to be substantially higher than even 5 years earlier.⁸ A longer term perspective on grain prices found that on the eve of the industrial revolution, market integration in the PRC was as high as it was in most of the advanced areas of western Europe (Keller and Shiue 2007a), while contemporary markets are even more integrated. Keller and Shiue (2007b, p. 107) conclude that for the PRC "in the late twentieth century local and national prices essentially move one-to-one." Thus, it is mainly the central planning era that deviated from the pattern of the PRC being a normal, relatively integrated, large economy.
Another way to examine market integration is to test how long it takes prices to converge following idiosyncratic shocks. For example, Parsley and Wei (1996) find convergence rates to purchasing power parity of 5 quarters for tradable goods and 15 quarters for services, for a sample of 48 cities in the US. When the comparable approach is used in the PRC, convergence rates appear to be much faster. Lan and Sylwester (2010) study the prices of 44 products in 36 PRC cities and estimate the half-life of divergences from the law of one price averages just 2.4 months. This is approximately twice the speed of adjustment found in the US, leading these authors to conclude: "(O)ur findings suggest that prices within [the People's Republic of] China converge to relative parity extremely quickly" (Lan and Sylwester 2010, p. 231).
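The half-life measure used in these convergence studies can be related to a persistence parameter with a short sketch. It assumes, for illustration only, that deviations from the law of one price follow a first-order autoregressive process with coefficient `rho`. The specifications actually estimated by Parsley and Wei (1996) and Lan and Sylwester (2010) may differ, and the function names are hypothetical.

```python
import math


def half_life(rho: float) -> float:
    """Periods needed for half of a price deviation to disappear.

    Assumes deviations follow d_t = rho * d_{t-1} with 0 < rho < 1.
    """
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(rho)


def implied_rho(half_life_periods: float) -> float:
    """Persistence parameter consistent with a given half-life."""
    return 0.5 ** (1.0 / half_life_periods)


# Monthly persistence implied by the 2.4-month half-life quoted in the text.
rho_monthly = implied_rho(2.4)
```

Under this assumption, a half-life of 2.4 months corresponds to roughly three-quarters of a deviation surviving from one month to the next.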
A recent review of product, labor, and capital market integration in the PRC summarizes the evidence as showing: "(P)roduct markets became more integrated over time, as regional trade increased and product prices were increasingly similar throughout the country" (Chen et al. 2011, p. 73). Given this similarity over space of the prices of tradable goods, the focus of many of the previous spatial deflation studies summarized above may be misplaced. In an environment where traded goods prices converge rapidly, the main source of price dispersion across space should come from the nontraded components of consumption and especially from housing, given the fixity of land. We therefore briefly review the literature on spatial variation in house prices before turning to the data that we use to develop housing-related deflators.

C. Housing Market Studies
In the planned economy era, government agencies such as work units provided all urban housing. Rents were low and the dwelling one was allocated depended on administrative criteria such as job rank (Bian et al. 1997). Housing reform was launched in 1988 with privatization and creation of an urban housing market as the aim (State Council 1988). Thereafter, commodity houses built by private developers could be bought on the housing market (Huang and Clark 2002). For the first decade of reform, a dual track system developed with large numbers of commodity houses bought by work units and then distributed to workers at discounted prices (Huang 2003). In 1998, the State Council abolished the old housing system completely, and thereafter any provision of subsidized housing by work units was strictly banned (State Council 1998, Huang 2003). Since then, the urban housing system has become totally market oriented. In contrast to the urban sector, rural houses were self-funded, self-built, and self-renovated by residents, and remain so until now (Liu 2010). The right to use rural residential land (nongcun zhaijidi shiyongquan) is evenly distributed and free of charge for village collective members. Land is collectively owned by the village and the occupant is not allowed to mortgage or trade the land, although transfers within the village collective community are permitted. The occupant may build new houses or renovate old houses with their own funds for all kinds of needs such as marriage, tourism (nongjiale, akin to a motel, for urban tourists to taste rural life), family workshop, and handicraft production (Liu 2010). Thus, the rural housing system enables rural residents to satisfy their housing needs at much lower cost than is incurred by urban residents in the current era.
Though rural self-built houses are generally large and cheap, they are poor in quality relative to urban housing in terms of housing attributes such as the energy source for cooking, bath facilities, and individual toilets (Logan, Fang, and Zhang 2009).
The reforms have led to a large literature on urban housing in the PRC, with early studies on determinants of home ownership (Huang 2003, Pan 2004). But after the full marketization of urban housing in 1998, the focus shifted to affordability due to the sharp increases in house prices. For example, the Shanghai Housing Price Index (SHHPI) of the China Real Estate Index System (CREIS) rose by 63% within 2 years from January 2001 (Hui and Shen 2006). Liu, Reed, and Wu (2008) document poor housing affordability in Beijing during the 2000s using the house price to income ratio (PIR) and the home affordability index (HAI). The PIR is defined as the ratio of the average market value of a typical dwelling to the average annual household income and the HAI measures the ability of a household with an average income to pay back a mortgage on a typical home. In a more comprehensive study, Xiang and Long (2007) calculate PIR and HAI indices for 34 major cities and find Beijing, Shanghai, Shenyang, Xiamen, and Haikou to have poor housing affordability, while the inland cities of Hohhot, Changsha, Chongqing, and Urumqi have relatively better housing affordability.
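The two affordability measures can be sketched as follows. The PIR follows the definition given above. The text does not give a formula for the HAI, so the version here is an assumption: it uses one common construction in which income is compared with the income needed for a level mortgage payment to stay within a fixed share of income. The interest rate, loan term, down payment share, payment share, and function names are all hypothetical and are not taken from Liu, Reed, and Wu (2008) or Xiang and Long (2007).

```python
def price_to_income_ratio(dwelling_value: float, annual_income: float) -> float:
    """PIR: market value of a typical dwelling over annual household income."""
    return dwelling_value / annual_income


def annual_mortgage_payment(principal: float, annual_rate: float, years: int) -> float:
    """Level annual payment on a fully amortizing loan."""
    if annual_rate == 0:
        return principal / years
    return principal * annual_rate / (1 - (1 + annual_rate) ** -years)


def affordability_index(dwelling_value: float, annual_income: float,
                        annual_rate: float = 0.06, years: int = 20,
                        down_payment_share: float = 0.3,
                        max_payment_share: float = 0.25) -> float:
    """Assumed HAI: 100 * income / income needed to service the mortgage.

    All default parameter values are hypothetical placeholders.
    """
    loan = dwelling_value * (1 - down_payment_share)
    payment = annual_mortgage_payment(loan, annual_rate, years)
    qualifying_income = payment / max_payment_share
    return 100 * annual_income / qualifying_income


# Hypothetical city: typical dwelling worth 1,000,000 yuan, income 100,000 yuan.
pir = price_to_income_ratio(1_000_000, 100_000)
hai = affordability_index(1_000_000, 100_000)
```

Under this construction a value below 100 means that the average household cannot service the assumed mortgage within the assumed payment share, so a high PIR and a low HAI both point to poor affordability.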
In addition to affordability, the other focus of recent literature on the urban housing market is price determination. Zhang and Tian (2010) study sales of new dwellings in 35 major cities between 1995 and 2006, finding stable long-run intercity price relativities, which implies that the urban housing market in the PRC is segmented and that specific local economic characteristics matter. Deng, Gyourko, and Wu (2012) examine land auctions for 35 major cities from 2003 to 2011 to construct a model of land supply and also for use in a hedonic model of dwelling prices, finding that house prices are driven by the land market rather than by construction costs. Zheng, Kahn, and Liu (2009) estimate a hedonic house price regression for 35 major cities and find significant location effects in determining prices. Wu, Deng, and Liu (2012) use a similar model but examine the role of intracity locational factors (e.g., distance to city center). Overall, this research indicates the importance of location in determining dwelling prices in urban PRC, with the most plausible source of inter-area variation coming from land prices.

III. Data
For our main analysis, we use administrative data on the average selling price for new residential dwellings that real estate developers are required to report to the NBS. Specifically, every transaction for new housing sales is meant to be reported (both monthly and annually, directly to the NBS through an electronic portal). These are the most commonly used data for studies of the PRC urban housing market (Zheng, Kahn, and Liu 2009). Since most of the housing market is new construction rather than repeat sales (Deng, Gyourko, and Wu 2012), an index derived from prices of new units is broadly representative. The average selling price is given for each province in the China Real Estate Statistics Yearbook (NBS 2011a), while for urban prefectures the statistics are found in the China Statistical Yearbook for Regional Economy (NBS 2011c). For urban core districts (which are more consistently urban than the prefecture they belong to), the numbers are reported for 2009 (but not 2010) in the China Urban Life and Price Yearbook (NBS 2010). We obtain data on average GDP for every province, every urban prefecture, and every urban core district from the China Statistical Yearbook for Regional Economy (NBS 2011c) and the China City Statistical Yearbook (NBS 2011b). These same two sources provide information on the value of total urban real estate investments on residential assets (IRA). The data on the resident population, which are needed for correct calculation of per capita values (rather than using the misleading registered population), come from the population census.
In addition to these data provided by the NBS, we gathered our own data on sales prices and attributes of new apartment units from www.Soufun.com, which is the largest real estate listing site in the PRC. In conjunction with the CREIS, Soufun.com co-publishes the China Real Estate Statistical Yearbook. For the primary data collection, we only considered the dominant type of urban residence, which is a private apartment in a complex.
We did not consider subsidized public rental housing, economically affordable housing, and high-grade apartments and villas, which are just minor components of the urban housing system. According to the China Real Estate Yearbook 2011, of 8.82 million new urban housing units sold in 2010, just 2.5% were high-grade apartments or villas and 3.7% were economically affordable housing. The other 94% were standard private apartments, and so our primary data collection concentrated on this dominant form of urban housing.

IV. The PRC Urban Housing Market and Price Determinants
If dwelling quality varies systematically over space, then it may interfere with using published average new dwelling selling prices as an indicator of standardized housing costs for urban areas. However, real estate development in the PRC is organized such that systematic quality differences between cities are unlikely, since many apartment complexes in different cities are developed by the same national-level real estate development companies (sometimes even using the same names for their complexes in each city). While each complex may have dozens of multistory towers, each containing more than 50 individual housing units, within a complex there are only a few (typically fewer than 10) floor plans available and the selling price in terms of yuan per square meter varies little across the individual units. But there is considerable variation in selling price between complexes in different areas, including between different districts of the same city. For example, Beijing has 16 city districts, and complexes in different Beijing districts may have prices that vary by up to CNY30,000 ($4,800) per square meter. This variation is consistent with the finding of Deng, Gyourko, and Wu (2012) that variation in new dwelling prices is driven by the land market.
In order to verify if dwelling quality varies systematically over space, we gathered data in February 2013 on sales prices for 150 new apartments in three cities. Each city is from a different level of the administrative hierarchy: (i) Beijing is a municipality-level city with an equivalent status to a province; (ii) Nanjing is the capital of Jiangsu province and is one of 15 subprovincial cities, which have much greater autonomy and higher status than prefecture-level cities; while (iii) Changsha is a prefecture-level city and the capital of Hunan province. The data collection was restricted to these three cities because advertisements from most of the 323 cities in Soufun.com lack data on key attributes (both unit and complex characteristics). The majority of advertisements list only the average selling price of all units in a complex, but for the three selected cities, the unique price (per square meter) for every apartment in a complex is consistently listed. Furthermore, the advertisements always list the complex opening date, completion date, and the proportion of units sold to date (the sales ratio) only for these three cities, while for other cities these data are missing. Previous research has found that these factors play a significant role in determining new apartment prices because they represent changing pricing behavior of the real estate developer at different stages to completion of an apartment complex (Wu, Deng, and Liu 2012). We sampled prices from 3 to 5 complexes for each of the 13 districts of Beijing, 5 to 8 complexes from each of the nine districts of Nanjing, and 5 to 12 complexes from each of the five districts of Changsha.
The data used for the hedonic apartment price regression are described in Appendix A. For some characteristics, apartments in Nanjing and Changsha appear to have more desirable qualities than those in Beijing, with more green space and a higher proportion of the complex area being green space (despite the complexes in Nanjing and Changsha rising higher, on average, than those in Beijing). Also, the listings for Changsha are for slightly newer complexes than for Beijing, as seen from the fewer months elapsed since the complex was opened for sale and the greater number of months to completion of the complex. On the other hand, the apartments in Beijing in the sample are larger than those in Changsha, which is likely to be a desirable characteristic showing up in higher prices even when we concentrate on the price per square meter. The apartment complexes from Beijing also have a higher car park ratio (the number of car parks per dwelling). Note that these car parks are rented or sold separately, while most observations for Nanjing and Changsha leave this attribute blank so it is unclear if car parking is bundled with the price of the apartment in those cities. Overall, there is no clear sign that Beijing apartments have better quality relative to those in the other two cities. For example, the new trend in the real estate market in urban PRC of developers selling decorated new houses rather than unfinished ones is just as apparent in all three cities.
The results of the hedonic house price regressions are shown in Table 1. The dependent variable is the logarithm of the price (in thousands of yuan) per square meter so that the relative difference in prices is not directly shown by the regression coefficients on the dummy variables for each city. Instead, the coefficients must be transformed into percentage differences using percentage difference = $(e^{\beta} - 1) \times 100$, which shows that the price per square meter is 84% higher in Nanjing than in Changsha, and 256% higher in Beijing without controlling for any attributes of the apartment (first column of Table 1). The results in the second column of the table use the attributes of each apartment but do not consider the location. Despite having 15 characteristics that are potentially related to selling prices, these explain slightly less of the variation in prices than just using location dummy variables.
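The transformation from a log-price coefficient to a percentage difference can be checked with a short sketch. The coefficient values below are backed out from the percentage differences quoted in the text (84% and 256%), not read from Table 1, and the function names are illustrative.

```python
import math


def pct_difference(beta: float) -> float:
    """Percentage price difference implied by a dummy coefficient in a log-price regression."""
    return (math.exp(beta) - 1.0) * 100.0


def log_coefficient(pct: float) -> float:
    """Inverse transformation: the coefficient implied by a percentage difference."""
    return math.log(1.0 + pct / 100.0)


# Coefficients implied by the unconditional premiums over Changsha quoted in the text.
beta_nanjing = log_coefficient(84.0)
beta_beijing = log_coefficient(256.0)

for city, beta in [("Nanjing", beta_nanjing), ("Beijing", beta_beijing)]:
    naive = beta * 100.0  # reading the coefficient directly as a percentage
    print(f"{city}: beta = {beta:.3f}, naive = {naive:.0f}%, exact = {pct_difference(beta):.0f}%")
```

The gap between the naive and exact readings grows with the size of the coefficient, which is why the transformation matters for premiums as large as those between these cities.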
When the apartment characteristics and the location dummy variables are used together, the hedonic regression explains 84% of price variation (third column of Table 1), and after controlling for all of the characteristics of the particular apartment and its complex, the relative price differences are fairly similar to what they were without the controls. Specifically, the (conditional) price per square meter is 105% higher in Nanjing than in Changsha and 229% higher in Beijing. While the price premium is slightly smaller for Beijing than when using the raw data, it is somewhat larger for Nanjing and this reflects the fact that, at least for these three cities, there is no systematic quality gradient whereby apartments in cities with higher priced real estate have more desirable attributes of either the unit or the apartment complex. In the absence of the sort of apartment-specific data that we used in the regression, we proceed to use raw data on average selling prices for all cities and we treat the spatial variation in these raw prices as mainly reflecting the fixity of land supply rather than systematic variation in dwelling quality.
Table 1 (excerpt): t statistics (11.04)*** and (5.48)***; R-squared 0.61, 0.59, and 0.84 for the three columns.
* = significant at 10%, ** = significant at 5%, *** = significant at 1%, m2 = square meter.
Note: Absolute value of t statistics in parentheses for regressions where N = 150. The omitted location is Changsha.
Source: Authors' computations from data in housing sample collected by authors in February 2013 from www.Soufun.com.
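The three-column comparison described above (location only, attributes only, both) can be sketched on synthetic data. Everything in the example is an assumption made for illustration: the data are simulated rather than the Soufun.com sample, only two hypothetical attributes stand in for the 15 characteristics, and plain least squares via NumPy is used rather than the exact specification of Table 1.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic sample: 50 listings in each of three cities (all values are made up).
cities = np.repeat(["Changsha", "Nanjing", "Beijing"], 50)
is_nanjing = (cities == "Nanjing").astype(float)
is_beijing = (cities == "Beijing").astype(float)
green_ratio = rng.uniform(0.2, 0.5, size=cities.size)    # hypothetical complex attribute
floor_area = rng.uniform(60.0, 140.0, size=cities.size)  # hypothetical unit attribute (m2)

# Assumed data-generating process for log price (thousand yuan per m2).
log_price = (1.5 + 0.7 * is_nanjing + 1.2 * is_beijing
             + 0.5 * green_ratio + 0.001 * floor_area
             + rng.normal(0.0, 0.1, size=cities.size))


def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, R-squared)."""
    X = np.column_stack([np.ones_like(y)] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, r2


coef_loc, r2_loc = ols([is_nanjing, is_beijing], log_price)            # location only
coef_att, r2_att = ols([green_ratio, floor_area], log_price)           # attributes only
coef_all, r2_all = ols([is_nanjing, is_beijing, green_ratio, floor_area], log_price)

# Convert the conditional Beijing coefficient into a percentage premium over Changsha.
premium_beijing = (np.exp(coef_all[2]) - 1.0) * 100.0
print(f"R2: location {r2_loc:.2f}, attributes {r2_att:.2f}, both {r2_all:.2f}")
print(f"Conditional Beijing premium: {premium_beijing:.0f}%")
```

If the city coefficients change little between the first and third fits, as in the comparison reported in the text, quality differences are not what drives the raw price gaps.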

A. Rental Equivalence Approach
Before turning to the evidence on average selling prices of new dwellings, we discuss an alternative approach to forming standardized housing costs: the rental equivalence method. In some Organisation for Economic Co-operation and Development (OECD) countries, temporal price indices for the services provided by owner-occupied dwellings are based on the imputed value of shelter for owners that are calculated as equivalent to what they forgo by not renting out their homes. In the case of the CPI for the US, this measure was adopted in 1983 in place of the previous measure based on house prices, since it was argued that prices did not accurately reflect the costs of shelter since they also include the use of a house as an asset. There is no guarantee that the rental equivalence method produces lower costs than do price methods, and indeed in the US between 1983 and 2007, the monthly principal and interest payment needed to purchase a median-priced existing home increased by only one-half as much as the increase in shelter prices indicated by the rental equivalence method. But to maintain consistency with temporal deflators, many spatial cost-of-living studies in the US rely on rents rather than on house sales prices (e.g., Moretti 2013).
Despite the arguments for the rental equivalence approach, three reasons lie behind our decision to use the selling prices of new dwellings. First, we note that in some OECD countries (e.g., New Zealand) the price of housing services for owner-occupiers in the CPI is based on new housing sales, with the value of the net increase in the stock of owner-occupied housing during the reference period providing the expenditure weights. These components reflect the change in the price of housing acquired by the owner-occupier segment of the household sector, which is analogous to the approach that we use below and is particularly applicable to the situation in urban areas of the PRC since so much of the market is supplied by new housing rather than resale of existing dwellings. Second, observed rents in urban areas of the PRC may not be an appropriate basis for pricing the rental equivalence of owner-occupied dwellings because of the low share of rented dwellings (Ahmad 2008). Finally, in contrast to the situation for prices of new dwellings, there are no comprehensive statistics on rents reported by the NBS on a spatially disaggregated basis so that any attempt to implement the rental equivalence approach would be limited in scope and so could not inform national-level estimates of inequality. Moreover, when considering variation in house prices and rents over space, the same fundamental driver (land prices) affects both, whereas for the temporal variation studied by much of the literature, factors such as interest rates may create a wedge between house prices and rents.
Table 2 note: "Highly urbanized" refers to counties or districts where more than 70%, but less than 90%, of the population are urban residents. "Very highly urbanized" refers to counties or districts where 90% or more of the population are urban residents. There are 583 counties or districts in the 2000 census in these categories and 596 in the 2010 census. The column total number of households includes "other tenure types" which are not reported in the table.

B. Trends in Urban Tenure
To help put our choice of using selling prices rather than rents in context, we describe here the trends in urban tenure based on data from the "long form" population census (answered by 10% of the population, which we gross up to total population counts). The available data are reported at county or district level so we categorize according to the urban population as a percentage of the county or district population and restrict attention to the most urbanized counties and districts (being 70% or more urban), distinguishing "highly urbanized" with 70%-90% urban from "very highly urbanized" with ≥ 90% urbanized. In total, there were 71 million urban households in 2000 and 103 million in 2010 under these definitions.
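The two data-preparation steps described above, grossing up the 10% long-form sample and classifying counties or districts by urban share, can be sketched as follows. The function names are illustrative, and the treatment of a share of exactly 70% follows the wording of this paragraph ("70% or more"), which is an assumption where the table note reads "more than 70%".

```python
from typing import Optional


def gross_up(sample_count: float, sampling_fraction: float = 0.10) -> float:
    """Scale a count from the long-form census sample to a population total."""
    return sample_count / sampling_fraction


def urban_category(urban_population: float, total_population: float) -> Optional[str]:
    """Classify a county or district by the urban share of its population."""
    share = urban_population / total_population
    if share >= 0.90:
        return "very highly urbanized"
    if share >= 0.70:  # boundary handling is an assumption (see lead-in)
        return "highly urbanized"
    return None  # excluded from the tenure analysis


# Hypothetical district: 850,000 urban residents out of 1,000,000.
print(urban_category(850_000, 1_000_000))
# A long-form sample of 7.1 million households grosses up to about 71 million.
print(round(gross_up(7_100_000)))
```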
The first trend is that the share of urban households living in self-built accommodation has fallen considerably, from over one-quarter of the total in 2000 to just one-sixth by 2010 (Table 2). This trend is most apparent in very highly urbanized areas, where households in self-built dwellings declined by 3 million over 10 years and are now under 10% of the total (down from 19% in 2000). This pattern most likely stems from rising land values; for example, Wu, Gyourko, and Deng (2012) calculate that real, constant quality land values in Beijing rose by 800% from 2003 to 2010. Under such land price pressures, self-built dwellings are likely to be undercapitalized in the sense of being too small and having inadequate facilities relative to a new dwelling that would be appropriate for such land values. The flip side of the falling share of self-built dwellings is a rising share of purchased dwellings, which are the majority form of tenure (Table 2). Moreover, the rate of new construction, of approximately 8 million new standard private apartments each year, is equivalent to about one-sixth of the existing stock of purchased urban dwellings. The preponderance of new stock in the owner-occupied portfolio means that our focus on the price of new apartments is appropriate.
The final tenure category in Table 2 is renters, who have also seen a rise in numbers, although only half as large as the increase in the number of purchasers. However, what is not shown in Table 2 is that the rental sector in urban areas of the PRC is quite different from the owner-occupied sector, mainly housing poor rural-urban migrant workers (Wu 2012) and youth (Zhu 2013, Ouyang 2011) in dwellings that are older and of lower quality than the dwellings that are being purchased. For example, we gathered data on apartment rentals in Beijing from Soufun.com and found the listed dwellings to have an average age of 10 years, which is much older than the dwellings that were for sale.

V. Housing Prices and Estimated Deflators
Since there is tentative evidence that purchased new apartment quality does not vary systematically between cities, we go ahead and use data from the China Real Estate Statistics Yearbook (NBS 2011a), China Statistical Yearbook for Regional Economy (NBS 2011c), and the China Urban Life and Price Yearbook (NBS 2010) on the average selling price in 2010 (provinces and urban prefectures) and 2009 (urban core districts) of new residential dwellings. We note that these data are for the urban sector, and our expectation is that these prices vary over space most especially because of intercity land price variation. For this reason we do not consider rural housing since rural residential land use rights are not determined by market forces and also because the data available for rural households are just the construction costs (building materials) which we consider to be traded goods and therefore less likely to vary over space than do urban house prices. The distinction between the urban and rural housing sectors is clearly seen in the way that the statistical system reports the relevant data: rural household expenditure on new dwelling construction is defined as consumption expenditure in the China Rural Statistical Yearbook 2011 (NBS 2011d) while urban household expenditure on house purchases is defined as a separate category apart from consumption expenditure in the China Urban Life and Price Yearbook 2010 (NBS 2010).
The average prices for new urban housing in 2010 are displayed in Figure 1, at provincial scale. The highest prices are found in Beijing (CNY17,150 per square meter) and Shanghai (CNY14,290 per square meter). Prices in the next highest category (CNY7,001-CNY9,400 per square meter) are only about one-half of those in Beijing, and are found in Tianjin, Zhejiang, Guangdong, and Hainan. In general, the highest prices are found in a continuous belt of provinces along the coast between Jiangsu and Hainan and in the Gulf of Bohai. All of the remaining provinces fall into the lowest price category, which includes all interior provinces plus the coastal province of Shandong.
There is considerable heterogeneity within provinces since many of them are as large and as populous as independent countries. Therefore, Figure 2 provides a finer-scale view of urban house prices, reporting the average value in 2009 for each of the 288 core urban districts. These core districts lie within prefecture-level and subprovincial cities, but are more consistently urban than the full area of the prefecture, which often includes rural counties. In order to concentrate on the region where most core urban districts are located, the map truncates Xinjiang, Tibet, and Qinghai in western PRC. This region contains only two urban districts-Karamay and Urumqi (both in Xinjiang). It is apparent that there are a number of cities in interior provinces such as Chengdu, Harbin, Ji'nan, Taiyuan, and Wuhan with much higher prices than revealed by the provincial average. Pu'er in Yunnan even falls into the highest price category shared by cities such as Guangzhou, Hangzhou, and Shenzhen, in addition to Beijing and Shanghai. Conversely, it is also apparent that there are cities in the coastal provinces with much lower prices than some cities in the interior. Consequently, the variation in the cost of living will be more accurately portrayed at subprovincial levels.
In order to measure cost-of-living differences over space, we calculate a Törnqvist price index for each province (and also for each urban prefecture and urban core district):
$$\ln T_{ik} = \sum_{j=1}^{J} \frac{s_{ij} + s_{kj}}{2} \, \ln\left(\frac{P_{kj}}{P_{ij}}\right)$$
where $s_{ij}$ is the average share that item $j$ has in consumption in region $i$, and $s_{kj}$ is the average budget share in region $k$, which is the base region, while $P_{ij}$ and $P_{kj}$ are the prices of item $j$ in region $i$ and in the base region. The Törnqvist index uses the arithmetic average of the budget shares in the base region and in region $i$ to weight the logarithm of the price relativities between those two regions. These weighted price relativities are then summed over all $J$ items that comprise the budget. Our working assumption is that only house price variation contributes to cost-of-living differences, so as to form a lower bound for the impact of deflation on spatial inequality. Since it is assumed that prices do not vary spatially for all other components of the budget, the index formula reduces to the log house price relativity between Beijing (base region) and region $i$, weighted by the average importance of housing in Beijing and region $i$. There are no micro data on household budget shares on housing that can be disaggregated to subprovincial levels so we instead use national and regional accounts data. Spatially disaggregated annual investments in urban residential assets are published by the NBS, and since the urban housing market is dominated by new housing stock rather than repeat sales (Deng, Gyourko, and Wu 2012), this annual investment should be a good proxy for the component of regional income set aside for housing provision. However, one further adjustment is needed because of the famously low share of final consumption in GDP for the PRC, which varies across provinces because of differing intensities of net exports.
We therefore use the ratio of annual investments in urban residential assets to final consumption expenditure as our proxy for the budget shares in the Törnqvist formula. Table 3 contains the provincial Törnqvist indexes calculated under these assumptions along with the input data used. The base region is Beijing and the index values are interpreted as the factor by which nominal GDP per capita in region i has to be multiplied to translate it into Beijing prices. On average, GDP per capita in provinces outside of Beijing has to be raised by 30% to make it comparable to GDP per capita at Beijing prices. The deflator ranges from 1.03 for Shanghai, whose residents face housing prices almost as high as in Beijing, to 1.42 for Chongqing and 1.43 for Liaoning. It is notable that the lowest average housing prices do not always give the lowest calculated price index because the importance of housing also matters. For example, house prices are low in Gansu but the inflation factor is lower than average because of the relatively low importance of provision for residential housing in regional income.
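The housing-only version of the index can be sketched in a few lines. The house prices for Beijing and Shanghai are the provincial averages quoted earlier in this section; the budget shares are hypothetical placeholders rather than the values in Table 3, and the direction of the price ratio (base over region) is an assumption chosen so that the result matches the stated interpretation as a factor that scales nominal values up to Beijing prices.

```python
import math


def housing_only_factor(price_region: float, price_base: float,
                        share_region: float, share_base: float) -> float:
    """Törnqvist factor when housing is the only item whose price varies over space.

    The log price relativity is weighted by the arithmetic average of the
    housing budget shares in the region and in the base region.
    """
    weight = 0.5 * (share_region + share_base)
    log_index = weight * math.log(price_base / price_region)
    return math.exp(log_index)


beijing_price = 17_150.0   # CNY per square meter, from the text
shanghai_price = 14_290.0  # CNY per square meter, from the text
assumed_share = 0.16       # hypothetical housing share, NOT taken from Table 3

factor = housing_only_factor(shanghai_price, beijing_price, assumed_share, assumed_share)
print(f"Factor to convert Shanghai values into Beijing prices: {factor:.3f}")
```

Because the weight is an average budget share, a region with low house prices but a small housing share can still end up with a modest factor, which is the pattern noted above for Gansu.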
The last column of Table 3 reports the deflator from Brandt and Holz (2006) using the national basket, which is updated to 2010 using movements in each province's CPI. The Brandt and Holz deflator is more variable than the Törnqvist index, with an unweighted coefficient of variation across provinces more than one-third higher than for the Törnqvist index. This pattern is consistent with Gluschenko (2006), who found that calculating a spatial deflator just once and updating it with the local CPIs can overstate the spatial variation in prices. Nevertheless, the overall level of adjustment needed to put GDP outside of Beijing into Beijing prices is quite similar, with an average inflation factor of 32%. The cross-province patterns of the deflators are also quite similar, with a Pearson correlation coefficient of 0.71 and a rank-correlation of 0.63.
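The three comparisons made above (unweighted coefficient of variation, Pearson correlation, and rank correlation) can be sketched as follows. The two short deflator vectors are hypothetical and stand in for the provincial columns of Table 3; the rank correlation is computed as a Pearson correlation of ranks and, as a simplifying assumption, does not handle ties.

```python
import numpy as np


def coefficient_of_variation(x) -> float:
    """Unweighted coefficient of variation: standard deviation over mean."""
    x = np.asarray(x, dtype=float)
    return float(x.std() / x.mean())


def pearson(x, y) -> float:
    """Pearson correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])


def ranks(x) -> np.ndarray:
    """Ranks from 1 to n (assumes no tied values)."""
    order = np.argsort(x)
    r = np.empty(len(x))
    r[order] = np.arange(1, len(x) + 1)
    return r


def rank_correlation(x, y) -> float:
    """Spearman rank correlation as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical deflators for four regions (not the values in Table 3).
housing_deflator = [1.0, 1.1, 1.3, 1.4]
basket_deflator = [1.0, 1.3, 1.2, 1.6]

print(f"CoV ratio: {coefficient_of_variation(basket_deflator) / coefficient_of_variation(housing_deflator):.2f}")
print(f"Pearson: {pearson(housing_deflator, basket_deflator):.2f}")
print(f"Rank correlation: {rank_correlation(housing_deflator, basket_deflator):.2f}")
```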

VI. Impacts of Deflation on Spatial Inequality
Our overall goal in carrying out the analysis reported here is to examine how much difference is made to estimates of spatial inequality in the PRC when using deflators derived just from variation in housing costs. The results are summarized in Table 4, which reports three measures of inequality, namely the Gini coefficient, the Theil index, and the weighted coefficient of variation (CoV), for three levels of geography (province, urban prefecture, and the urban core districts within urban prefectures). The nominal values that are deflated are GDP per resident in 2010, which takes into account the various corrections to both GDP statistics and population denominators that are summarized in Li and Gibson (2013). We restrict attention to 2010 because of the need for census data to provide correct counts of the resident population (rather than the hukou-registered population) for subprovincial spatial units.
If no account is taken of spatial variation in the cost of living, the level of spatial inequality is overstated by up to 35% (for interprovincial analysis, using the Theil index). This is two-thirds larger than the impact of spatial deflation found by Li and Gibson (2013) who use the deflator from Brandt and Holz (2006), updated to 2010 with the rise in each province's CPI. Since the current analysis assumes that prices for all goods other than housing are set on perfectly integrated markets, it should provide a lower bound to the impact of spatial deflation that would be found if a "full" deflator covering all components of consumption were used.
The Theil index is $T_w = \sum_{j=1}^{m} (p_j/P)(y_{wj}/\mu)\ln(y_{wj}/\mu)$, where $m = 31$ provinces (or 288 prefectures or urban core districts), $p_j$ is the population of the $j$th province (or prefecture or district), $P$ is overall population, $y_{wj}$ is GDP per capita of the $j$th province (or prefecture or district), and $\mu$ is the overall population-weighted mean of GDP per capita for all provinces. The (weighted) coefficient of variation is $\mathrm{CoV} = \sqrt{\sum_{j=1}^{m} (p_j/P)(y_{wj} - \mu)^2}\,/\,\mu$. The Gini coefficient is $G = \sum_{i=1}^{m}\sum_{j=1}^{m} p_i p_j \left|y_{wi} - y_{wj}\right| / (2P^2\mu)$.
Table 4 notes: CoV = population weighted coefficient of variation, (D) = inequality measure on GDP per resident with spatial housing cost deflation, GDP = gross domestic product, GINI = Gini coefficient, THEIL = Theil index.
Note: Results are for 31 provinces, 288 prefectures, and 288 prefecture-merged districts (prefecture urban cores).
Source: Authors' computations from data in NBS (2010; 2011a, b, and c; ...).
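The three population-weighted inequality measures can be written compactly. The sketch below follows the standard definitions of the Theil index, the weighted coefficient of variation (taken here as the weighted standard deviation over the weighted mean), and the Gini coefficient; the two-region input is hypothetical and chosen only because the results are easy to verify by hand.

```python
import numpy as np


def _weights_and_mean(y, p):
    """Population shares and the population-weighted mean of y."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(p, dtype=float)
    w = w / w.sum()
    return y, w, float(np.sum(w * y))


def theil(y, p) -> float:
    """Population-weighted Theil index."""
    y, w, mu = _weights_and_mean(y, p)
    r = y / mu
    return float(np.sum(w * r * np.log(r)))


def weighted_cov(y, p) -> float:
    """Population-weighted coefficient of variation (std / mean)."""
    y, w, mu = _weights_and_mean(y, p)
    return float(np.sqrt(np.sum(w * (y - mu) ** 2)) / mu)


def gini(y, p) -> float:
    """Population-weighted Gini coefficient from all pairwise absolute differences."""
    y, w, mu = _weights_and_mean(y, p)
    diff = np.abs(y[:, None] - y[None, :])
    return float(np.sum(w[:, None] * w[None, :] * diff) / (2.0 * mu))


# Hypothetical example: two equally populated regions with GDP per capita 1 and 3.
y = [1.0, 3.0]
p = [10.0, 10.0]
print(f"Gini = {gini(y, p):.3f}, CoV = {weighted_cov(y, p):.3f}, Theil = {theil(y, p):.3f}")
```

To obtain the deflated measures, the same functions would be applied after multiplying each region's nominal value by its deflator.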
The lowest proportionate overstatement from not deflating comes when studying urban prefectures. This most likely reflects the fact that these spatial units have the highest apparent level of inequality amongst the various levels of disaggregation presented in Table 4, due to their heterogeneity. An urban prefecture may contain rural counties and this lack of a consistently defined urbanity gives higher apparent inequality between these "urban" units, and so correcting for spatial price differences has less impact. The more defensible level of subprovincial analysis is the urban core district within an urban prefecture, since this excludes rural counties. At this level of geography, spatial inequality is overstated by 14% (using the weighted coefficient of variation) to 30% (using the Theil index) if differences in the urban cost of living are not taken into account.
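The overstatement percentages in this section can be related to the share of apparent inequality that disappears after deflation. The sketch below assumes that overstatement is measured as nominal inequality relative to deflated inequality; this is one reading of the figures rather than a definition stated in the text, and the function names are illustrative.

```python
def overstatement_pct(nominal: float, deflated: float) -> float:
    """Percentage by which the nominal measure exceeds the deflated measure."""
    return (nominal / deflated - 1.0) * 100.0


def share_removed_pct(overstatement: float) -> float:
    """Share of nominal inequality that disappears once values are deflated."""
    return (1.0 - 1.0 / (1.0 + overstatement / 100.0)) * 100.0


# Overstatement figures quoted in the text: 35% (provinces, Theil index) and
# 14% to 30% (urban core districts, weighted CoV to Theil index).
for pct in (35.0, 30.0, 14.0):
    print(f"overstated by {pct:.0f}% -> {share_removed_pct(pct):.1f}% of nominal inequality removed")
```

On this reading, an overstatement of 35% corresponds to roughly one-quarter of nominal inequality disappearing after deflation.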

VII. Conclusions
In this paper we use newly available data on dwelling sales in urban PRC to develop spatially-disaggregated indices of house prices which are used as spatial deflators for provinces, urban prefectures, and urban core districts. Since we account for only one source of cost-of-living variation over space, the impacts on inequality that we find when using these deflators should be considered a conservative, lower bound. Previous approaches to forming spatial deflators for the PRC have focused more on traded goods prices, but our interpretation of the recent evidence is that traded goods prices adjust quickly to parity levels and so are unlikely to cause long-run cost-of-living differences between areas. In contrast, the fixity of land makes housing the most likely source of price dispersion across space.
It would be ideal to generate regional components of house prices that hedonically adjust for all components of dwelling quality, but such data are not available beyond a limited number of cities. Nevertheless, our limited analysis suggests that systematic variation in the quality of new dwellings between cities is unlikely, making the published data on the average price of newly constructed urban dwellings a potentially useful source of information on spatial cost-of-living differences. When we use this information to adjust nominal GDP per resident we find that around one-quarter of the apparent spatial inequality disappears once account is taken of cost-of-living differences. Since there are good theoretical reasons for expecting a higher price level in nominally richer areas, our results provide a caveat to concerns about the degree of spatial inequality experienced in the PRC.
Our results are consistent with literature from other countries which finds that apparent patterns in nominal outcomes may weaken or reverse once account is taken of spatial price differences emanating from urban housing markets. The current research may help compare the spatially deflated level of real inequality in the PRC to that in other countries, but we believe that any altered inferences due to the deflation we propose are most relevant to temporal comparisons. The legacy of central planning and the hukou registration system meant that urbanization and urban housing development in the PRC were much less advanced at the beginning of the reform era than would be expected. Consequently, the spatial cost-of-living differentials now being caused by the urban housing market (reflecting the fixity of land) are likely to have grown from a very low base, making interpretation of trends in nominal inequality in the PRC atypically sensitive to assumptions about spatial and temporal differences in the cost of living.