Analyzing the Sources of Misallocation in Indian Manufacturing: A Gross-Output Approach

It is well established that misallocation of factor resources lowers productivity. In this paper, I use data from both formal and informal firms to study distortions in input and output markets as sources of misallocation in the Indian manufacturing sector. My work extends the seminal work of Hsieh and Klenow (2009). I consider output, capital, raw material, energy, and service sector distortions in a monopolistically competitive framework to measure the aggregate dispersion in total factor revenue productivity (TFPR). I also decompose the variance in TFPR and show that raw material and output distortions play a major role in defining aggregate misallocation.


I. Introduction
According to the World Bank, the per capita income of the United States (US) was 30 times that of India in 2017. Explaining such differences is one of the fundamental problems in growth economics. Klenow and Rodriguez-Clare (1997) and Hall and Jones (1999) demonstrate that the disparity in total factor productivity (TFP) is the primary source of cross-country income differences. In this context, another debate is about the sources of TFP differences among rich and poor nations. Banerjee and Duflo (2005), Restuccia and Rogerson (2008), and Hsieh and Klenow (2009) argue that in poor countries, some TFP differences are generated from a misallocation of resources across firms. In this paper, I follow the aforementioned notion that resource misallocation is a primary source of variation in TFP. I include intermediate inputs such as raw materials, energy, and services into the model of Hsieh and Klenow (2009) to obtain the extent of misallocation that originates from factor market distortions in a developing country such as India.
When measuring physical TFP, one can adopt either of the two known approaches to measuring a firm's output: value added or gross output. The former excludes intermediate inputs, whereas the latter includes them. The difference between the two measures of TFP is more pronounced at the firm or industry level rather than in aggregate output. Gullickson and Harper (1999); van der Wiel (1999); Hulten, Dean, and Harper (2001); and Cobbold (2003) have demonstrated the benefits of the gross-output approach over the value-added method. The productivity manual published by the Organisation for Economic Co-operation and Development (2001) concludes that the gross-output approach is more appropriate for productivity measurement because it reduces productivity measurement bias. Based on these findings, I extend the Hsieh-Klenow model to measure productivity using the gross-output approach by including raw material, energy, and service sector intermediate inputs as factors of production. The inclusion of these factors separately into the production process enables a more detailed representation of factor misallocation. Furthermore, the decomposition of factor market distortions by considering each factor input distortion separately provides a way to distinguish the level of misallocation in each factor market and to identify the corresponding potential gain from reallocation. I find that distortions in the output market and raw material market explain the lion's share of the variation in productivity.
TFP is a residual in the production process and is not observed directly. Moreover, it is difficult to measure firm-level TFP as the unit of production varies across firms. Therefore, I measure the variation in total factor revenue productivity (TFPR), which by definition is the product of output price and the physical TFP of a firm. In the absence of any factor market misallocation, TFPR should be equal for all firms within an industry. The intuition behind this claim is as follows: if a firm has a high TFP, the marginal cost as well as the output price for that firm will be proportionally lower compared to a low-TFP firm in a particular industry, thus equalizing TFPR. Based on this intuition from Restuccia and Rogerson (2008) and Hsieh and Klenow (2009), I build my empirical results by using data from both formal and informal manufacturing sector firms in India for the survey year 2005-2006. In such a developing country, the informal sector plays an extensive role in shaping the economy. The informal manufacturing sector in India consists of around 17 million firms that provide 82% of total employment in that sector. Hence, it seems appropriate to include informal sector data in the empirical analysis.
My work has the closest resemblance to the paper by Chatterjee (2011). I extend the paper by including service sector inputs and energy inputs in the model separately. There exists a severe distortion in tariff rates in India's energy sector. For example, during 1999-2000, the industrial sector paid a tariff on electricity almost 15 times higher than that paid by the agriculture sector and 2.1 times higher than that paid by the domestic sector (Thakur et al. 2005). Although the Electricity Act of 2003 worked toward the reduction and gradual elimination of cross subsidies, such distortions may have some impact on the cost of energy usage for small and large firms as well as in formal and informal sectors. Besides, small firms often have to use other electricity sources (such as generators), which in turn may impact resource allocation differently in smaller firms as compared to their bigger counterparts. Additionally, liberalization in the service sector in the early 1990s has resulted in significant growth in the sector. According to Chanda and Gupta (2011), service sector reforms along with external market linkages led to substantial growth in the most liberalized service sectors such as business services, banking, insurance, education, medical and health, and others. There is evidence in the literature that can link service sector reform to productivity in the manufacturing sector. For example, Arnold, Javorcik, and Mattoo (2011) demonstrate a positive relationship between service sector reform and the performance of manufacturing firms in the Czech Republic. In India, the cost share of service inputs is around 10% and that of energy is around 7% for the manufacturing sector. Exclusion of these factor inputs might lead to misleading measurements of output and productivity. I also include distortions in the energy and service sectors to verify whether some of the variation in firm-level TFPR is attributed to these factors. 
I find that there is very little variation in TFPR due to energy input distortions and that misallocation in service inputs is more pronounced in the dispersion of TFPR. On the other hand, I find output and raw material distortions are the primary sources of misallocation in the manufacturing sector. Another interesting result is that the distortions, when taken from several factor markets, together reduce the variation in TFPR. This surprising result will be the subject of further research.
The rest of the paper is organized as follows. Section II discusses the relevant literature. In section III, I present a theoretical model to show how TFPR is affected by firm-level distortions. Section IV describes the data, and section V analyzes the empirical results and the decomposition of the variance of TFPR. Section VI sheds some light on the misallocation among different groups of industries within the manufacturing sector. In section VII, I examine the relationship between firm size and misallocation in factor markets. Finally, I conclude in section VIII.

II. Literature Review
My work is related to a large body of literature that has accumulated over the last few decades. Hsieh and Klenow (2009) argue that in a monopolistically competitive framework, misallocation in factor markets can result in large differences in TFP and in output among firms within an industry. For example, a capital market distortion caused by the disparity in access to cheap credit will result in differences in the marginal product of capital among firms. Hsieh and Klenow argue that in such a situation, the aggregate economy will be better off by allocating more capital to the firm with the higher marginal product of capital. Using firm-level data from India and the People's Republic of China (PRC), they calculate the TFP gain from reallocating capital, equalizing TFPR within the industry, to be 30%-50% in the PRC and 40%-60% in India. I follow the same intuition in this paper. I include raw materials, energy, and service sector inputs as factors of production and find the effect of distortions in all those inputs on firm-level TFPR. The goal is to find the empirical measurement of distortions in individual factor markets on aggregate TFPR. Restuccia and Rogerson (2008) demonstrate the effect of factor distortion on TFP. They state that different taxes and policies across firms create disparities in prices and lead to a 30%-50% decrease in output and TFP in developing countries. Midrigan and Xu (2014) argue that financial frictions cause variations in TFP across firms through two channels. In particular, financial frictions distort entry decisions and technological adoption of producers. Furthermore, they create disparities in return to capital among producers. Fernald and Neiman (2011) deviate from the standard setup of monopolistic competition. They show that, in a two-sector economy with heterogeneous financial policies and monopoly power, TFP measured in terms of quantities and real factor prices can diverge.
There is a body of literature based on Hsieh and Klenow's framework. Camacho and Conover (2010) use Hsieh and Klenow's methodology to measure productivity differences through misallocation in resources for Colombian industries. Taking the US as the benchmark economy, they find a wide TFPR distribution for Colombia, which implies large resource misallocation across firms. They also calculate that the reallocation of labor and capital among firms will improve aggregate TFP by 47%-55%. Another paper by Kalemli-Ozcan and Sørensen (2014) measures TFP dispersion through capital misallocation for 10 African countries using the World Bank enterprise survey data. They argue that access to finance is one of the main sources of substantial capital misallocation. Dias, Marques, and Richmond (2016) extend the Hsieh-Klenow model to include intermediate inputs and measure TFP disparity using firm-level data from Portugal. They consider data from all sectors of the economy. Consequently, the endogenous intermediate inputs in their model take into account goods produced by all sectors. They find huge misallocation across industries. According to their results, in the absence of misallocation within industries, there would have been a 48%-79% gain in value-added output during 1996-2011. In India, it is rather difficult to find firm-level data for sectors other than manufacturing; hence, I take aggregate input produced by other sectors as exogenously given in the model.
The most closely related work to my research is the paper by Chatterjee (2011), which extends the Hsieh-Klenow framework for both formal and informal manufacturing sectors in India. Chatterjee also includes intermediate input market distortions in the model as a source of variation in TFP. She assumes that the economy has an intermediate input, aggregated from a fraction of total production by each existing firm. The data used in Chatterjee's paper was obtained from the Annual Survey of Industries (ASI) for formal firms and the National Sample Survey Office (NSSO) for informal sector firms, as is the case in my paper.
Since both of these surveys primarily focus on manufacturing sector firms, the aggregated intermediate input produced from these firms will take into account only manufacturing sector products. Consequently, Chatterjee ignores inputs from other sectors such as energy and services in her model. On the other hand, I consider aggregated energy and service inputs as exogenously given in the model apart from the combined raw materials produced by the existing firms. In the next section, I extend Hsieh and Klenow's model to measure the degree of misallocation in the economy.

III. Model
I consider a static one-period model without uncertainty, as used by Hsieh and Klenow (2009). I assume that the economy consists of $J$ manufacturing industries indexed as $j = 1, 2, \dots, J$. Each industry consists of $N_j$ monopolistically competitive firms indexed as $i = 1, 2, \dots, N_j$. Each firm produces a differentiated product and thus has substantial market power. The firms have heterogeneous productivity $A_{ij}$, taken as exogenously given, and an endowment of capital $K_{ij}$, labor $L_{ij}$, raw material $M_{ij}$, energy $E_{ij}$, and service sector input $Z_{ij}$. Firms combine the factors to produce a good using a Cobb-Douglas production function. The firm's production function is as follows:
$$Y_{ij} = A_{ij}\, K_{ij}^{\alpha_{K_j}}\, L_{ij}^{\alpha_{L_j}}\, M_{ij}^{\alpha_{M_j}}\, E_{ij}^{\alpha_{E_j}}\, Z_{ij}^{\alpha_{Z_j}},$$
where $\sum_{S} \alpha_{S_j} = 1$ and $S \in \{K, L, M, E, Z\}$.
I consider only manufacturing sector firms in the model because I could find data only for the manufacturing sector in India, which I use for the empirical analysis. For the sake of simplicity, I assume that all raw materials coming from the manufacturing sector are aggregated into a single raw material $M$, whereas all energy inputs and service sector inputs are aggregated into factor inputs $E$ and $Z$, respectively. A fraction of the output produced by manufacturing firms considered in the model is aggregated as the manufacturing input $M$ and used by the same firms; hence, the price of $M$ is taken as endogenously determined. On the other hand, since service and energy inputs that are produced by firms in their respective sectors are not considered in the model, I take the output prices of such firms as exogenous. Some manufacturing products may also be used by service and energy sector firms as intermediate inputs; these products are considered as part of consumption goods in the model. I further assume that all firms in an industry have the same cost share of factor inputs $\alpha_{S_j}$, but there is a variation in factor shares between industries.
In this paper, I measure the misallocation in resources that affects firm-level TFPR. Distortion in an input or output market does not always uniformly increase (or decrease) the marginal product of the factors of production (MPF) for all firms. As firms equate the price they face for each factor with its marginal product, a firm that faces a tax on an input will have a higher MPF for that input than a firm facing a subsidy; for example, a firm taxed on service inputs will have a higher MPF for service inputs than a subsidized firm. The intuition behind the entire literature based on Hsieh and Klenow (2009) originates from the hypothesis that aggregate productivity will be higher if factors can be reallocated from lower MPF firms to higher MPF firms.
I assume several types of factor market distortions in the model. Elements that change the MPF for all inputs by the same proportion are called output distortions ($\tau_{Y_{ij}}$). A tax on the output of a firm affects all inputs proportionally and can be identified as an example of an output distortion. Moreover, if the distortion creates a discrepancy in only the marginal product of capital, I call it capital distortion ($\tau_{K_{ij}}$) in accordance with Hsieh and Klenow. Similar remarks hold for raw material distortion ($\tau_{M_{ij}}$), energy distortion ($\tau_{E_{ij}}$), and service sector input distortion ($\tau_{Z_{ij}}$). For example, differentiation in electricity price between small and large businesses is perceived as an energy distortion as it affects only the marginal product of energy. Note that labor distortion is not considered separately; every other distortion affects the respective MPF relative to the marginal product of labor.
Each firm produces a single good $Y_{ij}$ that is used both as a final consumption good and as an intermediate raw material. $C_{ij}$ and $X_{ij}$ denote the final consumption good and the intermediate raw material, respectively, which are produced by the $i$th firm from the $j$th industry. Firms face a downward sloping demand schedule that results from the assumption of a differentiated product environment in a monopolistically competitive market. Hence, the industry's final good is a constant elasticity of substitution aggregation of all firms' final goods, represented as
$$Y_j = \left( \sum_{i=1}^{N_j} Y_{ij}^{\frac{\rho-1}{\rho}} \right)^{\frac{\rho}{\rho-1}},$$
where $\rho > 1$ is the elasticity of substitution. For simplicity, I assume the elasticity of substitution is the same for all industries. This assumption follows from the literature. Each industry's output is sold as consumption good $C_j$ and intermediate raw material $X_j$, as was the case with firm-level output. I further assume that the markets for consumption goods and raw materials produced by each industry are perfectly competitive. Hence, the final consumption good is aggregated from industry-level consumption goods by a Cobb-Douglas production function:
$$C = \prod_{j=1}^{J} C_j^{\theta_j}. \tag{1}$$
The intermediate raw material is produced endogenously by aggregating each industry's production of raw materials, again using a Cobb-Douglas production function:
$$M = \prod_{j=1}^{J} X_j^{\lambda_j}. \tag{3}$$
In the above two equations, $\theta_j$ and $\lambda_j$ are the shares of each industry in total consumption and total intermediate raw materials production, respectively, with each set of shares summing to one. Each firm chooses intermediate raw materials from the aggregated $M$ according to its productivity. The aggregate quantities of other inputs such as energy $E$ and services $Z$ are exogenous in the model. Hence, each firm chooses the optimal amounts $E_{ij}$ and $Z_{ij}$ based on its production function. The industry aggregates $E_j$ and $Z_j$ are given by the sum over each firm's use of energy and services in that industry.
I will now solve the model for optimal factor resources and output by maximizing profit for the firm, industry, and economy. I assume that total factor resources are limited in the manufacturing sector by the aggregate use of factor resources of the firms in the sector.
For each $S \in \{K, L, M, E, Z\}$, we can write the aggregate factor resources as
$$S = \sum_{j=1}^{J} \sum_{i=1}^{N_j} S_{ij}$$
and solve for the equilibrium to identify the effects of distortion on productivity.

A. Equilibrium Analysis
In this section, I present a comprehensive equilibrium structure for firms, industries, and the economy. The equilibrium consists of the quantities of the consumption good and the intermediate raw materials produced at the level of the firm, industry, and aggregate economy. It also takes into account the optimal amount of capital, labor, raw materials, energy, and services used by each firm. The input markets and final goods markets clear at equilibrium. I now solve the optimization problems for each market.

Final Goods Problem
I assume a representative firm produces a final good $Y$ that is used in consumption $C$ and in raw material $M$ for further production. $C$ is produced using the consumption goods $C_j$ produced by the industries. I assume $C$ to be a numeraire commodity with unit price $P$. Likewise, $P_j$ represents the price of the output or final good $Y_j$ produced by each industry. I do not distinguish between the prices of the final good $C_j$ and the raw material $X_j$ produced by each industry, on the assumption that both are the same good and are subject to the same cost and market structure. Hence, the optimization problem for the final consumption good, subject to the Cobb-Douglas aggregation of $C$, is given by
$$\max_{\{C_j\}} \; P\,C - \sum_{j=1}^{J} P_j C_j. \tag{2}$$

Intermediate Raw Materials Problem
The fraction of output used as the intermediate raw material ($M$) is constructed by the representative firm aggregating the raw materials ($X_j$) produced by each industry. The price of the aggregated intermediate raw material $M$ is given by $p_m$. The representative firm optimizes the production of $M$, subject to the Cobb-Douglas aggregation of $M$, as follows:
$$\max_{\{X_j\}} \; p_m M - \sum_{j=1}^{J} P_j X_j. \tag{4}$$
We can solve the final goods problem from equations (1) and (2) and the intermediate raw materials problem from equations (3) and (4) to find the prices set by representative firms. The market clearing price of the final good is
$$P = \prod_{j=1}^{J} \left( \frac{P_j}{\theta_j} \right)^{\theta_j} = 1, \tag{5}$$
and the intermediate raw material's price is
$$p_m = \prod_{j=1}^{J} \left( \frac{P_j}{\lambda_j} \right)^{\lambda_j}. \tag{6}$$
The second equality in equation (5) follows from the assumption that $C$ is a numeraire good. Both prices are functions of the industry price ($P_j$) and the share of each industry in producing the same good ($\theta_j$ and $\lambda_j$, respectively).
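The Cobb-Douglas price index described above can be checked numerically. The following is a minimal sketch, assuming industry shares that sum to one; the function names and the numbers are hypothetical and serve only as an illustration. It confirms that spending $P \cdot C$ at the index price buys exactly $C$ units of the aggregate.

```python
import numpy as np

def cobb_douglas_price_index(prices, shares):
    """Unit cost of a Cobb-Douglas aggregate: prod_j (P_j / share_j) ** share_j.
    Assumes the shares sum to one."""
    prices = np.asarray(prices, dtype=float)
    shares = np.asarray(shares, dtype=float)
    return float(np.prod((prices / shares) ** shares))

def cobb_douglas_demands(prices, shares, spending):
    """Cost-minimizing quantities: each industry receives a fixed spending share."""
    prices = np.asarray(prices, dtype=float)
    shares = np.asarray(shares, dtype=float)
    return shares * spending / prices

# Illustrative numbers only (not taken from the paper's data).
industry_prices = [2.0, 4.0]
theta = [0.5, 0.5]
target_quantity = 3.0

P = cobb_douglas_price_index(industry_prices, theta)
C_j = cobb_douglas_demands(industry_prices, theta, spending=target_quantity * P)
C = float(np.prod(C_j ** np.asarray(theta)))  # aggregate bought with that spending
```

The same two functions apply to the raw material aggregate by passing the $\lambda_j$ shares instead of the $\theta_j$ shares.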

The Industry's Problem
The final goods produced by each industry $Y_j$ are used as both final consumption good $C_j$ and intermediate raw material $X_j$. I assume that $C_j$ and $X_j$ are fractions of the same good; hence, they face the same optimization problem. Furthermore, $C_{ij}$ and $X_{ij}$ are fractions of a firm's output $Y_{ij}$; therefore, I assume that they are produced using the same production function and that they also incur the same marginal cost. It is safe to assume that the firms charge the same price $P_{ij}$ for both parts of their output. I represent the industry's problem, subject to the constant elasticity of substitution aggregation of firm outputs, as
$$\max_{\{Y_{ij}\}} \; P_j Y_j - \sum_{i=1}^{N_j} P_{ij} Y_{ij}.$$
The market clearing industry price is
$$P_j = \left( \sum_{i=1}^{N_j} P_{ij}^{\,1-\rho} \right)^{\frac{1}{1-\rho}}. \tag{9}$$
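The link between the industry price and firm-level demand can be illustrated with a short numerical check. This is a sketch under the standard constant elasticity of substitution setup, in which the industry's problem implies the demand curve $Y_{ij} = Y_j (P_j / P_{ij})^{\rho}$; the function names and the numbers are hypothetical.

```python
import numpy as np

def ces_price_index(prices, rho):
    """Industry price: (sum_i P_ij ** (1 - rho)) ** (1 / (1 - rho))."""
    prices = np.asarray(prices, dtype=float)
    return float(np.sum(prices ** (1.0 - rho)) ** (1.0 / (1.0 - rho)))

def ces_demands(prices, industry_output, rho):
    """Firm-level demand implied by the industry's problem."""
    prices = np.asarray(prices, dtype=float)
    return industry_output * (ces_price_index(prices, rho) / prices) ** rho

def ces_aggregate(quantities, rho):
    """Industry output as a CES aggregate of firm outputs."""
    q = np.asarray(quantities, dtype=float)
    exponent = (rho - 1.0) / rho
    return float(np.sum(q ** exponent) ** (1.0 / exponent))

# Illustrative numbers only; rho = 3 is the value assumed later in the paper.
firm_prices = np.array([1.0, 2.0, 4.0])
rho, Y_j = 3.0, 10.0

P_j = ces_price_index(firm_prices, rho)
Y_ij = ces_demands(firm_prices, Y_j, rho)
```

With these demands, the firm outputs aggregate back to $Y_j$ and total spending on firms equals $P_j Y_j$, which is what makes $P_j$ the market clearing industry price.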

The Firm's Problem
To allow for factor misallocation in the input and output markets, I consider several types of distortions. I assume that there exists an output distortion ($\tau_{Y_{ij}}$) that affects the marginal product of each factor of production by the same proportion. I also consider capital distortion ($\tau_{K_{ij}}$), raw material distortion ($\tau_{M_{ij}}$), energy distortion ($\tau_{E_{ij}}$), and service sector input distortion ($\tau_{Z_{ij}}$) that affect the marginal product of capital, raw materials, energy, and service inputs, respectively, relative to the marginal product of labor. Each firm solves the profit maximization problem in equation (10), subject to its technology and demand constraints, to choose the optimal levels of capital, labor, raw materials, energy, and service inputs. Solving firm $i$'s problem yields the optimal factor quantities in equations (12a)-(12e), which contain both the output distortion and the distortion in the respective factor market. By combining equations (12a)-(12e) with the firm's objective function in equation (10), we can find the market clearing price for each firm, given in equation (13). Note that the firm-level price in equation (13) comprises the marginal cost of production, markup, distortions, and the reciprocal of firm-level productivity. Given the assumption that firms in an industry have the same factor shares and input costs, I can infer that in the absence of distortions, the price of each firm in an industry would have been inversely proportional to the TFP of the firm. This inference is in line with my conjecture that all firms in an industry will have the same revenue productivity in the absence of any misallocation in factor resources.
I define firm-level total factor revenue productivity as $\mathrm{TFPR}_{ij} = P_{ij} A_{ij}$. Solving for $\mathrm{TFPR}_{ij}$ from equation (13) yields equation (14). Revenue productivity given by equation (14) is a measure of firm-level distortion. Variation in $\mathrm{TFPR}_{ij}$ gives us the degree of misallocation in input and output markets. I build my empirical findings on this intuition and try to measure the extent of variation in firm-level revenue productivity in the presence of distortions.
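As a hedged sketch of what the firm's problem and the resulting price and revenue productivity could look like, assume the Hsieh-Klenow sign convention, in which the output distortion enters revenue as $(1 - \tau_{Y_{ij}})$ and each input distortion enters cost as $(1 + \tau_{S_{ij}})$, with no separate labor distortion. This convention and the demand curve $Y_{ij} = Y_j (P_j / P_{ij})^{\rho}$ are assumptions made here for illustration; the exact expressions in equations (10) through (14) may differ.

```latex
% Assumed convention (not confirmed by the source):
% (1 - tau_Y) scales revenue, (1 + tau_S) scales the cost of input S, tau_L = 0.
\begin{align*}
\pi_{ij} &= (1-\tau_{Y_{ij}})\, P_{ij} Y_{ij} - w L_{ij}
            - \sum_{S \in \{K, M, E, Z\}} (1+\tau_{S_{ij}})\, P_S\, S_{ij}, \\[4pt]
% First-order conditions, using marginal revenue = ((rho - 1) / rho) * P_ij:
(1+\tau_{S_{ij}})\, P_S\, S_{ij}
         &= \alpha_{S_j}\, \frac{\rho-1}{\rho}\, (1-\tau_{Y_{ij}})\, P_{ij} Y_{ij},
            \qquad S \in \{K, M, E, Z\}, \\
w L_{ij} &= \alpha_{L_j}\, \frac{\rho-1}{\rho}\, (1-\tau_{Y_{ij}})\, P_{ij} Y_{ij}, \\[4pt]
% Price = markup x distorted marginal cost / productivity:
P_{ij}   &= \frac{\rho}{\rho-1}
            \prod_{S} \left( \frac{P_S}{\alpha_{S_j}} \right)^{\alpha_{S_j}}
            \frac{\prod_{S \neq L} (1+\tau_{S_{ij}})^{\alpha_{S_j}}}
                 {A_{ij}\, (1-\tau_{Y_{ij}})}, \\[4pt]
\mathrm{TFPR}_{ij} = P_{ij} A_{ij}
         &= \frac{\rho}{\rho-1}
            \prod_{S} \left( \frac{P_S}{\alpha_{S_j}} \right)^{\alpha_{S_j}}
            \frac{\prod_{S \neq L} (1+\tau_{S_{ij}})^{\alpha_{S_j}}}
                 {1-\tau_{Y_{ij}}}.
\end{align*}
```

Under this convention, productivity $A_{ij}$ cancels out of revenue productivity, so within an industry $\mathrm{TFPR}_{ij}$ differs across firms only through the distortion terms.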
I define the marginal revenue products of factor inputs for an industry as the weighted average of the value of firm-level marginal revenue products, where the weights are calculated as the share of a firm's output in the industry; this definition is given in equation (15). Recall that $S$ consists of all factor inputs, namely $K$, $L$, $M$, $E$, and $Z$. $P_S$ denotes the corresponding factor prices $r$, $w$, $p_m$, $p_e$, and $p_z$, respectively, and $\tau_{S_{ij}}$ indicates the corresponding factor distortions relative to labor.
I define industry-level total factor revenue productivity ($\mathrm{TFPR}_j$) to be proportional to the geometric average of the average marginal revenue products of factor inputs in the industry (given in equation [15]); this definition is equation (16).

B. Allocation of Factors in the Industry
I now solve for the allocation of factor resources for each industry. I aggregate the factor resources used by all firms in an industry using their marginal products to obtain the industry-level factor inputs in equation (17). Recall that $S \in \{K, L, M, E, Z\}$ and that $S = \sum_{j=1}^{J} S_j$ are the aggregate supplies of factor inputs in the economy. Also recall that $\theta_j$ is the share of each industry in producing the final consumption good. Note that factor accumulations in each industry are affected by factor distortions only through the corresponding marginal revenue products. This result is due to the Cobb-Douglas aggregation at the industry level. Combining industry-level factor inputs (17) and revenue productivity (16), we can derive equation (18). Combining the industry price $P_j$ from (9) and the firm's price $P_{ij}$ from (13) together with firm-level revenue productivity from (14), we can simplify to obtain equation (19). Equating (18) and (19) gives the total factor productivity of each industry as a function of firm-level TFP, firm-level TFPR, and industry-level revenue productivity. With this expression, we can write the final consumption output and the intermediate good of the economy. Following Hsieh and Klenow (2009), I now assume that TFP ($A_{ij}$) and revenue productivity ($\mathrm{TFPR}_{ij}$) are jointly log-normally distributed to depict the effect of firm-level distortion on the productivity of an industry. By this assumption, the logarithm of industry-level TFP can be expressed as
$$\log \mathrm{TFP}_j = \frac{1}{\rho-1} \log \left( \sum_{i=1}^{N_j} A_{ij}^{\rho-1} \right) - \frac{\rho}{2}\, \mathrm{var}\left( \log \mathrm{TFPR}_{ij} \right). \tag{24}$$
Equation (24) shows that factor distortions reduce the overall productivity of an industry through the variance of firm-level TFPR. On the basis of this finding, I will now proceed to show how factor distortions contribute to firm-level TFPR variation. Note that I assume that the number of firms is unaffected by factor market distortions. This assumption is elaborated in more detail in Hsieh and Klenow (2009).
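The role of TFPR dispersion can be made concrete with a small numerical sketch. It assumes that the log-normal approximation takes the Hsieh and Klenow (2009) form, in which the efficient level is $\frac{1}{\rho-1}\log\sum_i A_{ij}^{\rho-1}$ and the loss is $\frac{\rho}{2}$ times the variance of log TFPR; the function name and the numbers are hypothetical.

```python
import numpy as np

def log_industry_tfp(productivity, log_tfpr, rho=3.0):
    """Log industry TFP under the assumed Hsieh-Klenow log-normal approximation.

    productivity : firm-level physical TFP values A_ij within one industry
    log_tfpr     : firm-level log TFPR values for the same firms
    """
    a = np.asarray(productivity, dtype=float)
    efficient_level = np.log(np.sum(a ** (rho - 1.0))) / (rho - 1.0)
    # Population variance (ddof=0); a sample variance is an equally defensible choice.
    dispersion_loss = 0.5 * rho * np.var(np.asarray(log_tfpr, dtype=float))
    return float(efficient_level - dispersion_loss)

# Illustrative numbers only.
A = [1.0, 2.0, 3.0]
no_dispersion = log_industry_tfp(A, [0.0, 0.0, 0.0])
with_dispersion = log_industry_tfp(A, [-0.5, 0.0, 0.5])
```

In this sketch, only the variance of log TFPR matters for the loss: shifting every firm's log TFPR by the same constant leaves industry TFP unchanged.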

IV. Data
This study uses data on the formal manufacturing sector from the Annual Survey of Industries (ASI) collected by the Central Statistical Office of India. ASI is the primary source of industrial statistics in India, which covers all factories as defined in the Factories Act of 1948. ASI data is an annual survey of formal manufacturing firms with more than 50 workers and a random one-third sample survey of firms with more than 10 workers (with electricity) or firms with more than 20 workers (without electricity). I use the 62nd round of ASI data collected in the survey year 2005-2006. I also take into account data for the unorganized manufacturing sector collected by the National Sample Survey Office (NSSO) of India for the survey year 2005-2006. The NSSO collects firm-level data for the informal manufacturing sector in India every 5 years. The dataset includes small manufacturing firms along with some service sector firms and some unincorporated proprietary firms. These firms are not registered under the Factories Act of 1948; hence, they are not included in the ASI data. The data for the informal sector consists of a large number of firms that use one or two workers. These firms have missing values for most of the variables I take into consideration. Also, they contribute a very small percentage of total value added. Table 1 summarizes the distribution of informal firms and the corresponding cumulative percentages of contributions to total value added, according to the number of employees. There are over 30,000 one-employee firms that contribute only 1.6% of total value added and almost none of them have data for labor and capital in the corresponding dataset. In my analysis, I do not include such firms. I only consider informal firms that have at least six employees, a cutoff set on the basis of these firms' substantial market share. To keep the two datasets comparable,  I only consider manufacturing industries that are covered in both the ASI and NSSO datasets. 
The literature in the field (La Porta and Shleifer 2008) argues that informal sector firms are small and highly unproductive compared to formal sector firms. The assumption of monopolistic competition among firms in the model allows for firms with different levels of productivity to coexist in the market. Table 2 shows the distribution of firms in the analysis. There are around 31,000 formal sector firms taken from the ASI data, whereas the number of informal sector firms from the NSSO data is around 5,000. For this analysis, I had to drop some observations from both sectors due to missing data. Formal firms consist of all sizes, while informal firms are mostly small. To simplify the analysis, I use 2-digit industry-level data developed in the National Industrial Classification (NIC) system. I consider 23 different industries, including food and beverage, machinery and equipment, wood, paper, publishing, computing machinery, and others (see Table 3).
Hsieh and Klenow (2009) use the value-added method to measure productivity and distortion in capital and output. They did not incorporate raw materials, services, or energy inputs in the production function. I first replicate their results using the value-added method and then extend the model to incorporate intermediate inputs as factors of production. This extension will adopt the gross-output method. I use nominal revenue of the firm as the output variable.
Aside from firm revenue, the variables that I use for this analysis are the firm's industry (2-digit NIC), labor compensation, net book value of fixed capital stock, rent on capital, intermediate input costs, and fuel and energy costs. I assume that service input cost is the same as the residual cost. I use labor compensation including wages, bonuses, and benefits as a proxy for labor input. Capital is measured by the average of the net book value of capital at the beginning and end of the year. I deviate from Hsieh and Klenow (2009) and Chatterjee (2011), as well as other previous works, in the measurement of the rental cost of capital. Existing literature in this field uses an exogenous percentage of capital as rental cost, whereas I measure it using variables such as rent on machinery, building, and land; interest paid on loans; and other miscellaneous capital costs, which are taken from the ASI data for the formal sector.
However, for informal firms, the NSSO data do not explicitly provide rent on capital. I measure rental cost as the residual of value added after subtracting total labor cost. The costs of raw materials and energy are calculated explicitly from the cost of inputs of production. Service input costs consist of transport and communication costs, insurance charges, license costs, and other operative expenses.
The elasticity of substitution (ρ) is assumed to be constant in the model. Based on the literature in this field, I assume the value of ρ to be 3. In most of my empirical analysis, I use factor shares from industries in the US as a benchmark to identify the effect of distortion on productivity. The factor shares data are from the KLEMS measures found in the National Income and Product Accounts industry database (2005) provided by the US Bureau of Labor Statistics.
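To illustrate how these benchmark parameters could be used, the sketch below backs out firm-level distortions from revenue and cost data. It assumes the Hsieh-Klenow convention, in which the output distortion enters as $(1 - \tau_Y)$, each input distortion as $(1 + \tau_S)$ relative to labor, and the first-order conditions equate each distorted cost share to $\alpha_S (\rho - 1)/\rho$ times distorted revenue. The function name, dictionary keys, and numbers are hypothetical and do not come from the paper.

```python
def implied_distortions(revenue, labor_cost, input_costs, alpha, rho=3.0):
    """Back out (1 - tau_Y) and (1 + tau_S) for one firm.

    revenue     : nominal revenue P_ij * Y_ij
    labor_cost  : labor compensation w * L_ij
    input_costs : dict of observed costs for 'K', 'M', 'E', 'Z'
    alpha       : dict of benchmark factor shares for 'K', 'L', 'M', 'E', 'Z'
    """
    markup = rho / (rho - 1.0)
    # Labor carries no separate distortion, so its share pins down the output wedge.
    one_minus_tau_y = markup * labor_cost / (alpha["L"] * revenue)
    # Each input wedge is measured relative to labor.
    one_plus_tau = {
        s: (alpha[s] / alpha["L"]) * labor_cost / cost
        for s, cost in input_costs.items()
    }
    return one_minus_tau_y, one_plus_tau

# Illustrative firm whose cost shares match the benchmark exactly (no distortions).
alpha = {"K": 0.2, "L": 0.2, "M": 0.4, "E": 0.1, "Z": 0.1}
costs = {"K": 20.0, "M": 40.0, "E": 10.0, "Z": 10.0}
wedge_y, wedges = implied_distortions(150.0, 20.0, costs, alpha)

# Same firm using half as much capital: the implied capital wedge doubles.
_, wedges_low_k = implied_distortions(150.0, 20.0, dict(costs, K=10.0), alpha)
```

A firm that spends less on an input than the benchmark share implies is read as facing a positive wedge on that input, which is why the choice of benchmark shares matters for the measured distortions.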

V. Empirical Analysis
My identification strategy is similar to that of Hsieh and Klenow (2009) and Chatterjee (2011). I establish identification of distortions based on the rationale that, in the absence of distortions, revenue factor shares of output will be proportional to the parameters $\alpha_{K_j}$, $\alpha_{L_j}$, $\alpha_{M_j}$, $\alpha_{E_j}$, and $\alpha_{Z_j}$ in a market with monopolistic competition. Because I assume distortions in factor markets, the revenue shares will give a biased estimation of the parameters. We can validate this from the first-order conditions of the firms, in which each factor's revenue share depends on the distortions as well as on the corresponding parameter. Recall that $S$ consists of all factor inputs and $\tau_{S_{ij}}$ denotes the corresponding distortions relative to the labor market. In the presence of distortions, I cannot distinguish the misallocation in resources from the bias in the parameters. Following Hsieh and Klenow (2009), I take into account US factor shares. The strategy is based on the assumption that US factor markets are less distorted than those in India and that the technology used in the industries is the same for both countries. A more detailed discussion of the assumptions is presented in Chatterjee (2011). Factor shares for both countries, described in Table 3, represent the average of the cost share for each factor in each industry. Figure 1 illustrates the bias in factor shares in Indian industries with respect to benchmark US industries. Any deviation from the 45-degree line shows misallocation in the corresponding factor markets in India. I find a pattern in capital, labor, and raw material shares similar to that presented in Chatterjee (2011). It is evident from the figure that cost shares of capital, labor, and service inputs are significantly higher in the US than in India, whereas shares of raw materials and energy are higher in India. Next, I analyze within-industry variation in the average revenue product of labor. Figure 2 illustrates the distribution of the logarithm of firm-level average revenue product of labor (APRL) relative to the industry mean, $\log(\mathrm{APRL}_{ij}/\mathrm{APRL}_j)$.
I trim 1 percentile from both ends to avoid outliers. The horizontal axis shows log (APRL i j /APRL j ), whereas the vertical axis measures the density of firms. There is a substantial variation in average revenue product of labor within an industry with a variance of 3.76.
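The trimming and variance calculation described above can be sketched as follows. This is a minimal illustration, not the paper's code: the function names are hypothetical, the industry mean is assumed to be a simple arithmetic mean, and the variance is taken as the population variance of the trimmed values.

```python
import math
from collections import defaultdict
from statistics import pvariance


def log_relative_to_industry(firms):
    """firms: list of (industry, value) pairs with value > 0.
    Returns log(value / industry mean) for each firm, in input order.
    Assumption: the industry mean is an unweighted arithmetic mean."""
    totals = defaultdict(lambda: [0.0, 0])
    for industry, value in firms:
        totals[industry][0] += value
        totals[industry][1] += 1
    means = {ind: s / n for ind, (s, n) in totals.items()}
    return [math.log(value / means[ind]) for ind, value in firms]


def trim_tails(values, share=0.01):
    """Drop the lowest and highest `share` of observations."""
    ordered = sorted(values)
    k = int(len(ordered) * share)
    return ordered[k:len(ordered) - k] if k else ordered


def trimmed_log_variance(firms, share=0.01):
    """Variance of the trimmed, industry-relative log values."""
    return pvariance(trim_tails(log_relative_to_industry(firms), share))
```

With a real firm-level file, `firms` would hold one pair per firm, for example the industry code and revenue per worker.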

A. Value-Added versus Gross-Output Approach
The goal in this section is to measure the variation in firm-level TFPR as an indicator of misallocation in factor markets. The variable of interest is the logarithm of firm-level TFPR relative to the industry TFPR, $\log(\text{TFPR}_{ij}/\text{TFPR}_{j})$. I use both the value-added and gross-output approaches to measure TFPR. First, I replicate the results from Hsieh and Klenow (2009) using the value-added approach. They estimate the distribution of TFPR using formal manufacturing sector data for 1987-1988 and 1994-1995. I repeat their method using 2005-2006 data for both formal and informal sectors. I also estimate the TFPR distribution with the gross-output method using the same data. Cobbold (2003) presents the formal relationship between value-added and gross-output TFP in terms of G and VA, which represent the nominal values of total revenue and total value added, respectively. Several studies, such as Oulton and O'Mahony (1994) and van der Wiel (1999), show that productivity growth measured using value added is much higher than productivity growth measured using all inputs. It follows from Cobbold's relationship that, given G and VA, TFP as well as TFPR measured using the value-added approach will be larger than if measured using the gross-output approach.
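A common statement of the link between the two measures, given here in growth rates and taken as an assumption about the form of the relationship rather than a quotation of Cobbold (2003), scales gross-output TFP growth by the ratio of gross output to value added:

```latex
\Delta \ln \text{TFP}^{VA} \;=\; \frac{G}{VA}\,\Delta \ln \text{TFP}^{G},
\qquad \frac{G}{VA} \ge 1 .
% Worked example: if value added is 40% of gross output, then G/VA = 2.5,
% so 1% gross-output TFP growth corresponds to 2.5% value-added TFP growth.
```

Because intermediate inputs are non-negative, G is at least as large as VA, so the scaling factor never falls below 1.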
Before calculating the variance, I trim the 1% tails of $\log(\text{TFPR}_{ij}/\text{TFPR}_{j})$ to remove outliers. Figure 3 plots the distributions of the logarithm of TFPR relative to the industry mean. The dashed line shows the value-added TFPR distribution, whereas the solid line shows the distribution using the gross-output approach. The variation in value-added TFPR is much higher than the variation in gross-output TFPR. Table 4 presents the dispersion statistics for firm-level TFPR. The standard deviation (SD) of value-added TFPR is around 0.99, compared to 0.47 using the gross-output approach. The difference between the two methods is more pronounced when the variation in TFPR is measured at higher percentiles. Table 5 shows the dispersion in the logarithm of TFPR in Hsieh and Klenow (2009) using the value-added approach and the same variable in Chatterjee (2011) using the gross-output approach. My results display a larger value-added SD than in Hsieh and Klenow (2009), who use the same approach with formal sector data from 1994 to 1995. This might be the consequence of an increase in the overall level of misallocation in the last decade or of the inclusion of the informal sector in my analysis. Furthermore, my results for the dispersion of gross-output TFPR (shown in Table 4) are comparable with those of Chatterjee (2011). After including energy and service sector distortions, the SD of firm-level TFPR in my study is 0.02 lower than the 0.49 reported by Chatterjee (2011) using 2004-2005 data. The gap between the results is more conspicuous in the 75th to 25th percentiles and 90th to 10th percentiles.
Notes to Table 5: Column 2 shows dispersion of total factor revenue productivity estimated by Hsieh and Klenow (2009) for 1994-1995 data using the value-added approach. Column 3 depicts the same variable estimated by Chatterjee (2011) for 2004-2005 data using the gross-output approach. Sources: Hsieh and Klenow (2009) and Chatterjee (2011).
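The dispersion measures discussed above (SD and the 75th to 25th and 90th to 10th percentile gaps) can be computed from a list of trimmed log relative TFPR values as in the sketch below. The function name is hypothetical, and the percentile interpolation rule is an assumption, since the text does not state which rule is used.

```python
from statistics import pstdev, quantiles


def dispersion_stats(values):
    """Return the SD and the 75-25 and 90-10 percentile gaps of `values`.
    Assumption: percentiles use linear interpolation that includes the
    sample minimum and maximum."""
    # quantiles(..., n=100) returns 99 cut points; index p-1 is percentile p.
    cuts = quantiles(values, n=100, method="inclusive")
    return {
        "sd": pstdev(values),
        "p75_p25": cuts[74] - cuts[24],
        "p90_p10": cuts[89] - cuts[9],
    }
```

Running it on value-added and gross-output series separately gives two sets of statistics that can be compared side by side.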

B. Decomposing the Misallocation in Factor Markets
I now turn to separating the contribution of each component to the variance of firm-level TFPR. Moving forward, only the gross-output approach will be considered. I take into account several types of distortions in input and output markets. The expression for each type, as a function of total revenue, input costs, and factor shares, is derived from the firm's first-order conditions, with all input market distortions measured relative to the labor market. The intuition behind equations (26b)-(26e) is that, in the presence of distortions, input costs relative to labor compensation will be lower than implied by the output elasticities. Equation (26a) demonstrates that a deviation of the labor share from the output elasticity with respect to labor will result in an output distortion.
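To make the intuition concrete, the sketch below computes wedges in the style of Hsieh and Klenow (2009), extended to any input S measured relative to labor. The functional forms, the function names, and the default elasticity of substitution `sigma=3.0` are assumptions made for illustration; they are not a transcription of equations (26a)-(26e).

```python
def input_wedge(alpha_s, alpha_l, labor_cost, input_cost):
    """Assumed form of 1 + tau_S for input S, relative to labor.
    Exceeds 1 when the input's cost relative to labor compensation is
    lower than the ratio of output elasticities alpha_s / alpha_l."""
    return (alpha_s / alpha_l) * (labor_cost / input_cost)


def output_wedge(alpha_l, labor_cost, revenue, sigma=3.0):
    """Assumed form of 1 - tau_Y.
    Equals 1 when the labor share of revenue matches the labor elasticity
    adjusted for the markup sigma / (sigma - 1)."""
    markup = sigma / (sigma - 1.0)
    return markup * labor_cost / (alpha_l * revenue)
```

For example, a firm whose raw material bill is three times its wage bill faces no measured raw material wedge if the benchmark elasticity of raw materials is also three times that of labor.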
To present the above result in more detail, I now compute the variance of $\log(\text{TFPR}_{ij}/\text{TFPR}_{j})$, which measures total misallocation. This variance is expressed in terms of the components $D_S$. Recall that S consists of all factor inputs, namely K, L, M, E, and Z; $\alpha_{S_j}$ denotes the corresponding factor shares, and $\tau_{S_{ij}}$ indicates the corresponding factor distortions. Also recall that I measure factor distortions relative to the labor market, implying that $\tau_{L_{ij}}$ is 0. $D_S$ can be interpreted as the component of each factor input in the variance of TFPR. Table 6 reports the variances and covariances of these components.
The variances of the components of equation (16) depict the contribution of factor distortions in explaining the variation in firm-level TFPR. Since I measure distortions in factor markets relative to the labor market, the variance of $D_L$ measures the variation in industry TFPR in the presence of only output distortions, multiplied by the cost share of labor. Moreover, the variance of $D_Y$ determines the variation in firm TFPR attributed to output distortion alone. The dispersions in $D_Y$ and $D_M$ are very high compared to the overall variance of $\log(\text{TFPR}_{ij}/\text{TFPR}_{j})$, implying that misallocation is highest in output and raw materials.
The overall variance of $\log(\text{TFPR}_{ij}/\text{TFPR}_{j})$ also includes the pairwise covariances between the components of equation (16). It is interesting to note that the covariance between output and raw material distortions is the highest (0.5716). This result may follow from the fact that, in my framework, raw materials are endogenous, so the output of one firm is used as raw material by another.
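The accounting behind this kind of decomposition is the standard identity that the variance of a sum equals the sum of the component variances plus twice the sum of the pairwise covariances. The sketch below applies it to hypothetical firm-level components; it assumes the components are signed terms that add up to the log relative TFPR, and the names are illustrative.

```python
from statistics import fmean


def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def decompose_variance(components):
    """components: dict mapping a component name to firm-level values.
    Returns (variances, covariances, total), where total equals the
    variance of the firm-by-firm sum of all components."""
    names = list(components)
    variances = {n: covariance(components[n], components[n]) for n in names}
    covariances = {
        (a, b): covariance(components[a], components[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
    total = sum(variances.values()) + 2 * sum(covariances.values())
    return variances, covariances, total
```

Each entry of `covariances` corresponds to one off-diagonal cell of a table of variances and covariances, counted once.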
Next, I examine the distinct effect of each distortion on the logarithm of TFPR relative to the industry mean. In Figure 4, the solid lines illustrate the distribution of the variable of interest, taking one factor distortion at a time. The dashed line represents the actual firm-level TFPR distribution, taking all distortions together. The top panels of Figure 4 show TFPR distributions with only output distortion or only capital distortion. Similarly, the middle panels and bottom panel depict scenarios with only raw material, energy, or service input distortions, respectively.
In the absence of any distortion, I expect the TFPR of all firms to equalize, which would appear in the graph as a vertical line at 0. Any deviation from such a line is a sign of distortion. Taking one factor distortion at a time facilitates the comparison of the contribution of each factor input distortion to the overall distortion in the market.
Since higher dispersion in the distribution indicates higher distortion, it is evident from Figure 4 that output and raw material distortions play the main role in the overall distortion within an industry. Capital and service input distortions contribute a modest share to measured misallocation. Energy distortion is almost negligible. These results reinforce the findings in Table 6. The intriguing observation from Figure 4 is that the dispersion in TFPR is lower when all factor market distortions are considered than when considering only output distortion or only raw material distortion. Such findings imply that factor input distortions offset each other's effects in describing total misallocation. This is an area I would like to work on in the future in order to understand the underlying intuition.
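As a purely mechanical illustration, and not an explanation of the economics, the condition under which adding a second signed component B lowers the variance below that of a single component A is:

```latex
\begin{align}
\operatorname{Var}(A+B) &= \operatorname{Var}(A) + \operatorname{Var}(B) + 2\operatorname{Cov}(A,B), \\
\operatorname{Var}(A+B) < \operatorname{Var}(A)
  &\iff \operatorname{Cov}(A,B) < -\tfrac{1}{2}\operatorname{Var}(B).
\end{align}
```

Here A and B are generic placeholders for two terms that add up to the log relative TFPR; which distortions play these roles, and with which signs, depends on the model's expression for TFPR.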

C. Formal versus Informal Sectors
Since I use data from both formal and informal sector firms in my empirical analysis, it is important to examine whether there is any inherent difference in the pattern of input and output distortions between these two sectors. Figure 5 illustrates the distribution of TFPR relative to the industry mean for firms in the formal and informal sectors. The solid line represents formal sector firms, whereas the dotted line represents firms in the informal sector. The dispersions of the distributions are similar in both sectors, while the mean of the distribution is higher in the formal sector than in the informal sector. Next, I examine the distribution of factor distortions separately for both sectors. Figure 6 shows the distribution of output and input distortions for formal and informal firms. The distributions show that output, raw material, energy, and service input distortions are slightly higher in the informal sector than in the formal sector. Capital distortion, on the other hand, is more dispersed and higher in the formal sector.

VI. Misallocation within the Manufacturing Sector
Misallocation may vary between industries within the manufacturing sector. Therefore, it is useful to look further into the distribution of TFPR in separate groups of industries to infer any inherent pattern of misallocation that may exist within the manufacturing sector.

A. According to Use of Service Inputs
Services include a vast range of inputs used by manufacturing sector firms, and manufacturing industries vary widely in their use of such inputs. Using manufacturing firm data from the Czech Republic, Arnold, Javorcik, and Mattoo (2011) show that the productivity of manufacturing industries that rely extensively on service inputs is affected more by reforms in the service sector. To investigate whether such a connection exists in Indian manufacturing, I use the 2003-2004 input-output table provided by the Ministry of Statistics and Programme Implementation of the Government of India to rank the industries according to their use of services (Government of India, Ministry of Statistics and Programme Implementation 2008). The five industries that together used more than 55% of the total service inputs used by the manufacturing sector are food and beverage, basic metals, chemical products, wearing apparel, and electric machinery. Figure 7 shows the TFPR distribution of these five industries compared to the others. The dispersion in TFPR is lower in the industries that use service inputs more intensively, reflecting lower misallocation in these industries.
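The ranking step can be sketched as below: given each industry's use of an input taken from an input-output table, pick the top five industries and report their combined share. The function name and the dictionary layout are hypothetical, and the same routine would apply to any input category.

```python
def top_k_share(use_by_industry, k=5):
    """use_by_industry: dict mapping industry name to the value of an
    input it uses. Returns (names of the k largest users, their combined
    share of total use across all industries)."""
    total = sum(use_by_industry.values())
    ranked = sorted(use_by_industry.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:k]
    return [name for name, _ in top], sum(value for _, value in top) / total
```

The returned share is a fraction between 0 and 1, so a value above 0.55 would match the threshold mentioned in the text.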

B. According to Raw Materials Contribution
The variance decomposition of the misallocation in factor markets in section V.B revealed that raw material distortion not only plays a vital role in explaining overall misallocation in TFPR but is also highly correlated with output distortion. Since raw materials in the model are produced by the manufacturing firms themselves, raw material distortion may partly reflect output distortion in the industries supplying raw materials. To investigate the source of such a distortion, I categorize industries according to their raw materials contribution, using the input-output table mentioned previously. The five manufacturing sector industries that together contributed almost 63% of raw materials to the same sector are food and beverage, textile, petroleum products, basic metals, and chemicals and chemical products. Next, I compare the output distortion in industries that contribute a large share of raw materials with the raw material distortion for all manufacturing sector firms. The solid line in Figure 8 shows the TFPR distribution with only output distortion using firms in the top five raw material contributing industries, while the dashed line shows the TFPR distribution with only raw material distortion using all firms in the manufacturing sector. The former distribution is more skewed than the latter, which suggests that raw material distortion in the manufacturing sector may partly reflect output distortions in the industries contributing most to manufacturing raw materials. This finding suggests that policies that reduce output distortion in industries that supply the lion's share of manufacturing raw materials should result in lower raw material distortion in the overall manufacturing sector.

VII. Misallocation and Firm Size
There is a body of literature on the sources of factor distortions. Banerjee and Duflo (2005) suggest that capital market distortions might originate from disparities in credit policy. Chatterjee (2011) mentions the unavailability of raw materials as a reason behind intermediate input distortions. Bhidé (2008) shows that in a developing country such as India, electricity connections from private and public enterprises might cause a distortion in energy prices. Hsieh and Klenow (2009) argue that government policy, especially size restrictions, might prevent firms from achieving an optimal scale, thereby creating an output distortion. They also consider firm size as an explanation for TFPR dispersion within an industry. Ha, Kiyota, and Yamanouchi (2016) show a nonlinear relationship between firm employment size and factor market distortions in the context of manufacturing firms in Viet Nam.
I now proceed to examine the relationship between firm size and distortion in factor markets. Table 7 presents regression coefficients from estimating this relationship. I use the logarithm of total labor employed as a measure of firm size. Column (1) uses the logarithm of firm-level output distortion as the dependent variable. Similarly, the dependent variables for columns (2), (3), (4), and (5) are the logarithms of firm-level capital, raw material, energy, and service input distortions, respectively. I control for industry fixed effects, ownership type (private, central government owned, state government owned, etc.), type of organization (individual proprietorship, partnership, co-operative society, etc.), and location of the firms.
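A stripped-down version of such a regression, with a single regressor and industry fixed effects only, can be sketched with the within transformation: demean both variables inside each industry and take the ordinary least squares slope. The function name is hypothetical, and the ownership, organization, and location controls described above are omitted for brevity.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """OLS slope of y on x with group fixed effects, obtained by
    demeaning x and y within each group before regressing."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for g, xi, yi in zip(groups, x, y):
        entry = sums[g]
        entry[0] += xi
        entry[1] += yi
        entry[2] += 1
    sxy = sxx = 0.0
    for g, xi, yi in zip(groups, x, y):
        sum_x, sum_y, n = sums[g]
        dx, dy = xi - sum_x / n, yi - sum_y / n
        sxy += dx * dy
        sxx += dx * dx
    return sxy / sxx
```

Here `groups` would hold industry codes, `x` the log of employment, and `y` the log of one distortion measure; a positive slope means larger firms within an industry face a larger measured distortion.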
I find a positive relationship between firm size and each type of distortion. Smaller firms in the formal or informal sector might be able to avoid some policy restrictions, unlike their larger counterparts. The assumption of monopolistic competition introduces a markup into the model. Though I assume all firms in an industry have the same markup, larger firms might have greater market power and larger markups, which in turn would create more output distortion as well as raw material distortion. It would be interesting to examine the effect of firm size on distortion once the assumption of a constant elasticity of substitution within an industry is relaxed.

VIII. Conclusion
I measure the aggregate misallocation of resources using firm-level data from both formal and informal manufacturing sectors in India for the survey year 2005-2006. I include energy distortion and service input distortion to extend existing research such as that of Hsieh and Klenow (2009) and Chatterjee (2011). The dispersion in TFPR within each industry is substantial, implying misallocation caused by distortion of factor resources. While energy distortion does not contribute much to aggregate misallocation, the effect of service sector input distortion is more pronounced. I further decompose the variance of TFPR to find the effect of each factor market distortion separately. I find that output distortion and raw material distortion contribute the largest share of aggregate misallocation. Reallocation of such factors within industries should result in the largest gain in TFP. Moreover, I find a high level of covariance between output and raw material distortion which, along with a further exploration within the manufacturing sector, suggests that some of the distortion in raw materials may reflect the output distortion in industries producing a larger share of the raw materials. I also uncover a puzzling result: when many factor distortions are included together, they offset each other's effects and result in lower aggregate misallocation. Although unexpected, this result may inspire further research in this field.