An Empirical Estimation of Asia's Untapped Regional Integration Potential Using Data Envelopment Analysis

This paper uses bilateral flow data on multiple dimensions of economic integration to construct a composite index of regional integration outcomes covering 19 regions in various parts of the world. As a first step, the multidimensional indicator is used to rank regions according to their current degree of regional integration, which allows for a direct comparison of Asia's regional integration performance with those of other regions of the world. As a second step, the constructed indicator of regional integration outcomes is used as the output variable in a data envelopment analysis to estimate Asia's untapped regional integration potential.


I. INTRODUCTION
Regional integration is at the center of the current debate on strategies for optimal growth and development policies. Many authors have stressed the role of regional integration in achieving economies of scale, improving market structures, and enhancing the forces of competition. This would drive technological change and foster higher productivity growth and investment activities, which are often viewed as eventually leading to higher benefits from trade and positive welfare gains (Krugman 1991a, Baldwin and Venables 1995, Fernandez and Portes 1998, Sapir 2011). Regional integration is also frequently seen as a possible "building block" for greater trade liberalization and multilateralism (Bhagwati 1993; Baldwin 2006; Calvo-Pardo, Freund, and Ornelas 2011). In addition, there may be important noneconomic benefits of regional integration that go beyond raising national income levels and reducing poverty (Bhattacharyay, Kawai, and Nag 2012).
While the academic debate on regionalism has also produced various studies arguing against regional integration efforts (e.g., compared with multilateral trade liberalization within the World Trade Organization; see, for example, Krugman 1991b; Frankel, Stein, and Wei 1995), the empirical evidence over the past 2 decades shows that the focus of trade policy has shifted toward regional approaches. The rising prominence and increasing number of preferential trade agreements (PTAs) and regional trade agreements (RTAs) are evidence of this (WTO 2011).
Compared to the sometimes euphoric perceptions of policy makers, the empirical evidence on regional integration outcomes is, however, rather limited. As De Lombaerde et al. (2008) argue, there is a need for quantitative measures and empirically verifiable analyses of regional integration outcomes, which this paper seeks to address. Most of the existing studies on regional integration can be classified into two groups. The first comprises papers that discuss regional integration at an institutional level, looking for example at subregional organizations or multilateral free trade agreements (often referring to the stages of integration defined by Balassa 1961). Most of these studies focus on theoretical considerations and are based on qualitative arguments.
In the second group are studies that investigate effective degrees of economic integration using empirical data. Most of these papers investigate only a single dimension of integration, such as the large literature on trade, or studies on migration. This paper follows the empirical approach of this latter class of studies, but combines data on multiple areas of integration into a single indicator. This allows for an estimation of realized degrees of regional integration along various dimensions and enables the results for Asia's regions to be compared with those from other regions of the world. A constructed composite index of regional integration outcomes is then used as the output variable in a data envelopment analysis to estimate Asia's untapped regional integration potential. The results suggest heterogeneous, but on average large, possible increases in regional integration levels across Asia, given the current status of institutional conditions and available resources.
The remainder of the paper is structured as follows. Section II introduces the data sources and explains the applied methods for the construction of a composite regional integration (CRI) index and the performance of a data envelopment analysis (DEA). The results are presented in Section III and tested for their robustness in Section IV. Section V concludes.

II. DATA AND APPLIED METHODS
In order to estimate regional integration potential in Asia, as a first step a composite index of regional integration outcomes based on empirical data for various areas of economic integration is constructed. 1 Any such composite index depends on the data used and the chosen aggregation methods. Although several authors have recently proposed procedures to construct such an index, no standard procedure has been established in the literature so far (see also De Lombaerde et al. 2008). The methods applied in this paper are specifically designed to capture integration outcomes along multiple distinct dimensions in a coherent and transparent way and to aggregate the data to ensure comparability of variables with different scales and units of measurement. Alternative measurement and weighting schemes are discussed as part of the robustness checks in Section IV.

A. Aggregation and Normalization
Following Nardo et al. (2008), a first step in constructing composite indexes is to select a set of empirically quantifiable variables that serve as proxies for the multiple dimensions of regional economic integration outcomes being considered. In order to keep the data comparable, intraregional shares of directed flow variables are used as a single measure for all dimensions of economic integration. 2 Based on a bilateral data matrix containing information about directed flows $x_{ij}$ between economies $i$ and $j$, the intraregional share is defined as the fraction of flows between the economies in region $r$ (denoted $X_{rr}$) and total flows between those economies in $r$ and all economies in the world ($X_{rw}$), which can be calculated as:
$$S_r = \frac{X_{rr}}{X_{rw}} = \frac{\sum_{i \in r} \sum_{j \in r} x_{ij}}{\sum_{i \in r} \sum_{j \in w} x_{ij}} \qquad (1)$$
Because of the limited availability of global bilateral datasets, the selection of variables to be included in the composite index is restricted. However, for a number of relevant dimensions of economic integration such data do exist, including: (i) cross-border mobility for migration and tourism, (ii) trade and investment, and (iii) monetary and financial integration. 3 For each of these areas, data on several variables are available that cover most economies in recent years. Most of the variables used in this analysis come from the International Monetary Fund (IMF) and World Bank datasets, as well as from the Asian Development Bank integration indicators database (see Table 1 for a complete list of data sources).
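The intraregional share defined above can be illustrated with a minimal Python sketch. It assumes that directed flows are stored in a dictionary keyed by (origin, destination) pairs and that the denominator counts only flows originating in the region; both choices, the function name `intraregional_share`, and the flow values are assumptions made for illustration and are not taken from the paper's data.

```python
def intraregional_share(flows, region):
    """Share of a region's outgoing flows that stay inside the region.

    flows  -- dict mapping (origin, destination) pairs to a non-negative flow
    region -- set of economy codes that form the region
    """
    intra = 0.0  # flows between economies of the region
    total = 0.0  # flows from economies of the region to any economy
    for (origin, destination), value in flows.items():
        # Skip flows that start outside the region and self-flows.
        if origin not in region or origin == destination:
            continue
        total += value
        if destination in region:
            intra += value
    if total == 0:
        raise ValueError("region has no recorded outgoing flows")
    return intra / total


# Hypothetical flows between four made-up economies A-D (not actual data).
flows = {
    ("A", "B"): 30.0, ("B", "A"): 20.0,
    ("A", "C"): 50.0, ("B", "D"): 100.0,
    ("C", "A"): 70.0,
}
share = intraregional_share(flows, {"A", "B"})
print(round(share, 3))
```

In this toy example, 50 of the 200 units flowing out of economies A and B stay within the region, so the share is one quarter. Whether incoming flows should also enter the denominator depends on the variable and is left open here.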
The composite index of regional integration outcomes is based on each region's performance along the considered variables and constructed as shown in Figure 1. At each aggregation level, equal weights are assigned to the respective subindicators (see Section IV for a discussion of different weighting schemes) and all variables are normalized such that higher values indicate a higher degree of regional integration. The range of possible values is between 0 and 1 for all variables. For indicator $k$ this is achieved by calculating the distance to the sample maximum, setting the normalized value for region $r$ equal to:
$$x^*_{kr} = \frac{x_{kr}}{\max_{s \in R} x_{ks}} \qquad (2)$$
where $R$ denotes the set of regions in the sample.
1 The term "region" is used in this paper to refer to a set of (mostly bordering) economies located in the same geographical area.
2 Other possible measures of regional integration outcomes include intraregional correlation coefficients and intensity indices.
3 For other areas, such as regional public goods, no adequate datasets could be identified.
For all variables that are measured according to a predefined scale (e.g., the Logistics Performance Index [LPI]) the distance to the theoretically maximal attainable value is used (5 in case of the LPI). 4
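The normalization and equal-weight aggregation described in this subsection can be sketched as follows. The function names, region labels, and raw values are hypothetical and serve only to illustrate the mechanics; dividing by the reference maximum is one reading of the "distance to the maximum" described in the text.

```python
def normalize_by_max(values, scale_max=None):
    """Rescale raw indicator values to [0, 1]; higher means more integrated.

    values    -- dict mapping region name to a non-negative raw value
    scale_max -- theoretical maximum for indicators on a predefined scale
                 (e.g., 5 for the LPI); if None, the sample maximum is used
    """
    reference = scale_max if scale_max is not None else max(values.values())
    return {region: value / reference for region, value in values.items()}


def equal_weight_index(*subindicators):
    """Average several normalized subindicators with equal weights."""
    regions = subindicators[0].keys()
    return {
        region: sum(sub[region] for sub in subindicators) / len(subindicators)
        for region in regions
    }


# Hypothetical raw values for three made-up regions (not taken from Table 2).
trade_share = {"R1": 0.60, "R2": 0.30, "R3": 0.15}
mobility_share = {"R1": 0.20, "R2": 0.40, "R3": 0.10}
lpi_score = {"R1": 4.0, "R2": 3.0, "R3": 2.5}

trade_norm = normalize_by_max(trade_share)           # relative to sample maximum
mobility_norm = normalize_by_max(mobility_share)     # relative to sample maximum
lpi_norm = normalize_by_max(lpi_score, scale_max=5)  # relative to scale maximum
composite = equal_weight_index(trade_norm, mobility_norm)
print(composite)
```

With a sample maximum as the reference, at least one region always attains the value 1 for each indicator, whereas with a theoretical maximum no region needs to reach 1.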

B. Economy Groupings and Missing Data
The sample consists of 19 regions, comprising a total of 186 economies (see Table A.1 in the Appendix for the groupings). For some of the variables used, data on additional economies not listed in the Appendix are available. These economies are included in the calculation of total flows between individual regions and the world as part of the world aggregate.
For all variables, data on some economies are missing and hence the affected regions are represented by only a subset of their economies (see Table A.2 in the Appendix for numbers of available economies). The average coverage across all variables is about 80% of each region's economies and, with the exception of the IMF's Coordinated Portfolio Investment Survey dataset, the coverage is never below 50% for any variable and region. For the two variables on monetary and financial integration (cross-border bond and equity holdings), data are available for only 40% of the economies (e.g., many of the island states in the Pacific and Caribbean are missing) and only two African economies are included (Egypt and South Africa).
Table 1 (excerpt)
Description: Overall LPI score, based on: efficiency of customs clearance process, quality of trade and transport infrastructure, ease of arranging shipments, quality of logistics services, ability to track and trace consignments, timeliness of shipments (1 = low to 5 = high), and export/import conditions measured as the distance to the "frontier," representing the best performance observed on the topics: documents (number), time (days), and cost ($ per container) associated with exporting/importing a standardized cargo by sea (0 = lowest to 100 = highest performance)
Source: Logistics Performance Index (2014*) and Doing Business Database (2015), World Bank
Variable: Business regulation environment
Description: Overall distance to the "frontier," representing the best performance observed on the topics: starting a business, dealing with construction permits, getting electricity, registering property, getting credit, protecting minority investors, paying taxes, enforcing contracts, and resolving insolvency (0 = lowest to 100 = highest performance)
Source: Doing Business Database (2015), World Bank
FDI = foreign direct investment, LPI = logistics performance index.
* For economies for which data for 2014 are missing, the next available year has been used.
Source: Author.
In order to correct for the bias that would occur for the African regions if this subindicator were simply excluded from the computation of the respective composite regional integration (CRI) index for these regions, an attempt was made to impute the missing values for these cases. This was done by using the average of the two available economies for the three African regions that do not have any observations. Although this procedure represents only a very rough approximation, it is likely to significantly reduce the bias that would otherwise occur. 5 When the CRI index is computed without taking into account monetary and financial integration at all, the resulting ranking differs only slightly and none of the imputed regions is severely affected (Western Africa is placed two ranks higher, Eastern and Middle Africa remain the same). This indicates that the imputed values are not driving the results for these regions.

C. Global Comparison of Composite Regional Integration Levels
The resulting values for the CRI index are shown in Table 2, along with normalized intraregional shares for the three areas of economic integration considered (columns 2-4) and input-related variables (columns 5-7). When looking at simple averages over continents, Europe clearly has the highest result. Asia achieves a value only slightly below the average of the four American regions, while Africa lags behind. All continents are characterized by considerably heterogeneous regional integration levels. In addition to the results based on the constructed CRI index, integration outcomes can also be compared separately for different areas of integration. The disaggregated results on individual dimensions of economic integration (columns 2-4) show that East Asia has the second highest value for trade and investment, ranking slightly above North America and below only Western Europe. While North America clearly has the highest result for cross-border mobility, East, Southeast, and West Asia all achieve values that place them within the range of values obtained by South America and Western Europe. When comparing the Pacific islands and Oceania with the Caribbean, both regions obtain very similar results for trade and investment, and for monetary and financial integration, although the Pacific and Oceania have significantly higher values for cross-border mobility (which may be driven by Australia and New Zealand). The largest gap between Western Europe and all other regions appears to be in monetary and financial integration.
As shown in Section IV, these results remain almost unchanged when different weighting schemes of subindicators are used, suggesting that the findings are relatively robust against moderate changes in the construction of the CRI index (Table 4).
Table 2 notes: [...] Figure 1). c Average of columns 6-7 (see main text). d Normalized economy averages based on corresponding variables described in Table 1. Source: Author's calculations.

D. Data Envelopment Analysis
DEA is a nonparametric approach for estimating production frontiers and can be used to measure relative efficiency rates across a set of comparable units of observation. The method has been applied to a wide range of fields, including an assessment of the efficiency of health and education expenditures in developing countries (Herrera and Pang 2005) and public sector efficiency in Europe (Afonso et al. 2005). In estimating production inefficiencies, the DEA approach assumes the existence of a convex production frontier defined by the maximal attainable output for a given input level. Efficiency is measured as the distance from the observed input-output combination to the efficient frontier. In particular, a unit is considered to be relatively inefficient if another unit uses less or an equal amount of inputs to generate more or the same amount of output. The range of possible values is from 0 to 1, and all units located on the frontier are assigned the maximum value of 1.
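For reference, the efficiency measure described above is commonly formalized as a linear program. The block below states the standard output-oriented formulation under variable returns to scale (the specification reported in the notes to Table 3) for a unit $o$ with input $x_o$ and output $y_o$, evaluated against $n$ units. The symbols $\phi$, $\lambda_j$, $x_j$, and $y_j$ are notation assumed here for illustration and do not appear in the paper.

```latex
\begin{aligned}
\max_{\phi,\,\lambda}\quad & \phi \\
\text{s.t.}\quad & \sum_{j=1}^{n} \lambda_j x_j \le x_o, \\
& \sum_{j=1}^{n} \lambda_j y_j \ge \phi\, y_o, \\
& \sum_{j=1}^{n} \lambda_j = 1, \qquad \lambda_j \ge 0 \quad (j = 1, \dots, n).
\end{aligned}
```

Under this formulation, a score between 0 and 1 would correspond to $1/\phi^*$, where $\phi^*$ is the optimal value; units on the frontier have $\phi^* = 1$ and hence a score of 1.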
In the specific context of this study, the underlying intuition behind the applied DEA is that regions that feature the same enabling environment for economic integration (i.e., quality of cross-border infrastructure and institutional arrangements that facilitate multinational private sector activities) should in general also be able to attain similar levels of regional integration outcomes. Any estimated inefficiencies are hence interpreted as untapped potential in regional integration outcomes. It is important to note that the resulting values are based on currently available resources and conditions rather than on potential future developments. The study therefore does not seek to generate forecasts of further integration potential corresponding to possible scenarios of enhancements in economic conditions or political changes. Instead the analysis is designed to compare levels of integration outcomes across different regions and to identify those regions that, relative to others, seem to achieve lower levels of regional integration than they should potentially be able to.
Since all resulting values are estimated relative to the performance of other regions, the corresponding estimates for a specific region are dependent on the set of other regions included in the analysis. This feature can be used to derive different results for the Asian regions corresponding to a lower and upper bound of Asia's regional integration potential. For the derivation of a lower bound, regional integration potential is estimated using only the Asian regions in the analysis. This approach compares input-output combinations across the considered Asian regions and estimates the production possibility frontier based on the most integrated regions in Asia only. Including further regions in the analysis moves the frontier outwards (e.g., as highly integrated European regions become additional possible benchmarks), which increases the estimated untapped potential for Asia. The inclusion of all 19 regions leads to the estimation of a current upper bound. 6
In order to apply this approach to estimate untapped regional integration potential, the CRI index constructed above is used as the output variable in the DEA. The considered input variables are chosen as proxies of two relevant dimensions of the enabling environment for regional integration: the quality of cross-border infrastructure and institutional arrangements that facilitate private sector activities leading to increased economic integration. The data come from the World Bank's Logistics Performance Index and the Doing Business database (see Table 1 for a complete list of the considered variables and data sources). While there are many other possible drivers of regional integration outcomes ranging from geographical features (e.g., distance and natural characteristics) to cultural factors (e.g., common language), this study focuses on conditions that are substantially determinable by governments and policy makers.
In addition, the included indicators of cross-border infrastructure (time and cost associated with exporting and importing) may also partially capture geographic conditions, as they represent de facto distances between economies in terms of transportation time and cost. 7 The inclusion of variables from the Doing Business database is based on the view that private sector activities constitute an important driving force of regional integration outcomes (see, for example, Peng 2002, Yoshimatsu 2002). All input variables are normalized and aggregated to a single input index using the same methods as described above.
6 Note that the upper bound is likely to be underestimated by the DEA approach, since regions located on the frontier have reached 100% of their potential by definition, even though they too may have scope for further enhancement.
7 Following a similar line of reasoning, the measures of customs clearance efficiency, cost, and documents associated with cross-border transportation are likely to also represent the scope of institutional integration achieved in terms of trade agreements and other forms of regional cooperation. Since the outcomes for cross-border trade and mobility are more likely to depend on the actual conditions than those agreed upon in free trade and similar agreements, no additional measure of the institutional conditions is included. The role of other potential factors may be investigated in future studies.

III. ESTIMATION RESULTS
An output-oriented DEA is performed using the software tool DEAP 2.1 (Coelli 1996; Coelli et al. 2005) to estimate each region's untapped integration potential. Figure 2 shows the resulting production possibility frontier for the six considered Asian regions (solid line) and for the full sample of 19 regions (dashed line), corresponding to the lower and upper bound, respectively. The resulting estimates for untapped integration potential are presented in Table 3, along with each region's rank. Larger ranks correspond to smaller estimated values and indicate higher potential for increased integration levels (an estimated value of 1 indicates the region is located on the corresponding frontier).
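To make the mechanics concrete, the following is a minimal pure-Python sketch of an output-oriented DEA under variable returns to scale for the special case of one input and one output. It is not the DEAP 2.1 implementation used in the paper: it replaces the general linear program with an enumeration of single units and pairs of units, which suffices only in the single-input, single-output case. The function name `dea_scores` and all index values are hypothetical.

```python
from itertools import combinations


def dea_scores(inputs, outputs):
    """Output-oriented VRS efficiency scores for one input and one output.

    For each unit, the frontier output at its input level is the largest
    output of any convex combination of units using no more input. With a
    single input and output, single units and pairs of units are enough.
    """
    n = len(inputs)
    scores = []
    for o in range(n):
        x_o, y_o = inputs[o], outputs[o]
        # Best output among single units that use no more input than unit o.
        best = max(outputs[j] for j in range(n) if inputs[j] <= x_o)
        # Convex combinations of two units whose combined input equals x_o.
        for j, k in combinations(range(n), 2):
            lo, hi = (j, k) if inputs[j] <= inputs[k] else (k, j)
            if inputs[lo] <= x_o < inputs[hi]:
                w = (x_o - inputs[lo]) / (inputs[hi] - inputs[lo])
                best = max(best, (1 - w) * outputs[lo] + w * outputs[hi])
        scores.append(y_o / best)
    return scores


# Hypothetical input index and output index for four made-up regions.
inputs = [0.2, 0.4, 0.8, 0.6]
outputs = [0.1, 0.4, 0.6, 0.25]

full_scores = dea_scores(inputs, outputs)
# Dropping the third region (the strongest benchmark) mimics a smaller sample.
subset_scores = dea_scores(inputs[:2] + inputs[3:], outputs[:2] + outputs[3:])
print(full_scores, subset_scores)
```

In this toy example the last region scores lower in the full sample than in the reduced sample, because the additional benchmark moves the frontier outward. This is the same mechanism that separates the lower and upper bounds for the Asian regions.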

Figure 2: Regional Integration Frontier
Notes: Plotted lines represent production possibility frontiers for the sample consisting of six Asian regions (solid line) and the full sample of 19 regions (dashed line). See Table 2 and main text for details on the composite regional integration (CRI) index and input index. Source: Author's calculations.
Based on the results for the global sample, South and Central Asia have the largest unused integration potential among the Asian regions. Their scores are around 0.3, suggesting that the two regions are currently only achieving about 30% of their possible integration levels (based on the specification corresponding to an upper bound estimate). East, Southeast, and West Asia all achieve scores of around 0.6, indicating they are relatively nearer to the estimated frontier, but there is still considerable scope for increases in integration levels.
The estimation based solely on the Asian regions yields additional results. With the exception of South Asia, the order of obtained ranks is qualitatively the same, but as expected the absolute estimated scores are much higher (as very integrated regions such as Western Europe are no longer serving as benchmarks). Based on South Asia's input values and currently achieved integration level, the region is at the lower end of the corresponding frontier (with an assigned score of 1). This result highlights that, according to the DEA approach, the regions located at the frontier are assumed to achieve their full potential by definition mainly because no other regions exist in the sample that can serve as a corresponding benchmark. In order to overcome this limitation, the results for the full sample and the Asian specification can be combined to derive a rough assessment of the magnitude of untapped integration potential corresponding to the range between the lower and upper bound. For East and Southeast Asia this yields values between 0 and around 40% of unused potential, while for South Asia the upper bound of untapped potential is 70%.
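The combination of the two specifications amounts to simple arithmetic: untapped potential is one minus the DEA score, and the two samples give the ends of the range. The sketch below uses the approximate scores quoted in the text; the function name `untapped_range` is hypothetical, and the Asian-sample score of 1 for East Asia is an assumption implied by the reported range of 0 to around 40%.

```python
def untapped_range(score_regional_sample, score_full_sample):
    """Return (lower, upper) bounds on untapped potential as fractions."""
    lower = 1.0 - score_regional_sample  # frontier from the regional sample
    upper = 1.0 - score_full_sample      # frontier from the full sample
    if lower > upper:
        raise ValueError("full-sample score should not exceed regional score")
    return lower, upper


# South Asia: score of 1 in the Asian sample, around 0.3 in the full sample.
south_asia = untapped_range(1.0, 0.3)
# East Asia: around 0.6 in the full sample; Asian-sample score assumed to be 1.
east_asia = untapped_range(1.0, 0.6)

for name, (lower, upper) in [("South Asia", south_asia), ("East Asia", east_asia)]:
    print(f"{name}: {lower:.0%} to {upper:.0%} untapped potential")
```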
Based on the results in Table 3, all continents feature regions with considerable untapped integration potential. On average, Europe and America achieve scores slightly above 0.50, which indicates there is still considerable scope for increases in integration (in particular for Southeastern Europe and the Caribbean). Asia's level of regional integration is found to be slightly below half of its estimated potential, representing the largest scope for further increases in regional integration levels in the sample. The results for Africa suggest that the continent is achieving around 70% of its current integration potential. 8
Table 3 notes: CRI = composite regional integration. Columns 3-6 report data envelopment analysis (DEA) scores and corresponding ranks based on output-oriented analysis and variable returns to scale (VRS). Input variable: index based on variables from the Doing Business and Logistics Performance Index databases (see Table 2, column 5); output variable: composite regional integration (CRI) index (Table 2, column 1). Source: Author's calculations.

IV. ROBUSTNESS CHECKS
As described in Section II, the construction of a composite regional integration index involves decisions on a number of possible normalization and aggregation methods which may crucially affect the obtained results. In order to test the robustness of the CRI index to different specifications in aggregation, Table 4 shows the resulting CRI values and rankings for different weighting schemes, including principal component analysis. The reported Spearman correlation coefficients represent a measure of the similarity between rankings, where a value of 1 indicates that both rankings are identical, and smaller values imply less agreement (a value of 0 indicates that the rankings are completely independent).
Table 4 notes: a Simple average, i.e., equal weights assigned to each subindicator (one-third), as in Table 2, column 1. b One-half assigned to trade and investment and one-quarter to each of the two other subindicators. c One-half assigned to monetary and financial and one-quarter to each of the two other subindicators. d One-half assigned to cross-border mobility and one-quarter to each of the two other subindicators. e The Spearman correlation coefficient ranges inside the interval [-1,1] and takes the value 1 if the agreement between two rankings is perfect (i.e., the two rankings are identical), the value 0 if the rankings are completely independent, and the value -1 if one ranking is the reverse of the other. Source: Author's calculations.
Table 5 notes: a Simple average, i.e., equal weights assigned to each subindicator (one-half), as in Table 2, column 5. b Two-thirds assigned to doing business index and one-third to logistics performance index. c Two-thirds assigned to logistics performance index and one-third to doing business index. d The Spearman correlation coefficient ranges inside the interval [-1,1] and takes the value 1 if the agreement between two rankings is perfect (i.e., the two rankings are identical), the value 0 if the rankings are completely independent, and the value -1 if one ranking is the reverse of the other. Source: Author's calculations.
For most regions, the respective rank changes only very slightly when different weighting schemes are used. Both the standard Pearson correlation coefficient and the Spearman correlation coefficient for rankings are always close to 1 and significant at the 1% level, suggesting that the results are relatively robust against moderate changes in the construction of the CRI index. Analogous results for the constructed input index are shown in Table 5. The resulting rankings are found to be a bit more sensitive to different aggregation methods, but correlation coefficients between the absolute values are always very close to 1. Based on these results, the presented findings in Section III are unlikely to be driven by the specific aggregation methods underlying the construction of the CRI and input indexes.
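The ranking comparison behind this robustness check can be sketched in a few lines of Python. The example recomputes a composite index under two weighting schemes (equal weights, and one-half on one subindicator with one-quarter on each of the others) and compares the resulting rankings with the Spearman coefficient for tie-free rankings. All function names and subindicator values are hypothetical and do not reproduce Table 4.

```python
def weighted_index(subindicators, weights):
    """Weighted average of normalized subindicators for each region."""
    regions = subindicators[0].keys()
    return {
        r: sum(w * sub[r] for w, sub in zip(weights, subindicators))
        for r in regions
    }


def ranks(scores):
    """Rank regions from 1 (highest score) downward; assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {region: position for position, region in enumerate(ordered, start=1)}


def spearman(rank_a, rank_b):
    """Spearman coefficient for two tie-free rankings of the same regions."""
    n = len(rank_a)
    d2 = sum((rank_a[r] - rank_b[r]) ** 2 for r in rank_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical normalized subindicators for four made-up regions.
trade = {"R1": 0.9, "R2": 0.6, "R3": 0.5, "R4": 0.2}
finance = {"R1": 0.8, "R2": 0.2, "R3": 0.6, "R4": 0.1}
mobility = {"R1": 0.7, "R2": 0.9, "R3": 0.3, "R4": 0.4}
subs = [trade, finance, mobility]

equal = ranks(weighted_index(subs, [1 / 3, 1 / 3, 1 / 3]))
finance_heavy = ranks(weighted_index(subs, [0.25, 0.5, 0.25]))
rho = spearman(equal, finance_heavy)
print(equal, finance_heavy, round(rho, 2))
```

In this toy example, two regions swap ranks under the alternative weights and the coefficient falls below 1. A coefficient close to 1, as reported in the text, indicates that such swaps are rare.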

V. CONCLUSION
The empirical findings presented in this paper are able to provide answers to two important questions. How integrated are Asian regions compared with other regions in the world, when looking at multiple dimensions of economic integration? And how large is the untapped potential of Asia's regions for further integration, based on currently available resources and institutional conditions?
Although quantitative magnitudes should be interpreted with caution, as data quality and availability for the considered areas of integration are limited, the resulting relative levels of regional integration outcomes indicated by the constructed composite index seem to be both plausible in comparison to the findings of other studies and robust to moderate changes in the applied construction methods. The presented results provide empirical evidence for the view held by many authors that Europe, in particular the Western European countries belonging to the European Union (EU), exhibits the highest level of integration worldwide (e.g., Ornelas 2010, Baldwin and Wyplosz 2006). While currently prevailing stages of institutional integration (e.g., following Balassa 1961) may be considerably lower in Asia than in the EU, the findings based on the CRI index indicate that East and Southeast Asia are achieving effective levels of economic integration that are comparable to those achieved by European regions and for most areas of integration higher than any region in Africa and Latin America.
Including the constructed CRI index as an output variable in DEA suggests that most parts of the world seem to have considerable scope for further integration that is not based on possible future changes in economic conditions or political reforms, but on the current status of available resources and institutions. On average, Asia is estimated to achieve around half of its current potential in regional integration outcomes and South and Central Asia are found to have the largest untapped potential among the Asian regions.
In addition to the purely descriptive results based on the CRI index that allow for a global comparison of currently achieved levels of regional economic integration outcomes, several possible conclusions for Asia can be derived from the presented findings. First, East and Southeast Asia are achieving considerably higher integration outcomes than other Asian regions and may be considered as Asian benchmarks for future policies directed at increasing regional integration levels. However, based on the current level of cross-border infrastructure, institutional environment, and observed integration outcomes, the regions facing conditions more similar to those of the other Asian regions seem to be Northern and Southern Africa, and Central and South America.
Asian regions are achieving comparable levels of integration for cross-border mobility, trade, and investment, but monetary and financial integration seems to be lower than for those regions that overall feature similar CRI levels. South and Central Asia achieve particularly low levels, whereas the outcomes for East and Southeast Asia appear to constitute the largest gap across areas of economic integration compared to Western Europe. This highlights the importance of financial and monetary integration in achieving similar composite integration levels as those obtained by the most integrated regions in the sample.
While the analysis has focused exclusively on effective levels of economic integration, the findings may also be used as a basis for discussions on further advances in integration at an institutional level, for example by informing decision makers about current levels of economic integration, and designing policies addressing the identified magnitudes of currently untapped integration potential.
Columns 9 and 10 represent index scores based on scales ranging from 1 = lowest to 100 = highest (Doing Business) and 1 = lowest to 5 = highest performance (Logistics Performance Index). Note that many datasets record values only if they exceed a certain threshold (e.g., for cross-border bond and equity holdings, 0 indicates a value of less than $500,000). Source: Author's calculations.