Intellectual and social similarity among scholarly journals: An exploratory comparison of the networks of editors, authors and co-citations

Abstract This paper explores, by using suitable quantitative techniques, to what extent the intellectual proximity among scholarly journals is also proximity in terms of social communities gathered around the journals. Three fields are considered: statistics, economics and information and library sciences. Co-citation networks represent intellectual proximity among journals. The academic communities around the journals are represented by considering the networks of journals generated by authors writing in more than one journal (interlocking authorship: IA), and the networks generated by scholars sitting on the editorial board of more than one journal (interlocking editorship: IE). Dissimilarity matrices are considered to compare the whole structure of the networks. The CC, IE, and IA networks appear to be correlated for the three fields. The strongest correlation is between CC and IA for the three fields. Lower and similar correlations are obtained for CC and IE, and for IE and IA. The CC, IE, and IA networks are then partitioned in communities. Information and library sciences is the field in which communities are more easily detectable, whereas the most difficult field is economics. The degrees of association among the detected communities show that they are not independent. For all the fields, the strongest association is between CC and IA networks; the minimum level of association is between IE and CC. Overall, these results indicate that intellectual proximity is also proximity among authors and among editors of the journals. Thus, the three maps of editorial power, intellectual proximity, and authors communities tell similar stories.


INTRODUCTION
The main objects analyzed in this paper are scholarly journals and communities gathered around them. Scholarly journals have grown in relevance as outlets for communicating research results in the social sciences and humanities (Kulczycki et al., 2018), following a trend that began in the natural sciences a century earlier (Csiszar, 2018). Over the last two decades, in the context of the publish-or-perish environment, where the academic careers of scholars depend more and more on the "quality" of the journals in which they have published their articles, journals have gained a new importance as brands (Heckman & Moktan, 2018). It is therefore hardly surprising that the interest of scientometric scholars in journals is mainly focused on the building of indicators, such as the impact factor, to be used for evaluative purposes (Todeschini & Baccini, 2016). The analysis of scholarly journals as social institutions of science appears less developed. Indeed, scholarly journals connect members of academic communities (Potts et al., 2017). The editorial boards of journals constitute the first layer of such a community. They act as gatekeepers of science: They are, directly or indirectly, responsible for the refereeing processes and they decide which papers are worth publishing in their journals (Crane, 1967; Hoenig, 2015). The stronger the link between the prestige of journals and the career advancement of scholars, the stronger the academic power exercised by the members of an editorial board. From this point of view, it is possible to consider editorial boards as engines of academic power. A possible way to study the role of editors consists in observing the presence of the same editors on the boards of different journals. The network of journals generated by the presence of the same person on the editorial board of more than one journal is called an interlocking editorship (IE) network (Baccini, 2009; Baccini & Barabesi, 2011, 2014; Baccini et al., 2009). Thus, if two journals share the same persons on their editorial boards, it can be assumed that they have at least similar or complementary editorial policies, because they are managed by similar groups of scholars (Baccini et al., 2009). From another perspective, editors have the power to push the paper selection processes toward decisions favoring departmental colleagues, or disciples, and so on (Klein & DiCola, 2004; Laband & Piette, 1994). In this sense the IE network can be used to try to identify favoritism in the refereeing process (Erfanmanesh & Morovati, 2017) or to illustrate the self-referentiality of national communities of scholars (Baccini, 2009).
A second social community gathered around scholarly journals is constituted by the authors of the published articles. Although many studies exist about authorship and co-authorship, only a few are focused on the communities of authors of specific journals (Potts et al., 2017). In turn, it is possible to work analogously to the IE network by considering the journal network generated by scholars authoring papers in different journals. The network among journals generated by the presence of the same authors in different journals could be called the interlocking authorship (IA) network. To the best of our knowledge, this kind of network has been rarely explored (Brogaard et al., 2014; Ni, Sugimoto, & Cronin, 2013; Ni, Sugimoto, & Jiang, 2013). In the IA network, the proximity between two journals can be considered proportional to the number of common authors. Such a proximity is, in some sense, intellectual, because it is based on the choices made by authors on where to publish their papers, and on the decisions of editors to accept or reject those papers. The community of authors around a journal thus reflects to a certain degree the contents of the journal and the activity of the gatekeepers of the journal. If two journals are in proximity, it can be supposed that they have similar contents and that their editorial policies are similar or complementary.
Scholarly journals contribute to the definition of the intellectual landscapes of research fields. Co-citation analysis is probably the best known instrument for studying the intellectual proximity among authors, papers, and journals (Small, 1973). For instance, if two authors are frequently cited together in many different papers, this suggests that these two persons are somehow intellectually connected by the topic or methodology of their work. Similarly, two different journals often cited together in the same paper suggest that these journals are connected. The more often they are cited together, the stronger the link between these authors or journals. We thus obtain a network connecting the journals based on their being cited together often. Let us call this network CC, as it is based on a different measure than those obtained through IE and IA.
In this paper we consider the IE, IA, and CC networks of journals described above and compare the degree of proximity of journals in the three networks. The first intuitive question is to what extent these three networks are similar. If two journals are well connected in the CC network, that is, if they have strong intellectual proximity, does a similar proximity exist in the IA or IE networks? The basic idea is to explore to what extent the social proximity among journals observed in the network of the editorial boards is similar to the social/intellectual proximity observed in the IA network and to the intellectual proximity observed in the CC network. This question is explored by considering the IE, IA, and CC networks in three fields: economics (EC), statistics (STAT), and information and library science (ILS). Two reasons justify the choice of the three fields. The first is practical: For the three fields, data on the editorial boards of journals were already available because they had been collected by two of the authors in a previous research project. Data on editorial boards have to be collected by hand. Hence, their availability is a big advantage. The second reason is that scholars in the three fields differ in the way they use scholarly journals as outlets for publishing research results. Although in statistics, journal articles are largely dominant, scholars in economics and ILS continue to write book chapters and books (Kulczycki et al., 2018). Hence the similarity analysis considered three different scholarly communication contexts. For each field we compare the three networks as a whole by using suitable statistical techniques. Subsequently, for each field, we partition the three networks into "communities of journals" and we analyze the coherence of these communities between pairs of networks.
The paper is organized as follows. Section 2 describes the network data used in the paper. Section 3 studies the dissimilarities and the generalized distance correlations between networks. Section 4 contains the analysis of the correlation between detected communities. Section 5 discusses the main results and concludes.

JOURNAL NETWORKS DATA
The journal networks considered here are all one-mode (Wasserman & Faust, 1994). In an IE network, nodes are scholarly journals and the edge between two journals indicates that at least one scholar sits on the board of both. Each edge can be weighted by the number of common editors between the linked journals. Analogously, in the IA networks, the edges between journals are generated by common authors and the weight of the edge is the number of common authors. Finally, in a CC network, the edge between journals is generated by the fact that the two journals are cited together in at least one article; the weight of the edge is the number of articles citing the two journals together.
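The construction just described can be sketched in a few lines. The following is only an illustration, not the procedure used to build the nine networks: the function name interlock_network and the toy boards are hypothetical. It assumes each journal is given as a set of identifiers (editors for IE, authors for IA, or citing articles for CC) and weights each edge by the size of the intersection.

```python
from itertools import combinations


def interlock_network(membership):
    """Project a journal -> set-of-identifiers mapping onto a weighted one-mode network.

    Returns a dict mapping each unordered journal pair (a sorted tuple) to the
    number of identifiers the two journals share. Pairs sharing nothing are
    omitted, i.e. they have no edge.
    """
    edges = {}
    for a, b in combinations(sorted(membership), 2):
        shared = len(membership[a] & membership[b])
        if shared > 0:
            edges[(a, b)] = shared
    return edges


# Hypothetical editorial boards (illustrative names, not data from the paper).
boards = {
    "Journal A": {"Rossi", "Smith", "Chen"},
    "Journal B": {"Smith", "Chen", "Garcia"},
    "Journal C": {"Garcia", "Dubois"},
    "Journal D": {"Tanaka"},
}
ie_edges = interlock_network(boards)
print(ie_edges)
```

In this toy example Journal D shares no editor with any other journal, so it has no edges and would appear as an isolated node. Passing sets of author names yields an IA-style network, and passing, for each journal, the set of articles that cite it yields CC-style weights.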
We have constructed the three networks (IE, IA, CC) for the three fields, for a total of nine networks. For IE networks, as anticipated, we used three existing databases, each containing the journal editorial boards in a given year. Details on their collection and normalization can be found in the papers referenced below. Moreover, IA and CC networks were constructed by using Web of Science (WoS) data for a 5-year period, starting from the year for which the IE was recorded. The raw data for the nine one-mode networks can be downloaded from https://doi.org/10.5281/zenodo.3350797.
For economics, we considered a set of 169 journals listed in the EconLit database and indexed in the Journal Citation Reports for the year 2006. The IE network (Figure 1) was extracted from the database collected by Baccini and Barabesi (2010) for the year 2006. The IA (Figure 2) and CC (Figure 3) networks for economic journals were built on WoS data by considering respectively the authors of and the references in the papers published in the journals in the years 2006-2010.
For the field of statistics, the set includes the 79 journals listed in the category "Statistics and probability" of the Journal Citation Reports for the year 2005. IE data (Figure 4) are the ones collected in Baccini et al. (2009) for the year 2006. Similarly, for the discipline of statistics, IA (Figure 5) and CC (Figure 6) networks were built using WoS data by considering papers published in the years 2006-2010.

Quantitative Science Studies 279
Finally, for the domain of ILS, the set includes the 59 journals listed in the category "Information science and library science" of the Journal Citation Reports for the year 2008. IE data (Figure 7) are the ones collected in Baccini and Barabesi (2011) for the year 2010. Again, IA (Figure 8) and CC (Figure 9) networks were built on WoS data, by considering papers published in the years 2010-2014.
In Figures 1-9 the size of a node is proportional to its degree and the width of an edge is proportional to the value of the link. In the IE network, for example, the size of a node is proportional to the number of journals to which it is linked; the width of the link between two nodes is proportional to the number of their common editors. For each field, the visual comparison of the three networks is hardly informative. For instance, it is apparent that for all three fields, the IE networks are less connected than the IA and CC networks. Also, in the center of the networks there are not always the same journals, and a journal may have a different size in the three networks. We therefore need a better way of comparing networks.

DISSIMILARITIES AMONG NETWORKS
For each network, it is possible to build a pseudo-measure of the distance among journals by calculating a matrix of dissimilarities. The Jaccard index was adopted as a dissimilarity measure (for more details on the Jaccard index, see e.g. Levandowsky and Winter, 1971). More precisely, if A and B represent the sets containing the members of the editorial boards of two journals, the Jaccard dissimilarity is defined as

$$J(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|}.$$
As an example, in the IE network, the similarity among journals is proportional to the number of common editors on their boards. Hence, the minimum dissimilarity $J(A, B) = 0$ is reached when two journals have exactly the same editorial board (i.e., all the editors of a journal are also the editors of the other and vice versa). The maximum dissimilarity $J(A, B) = 1$ is reached when two journals have no editors in common. In order to compare the three dissimilarity matrices arising from co-citation, editorial board, and author networks for each discipline, we adopt the generalized distance correlation $R_d$ suggested by Omelka and Hudecova (2013) on the basis of the seminal proposal by Székely et al. (2007). It should be remarked that such a correlation index avoids the drawbacks emphasized by Dutilleul et al. (2000) when the classical Mantel coefficient is assumed instead (Omelka & Hudecova, 2013). Hence, we considered the three possible pairs of networks and we computed the corresponding values of $\sqrt{R_d}$ for each discipline. It is worth noting that $R_d$ is somehow similar to the squared Pearson correlation coefficient, and hence $\sqrt{R_d}$ should be interpreted as a generalization of the usual correlation coefficient. More precisely, $R_d$ is defined in the interval [0, 1] in such a way that values close to zero indicate no or very weak association, and larger values suggest a stronger association, which is perfect for $R_d = 1$; similar considerations obviously hold for $\sqrt{R_d}$ (for more details, see e.g. Omelka & Hudecova, 2013). The generalized distance correlation was evaluated in the R computing environment (R Core Team, 2018) by using the function dcor in the package energy (Rizzo & Székely, 2018). These values of $\sqrt{R_d}$ are reported in Table 1.
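To make the two steps above concrete, the following sketch computes Jaccard dissimilarity matrices for two toy networks and then a distance correlation between them. It is not the authors' R code: the function names and the toy boards and author sets are hypothetical. The correlation is the standard sample distance correlation of Székely et al. (2007), applied directly to the double-centred dissimilarity matrices, and that is assumed here to mirror what dcor returns for distance inputs.

```python
import math


def jaccard_dissimilarity(a, b):
    """Return 1 - |A & B| / |A | B| for two sets."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def dissimilarity_matrix(membership, journals):
    """Square matrix of Jaccard dissimilarities in the given journal order."""
    return [[jaccard_dissimilarity(membership[i], membership[j]) for j in journals]
            for i in journals]


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row_mean = [sum(row) / n for row in d]
    col_mean = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n)]
            for i in range(n)]


def mean_product(a, b):
    """Average of the entrywise product of two square matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2


def distance_correlation(d1, d2):
    """Square root of the ratio dCov^2 / sqrt(dVar1^2 * dVar2^2)."""
    a, b = double_center(d1), double_center(d2)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0:
        return 0.0
    # For arbitrary dissimilarities the numerator may be negative; clip at zero.
    return math.sqrt(max(mean_product(a, b), 0.0) / denom)


# Hypothetical boards and author sets for four journals (not data from the paper).
boards = {"A": {"Rossi", "Smith", "Chen"}, "B": {"Smith", "Chen", "Garcia"},
          "C": {"Garcia", "Dubois"}, "D": {"Tanaka"}}
authors = {"A": {"a1", "a2", "a3", "a4"}, "B": {"a2", "a3", "a5"},
           "C": {"a5", "a6"}, "D": {"a6", "a7"}}
journals = sorted(boards)
d_ie = dissimilarity_matrix(boards, journals)
d_ia = dissimilarity_matrix(authors, journals)
print(round(distance_correlation(d_ie, d_ia), 3))
```

The journal order must be identical in both matrices, because the comparison is made entry by entry.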
From the analysis of Table 1, the dependence between the considered dissimilarity matrices is apparent. Indeed, the observed values of $\sqrt{R_d}$ are greater than (or nearly equal to) the value 0.5 for each combination of networks in the three disciplines. Moreover, the permutation test for assessing independence, as proposed by Omelka and Hudecova (2013), was also carried out. The statistical details of the permutation test are rather involved, even if they are clearly explained by Omelka and Hudecova (2013). Loosely speaking, the rationale behind the test stems from the fact that, under the null hypothesis of independence, the generalized distance correlation should not be affected by a random permutation of the rows and the corresponding columns of the "centred" distance matrices. The permutation principle is widely adopted in order to carry out nonparametric inference, because assumptions are minimal and practical implementation is often straightforward (see e.g. Lehmann & Romano, 2005, Section 10). The permutation test of independence was in turn implemented by using the package energy (Rizzo & Székely, 2018). The significance of the test statistic was computed by means of the R function dcov.test (for more details see Omelka & Hudecova, 2013). On the basis of the achieved p-values given in Table 1, the independence hypotheses can be rejected at the significance level α = 0.01. Because the three statistical tests within each discipline are obviously dependent, we also consider the Bonferroni procedure in order to control the familywise error rate (for more details, see e.g., Bretz et al., 2011). Thus, by assuming such a procedure and a global significance level given by α = 0.01, the marginal independence hypotheses may be rejected if the corresponding p-values are less than α/3 = 0.0033, which is the case for all the considered tests except the one for the editorial board and author networks for ILS. However, it is worth remarking that, even in this case, the corresponding p-value is just slightly larger than the threshold. Hence, the co-citation, editorial board, and author networks display structures which may be considered associated for each considered discipline, at least on the basis of the considered dissimilarity matrices.
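The permutation rationale described above can be illustrated with a simplified sketch. It is not the dcov.test implementation: the function names, the number of permutations, the seed and the toy matrices are all assumptions made for the example. The test statistic is the mean entrywise product of the two double-centred matrices, and the rows and corresponding columns of one centred matrix are permuted jointly.

```python
import random


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row_mean = [sum(row) / n for row in d]
    col_mean = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - col_mean[j] + grand for j in range(n)]
            for i in range(n)]


def cross_mean(a, b, perm):
    """Mean of a[i][j] * b[perm[i]][perm[j]]: rows and columns of b move together."""
    n = len(a)
    return sum(a[i][j] * b[perm[i]][perm[j]]
               for i in range(n) for j in range(n)) / n ** 2


def permutation_p_value(d1, d2, n_perm=999, seed=0):
    """One-sided permutation p-value for association between two distance matrices."""
    a, b = double_center(d1), double_center(d2)
    identity = list(range(len(a)))
    observed = cross_mean(a, b, identity)
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if cross_mean(a, b, perm) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (1 + exceed) / (1 + n_perm)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under the Bonferroni procedure."""
    return alpha / n_tests


# Toy matrices: distances between points on a line, and their square roots.
positions = [0, 1, 3, 7, 15, 31, 63]
d_x = [[abs(p - q) for q in positions] for p in positions]
d_y = [[abs(p - q) ** 0.5 for q in positions] for p in positions]
p_value = permutation_p_value(d_x, d_y)
print(p_value, p_value < bonferroni_threshold(0.01, 3))
```

With three tests per field and a global level of 0.01, bonferroni_threshold returns the value of about 0.0033 used in the text.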

CORRELATIONS AMONG COMMUNITIES OF JOURNALS
The proximity among journal networks can be explored by focusing on communities of journals. The first step consists in detecting communities inside each network; the second in verifying the degree of association between the communities detected in different networks of the same field. A nonoverlapping community of nodes of a network is a set of nodes densely connected internally and only sparsely connected with external nodes. Each network is partitioned into communities by using the Louvain algorithm (Blondel et al., 2008) as implemented in the software Pajek (de Nooy et al., 2018). It consists in the optimization of the modularity of the network (Newman, 2004; Newman & Girvan, 2004). The quality of the partition is quantitatively measured by modularity values. Table 2 reports the values of modularities and the resolution parameters adopted for optimization. The resolution parameter is used to control the size of the communities detected; higher values of the parameter produce a larger number of communities and vice versa. Table 2 also reports the number of communities detected.
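The quantity that the Louvain algorithm tries to maximize can be written down compactly. The sketch below only evaluates the modularity of a given partition; it does not perform the optimization. It assumes the common formulation in which the resolution parameter multiplies the null-model term, which may differ in detail from the Pajek implementation. The function name and the toy network are hypothetical.

```python
def modularity(edges, community, resolution=1.0):
    """Weighted modularity of a partition of an undirected network.

    edges: dict mapping (u, v) pairs to positive weights, without self-loops.
    community: dict mapping every node that appears in edges to a label.
    Computes sum over communities of L_c / m - resolution * (d_c / (2 m)) ** 2,
    where m is the total edge weight, L_c the weight inside community c and
    d_c the total strength of the nodes in c.
    """
    total = sum(edges.values())
    internal = {}
    strength = {}
    for (u, v), w in edges.items():
        cu, cv = community[u], community[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(internal.get(c, 0.0) / total
               - resolution * (strength[c] / (2 * total)) ** 2
               for c in strength)


# Toy network: two triangles joined by a single edge, all weights equal to 1.
toy_edges = {(1, 2): 1, (2, 3): 1, (1, 3): 1,
             (4, 5): 1, (5, 6): 1, (4, 6): 1,
             (3, 4): 1}
two_groups = {1: "left", 2: "left", 3: "left", 4: "right", 5: "right", 6: "right"}
one_group = {node: "all" for node in range(1, 7)}
print(modularity(toy_edges, two_groups), modularity(toy_edges, one_group))
```

Under this formulation, increasing the resolution penalizes large communities more heavily, which is consistent with the statement that higher values yield more communities.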
For all the pairs of networks inside each research field, the association between the resulting communities is then analyzed by using statistical techniques as available in Pajek (de Nooy et al., 2018). All the indicators considered are adopted under an exploratory approach. $\chi^2$ statistics provide an index aiming to assess the degree of independence of the partitions of each pair of networks. Cramér's V is a measure of association giving a value between 0 (no association) and +1 (perfect association) (Cramér, 1946). Rajski's coherence (Legendre & Legendre, 1998) is presented in three variants, all defined in the [0, 1] range: a symmetrical version indicating the coherence between each pair of classifications and two asymmetrical versions called in Table 3 "Rajski's right" and "Rajski's left." When the communities in the IE-CC networks are considered, Rajski's left indicates the extent to which the first communities classification IE is able to predict the second communities classification CC; Rajski's right indicates instead the extent to which the second classification is able to predict the first. Finally, the adjusted Rand index measures the degree of association between partitions and is bounded between ±1 (Hubert & Arabie, 1985). All indices are reported in Table 3.
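Two of the indices listed above, Cramér's V and the adjusted Rand index, can be computed directly from the cross-tabulation of two partitions. The following sketch uses the textbook formulas (Cramér, 1946; Hubert & Arabie, 1985) rather than the Pajek routines. The function names and the toy partitions are hypothetical, and Rajski's coherence is deliberately left out.

```python
from collections import Counter
from math import comb, sqrt


def cramers_v(part1, part2):
    """Cramér's V for two partitions given as equal-length label sequences."""
    n = len(part1)
    table = Counter(zip(part1, part2))
    rows, cols = Counter(part1), Counter(part2)
    chi2 = 0.0
    for r, n_r in rows.items():
        for c, n_c in cols.items():
            expected = n_r * n_c / n
            chi2 += (table.get((r, c), 0) - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0  # assumption: V is set to 0 when a partition has one class
    return sqrt(chi2 / (n * k))


def adjusted_rand_index(part1, part2):
    """Adjusted Rand index of Hubert and Arabie (1985)."""
    n = len(part1)
    table = Counter(zip(part1, part2))
    sum_cells = sum(comb(v, 2) for v in table.values())
    sum_rows = sum(comb(v, 2) for v in Counter(part1).values())
    sum_cols = sum(comb(v, 2) for v in Counter(part2).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions put all nodes together
    return (sum_cells - expected) / (maximum - expected)


# Hypothetical community labels of six journals in two networks.
ie_communities = [0, 0, 1, 1, 2, 2]
cc_communities = ["a", "a", "b", "b", "c", "c"]
print(cramers_v(ie_communities, cc_communities),
      adjusted_rand_index(ie_communities, cc_communities))
```

Both indices depend only on which journals are grouped together, not on the labels used, so the two identical groupings above give the maximum value for both.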
For the three fields analyzed here, we observe that the IE is the least dense network and the network with the lowest average degree.For the three fields, the CC networks are in the intermediate position for density and average degree, and finally the IA networks have the highest values of density (0.91 for statistics) and average degree (Wasserman & Faust, 1994).
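For reference, density and average degree in an undirected one-mode network can be obtained as follows. This is a minimal sketch with a hypothetical function name and toy values, assuming a simple network without self-loops or multiple edges.

```python
def density_and_average_degree(n_nodes, edges):
    """Return (density, average degree) of a simple undirected network.

    edges is a collection of unordered node pairs; isolated nodes are
    counted through n_nodes even though they appear in no pair.
    """
    n_edges = len(edges)
    density = 2 * n_edges / (n_nodes * (n_nodes - 1))
    average_degree = 2 * n_edges / n_nodes
    return density, average_degree


# Toy network: four journals, three edges.
toy_pairs = [("A", "B"), ("B", "C"), ("A", "C")]
print(density_and_average_degree(4, toy_pairs))
```

A density of 0.91, as reported for the IA network in statistics, would mean that 91% of all possible journal pairs are linked.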
In general, the community detection algorithm was more successful in sparser networks: For the three fields, the values of modularity are indeed the highest for the IE network, intermediate for CC, and lowest for IA. In the IE networks many detected communities are actually isolated nodes (i.e., journals with no common editors with other considered journals). In every case, the number of communities detected in the IE networks is larger than the number of communities detected in the other networks.
Information and library sciences is the field where the communities are more easily detectable and more clearly defined, as shown by the highest modularity values and by the lowest values of the E-I indices (Table 2) (de Nooy et al., 2018). In particular for the IA network, communities were detected by adopting a resolution value of 0.8. This resolution was preferred to the value of 1 adopted for all the other networks, because the resulting communities exhibited better E-I indices. With a value of resolution of 1 the network is partitioned into four communities, with modularity 0.257, E-I unweighted = 0.255 and E-I weighted = −0.083. The E-I index was calculated as the difference between the number of edges between communities and the number of edges within communities; that difference is then divided by the total number of edges of the network. The weighted version of the index is calculated by considering the value of the edges. The range of the index is between −1 (all edges are inside communities) and 1 (all edges are between communities). The $\chi^2$ values show that the detected communities for the three networks are not independent. The association between the partitions of communities as measured by Cramér's V is high. The highest level of association as measured by the adjusted Rand index is found between communities detected in the CC and IA networks. Rajski's right indicates that the communities detected in the IE network predict well the communities detected in the other networks.
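The E-I index can be illustrated with a short sketch. It follows the sign convention implied by the stated range (−1 when all edges are inside communities, +1 when all edges are between communities), that is, external minus internal edges divided by all edges. The function name and the toy network are hypothetical.

```python
def ei_index(edges, community, weighted=False):
    """E-I index of a partition: (external - internal) / (external + internal).

    edges: dict mapping (u, v) pairs to edge values, with at least one edge.
    community: dict mapping every node to its community label.
    With weighted=True edge values are summed instead of counting edges.
    """
    internal = 0
    external = 0
    for (u, v), value in edges.items():
        amount = value if weighted else 1
        if community[u] == community[v]:
            internal += amount
        else:
            external += amount
    return (external - internal) / (external + internal)


# Toy network: two triangles joined by one heavy edge.
toy_edges = {(1, 2): 1, (2, 3): 1, (1, 3): 1,
             (4, 5): 1, (5, 6): 1, (4, 6): 1,
             (3, 4): 5}
toy_partition = {1: "left", 2: "left", 3: "left",
                 4: "right", 5: "right", 6: "right"}
print(ei_index(toy_edges, toy_partition),
      ei_index(toy_edges, toy_partition, weighted=True))
```

The weighted and unweighted versions can differ substantially, and even in sign, as in the values reported above for the IA network at resolution 1.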
The field of statistics is in an intermediate position: The values of modularity are very low for CC and IA networks, but nevertheless the resulting partitions have negative values of the E-I weighted indices. Communities in the IE network are more easily detectable and more clearly defined than in the IA and especially in the CC network. Also in this field, the $\chi^2$ values show that the detected communities for the three networks are not independent. The association between the partitions of communities is a bit higher between CC and IA than for the other pairs of networks. Also in this case Rajski's right indicates that the communities of the IE network predict well the communities in the other two networks.
For the case of economics, community detection is particularly problematic: Small changes in the value of the resolution parameter substantially changed the number of detected communities and the values of the indicators considered. For CC and IA, the community detection procedure results in very low values of modularity and in positive values of E-I. Only for the IE network is the modularity around 0.5 and the value of the E-I weighted index less than zero. Also in this field the $\chi^2$ values show that detected communities for the three networks are not independent. The association between the partitions of communities is the lowest of the three fields analyzed in this paper. Rajski's right indicates also for economics that the communities in the IE network predict the communities in the other two networks, but the values of the index are the lowest of the three.

DISCUSSION AND CONCLUSIONS
The main aim of this paper was to explore, by using suitable quantitative techniques, to what extent the intellectual proximity among scholarly journals is also a proximity in terms of social communities gathered around the journals.
To represent the intellectual proximity among journals we have used the CC network. For information about the academic communities around journals, we have considered the networks of journals generated by authors writing in more than one journal as well as the networks generated by scholars sitting on the editorial boards of more than one journal. The first step in the exploratory analysis consisted of comparing the whole structure of the networks on the basis of dissimilarity matrices. The CC, IE, and IA networks appear to be associated for all three considered fields. The second step consisted of partitioning the IE, IA, and CC networks into communities and then verifying the degree of association among the detected communities. The results of that analysis show that the communities detected in the three networks are not independent for the three research fields considered. The results of both approaches are coherent in showing that the strongest correlation between networks is between CC and IA for the three fields. Lower and similar correlations were obtained for CC and IE, and for IE and IA. When communities are considered, the strongest association between communities is between CC and IA networks; the minimum level of association is between IE and CC.
To the best of our knowledge, the only similar analysis was performed by Ni, Sugimoto, and Cronin (2013) in their investigation of scholarly communication.They focused on ILS by considering networks of journals generated by common authors, CC, common topics, and common editors.They descriptively compared clusters of journals between networks and calculated a correlation between pairs of matrices by using the quadratic assignment procedure.Their results appear to be coherent with those presented here, because they estimated statistically significant correlations for networks of journals based on authors, co-citation, and editors.
Overall, the results of our analysis show that intellectual proximity is also a proximity among authors and, more surprisingly, among editors of the journals. This leads to the question of whether the structures obtained could ever be independent if the same set of people were predominantly involved in the editorial boards, the publishing of papers, and the citing of papers. In that case the structures are just a consequence of the existence of a publishing and gatekeeping élite in the considered research fields. This is a topic worth investigating by considering the dual networks that we used for generating the nine one-mode networks analyzed in this paper. At the current state of knowledge, it is only possible to affirm that the map of editorial power, the map of intellectual proximity, and the map of author communities tell similar stories. The fact that the results are comparable for the three fields studied suggests that the method presented here is more generally applicable to any scientific field and that there should be in general a coherence among journals at the three scales of editorial boards, authors' choice of publications, and co-citations.

Figure 2. Interlocking authorship network of economic journals.

Figure 3. Co-citation network of economic journals.

Figure 6. Co-citation network of statistical journals.

Figure 7. Interlocking editorship network of information and library sciences journals.

Figure 8. Interlocking authorship network of information and library sciences journals.

Figure 9. Co-citation network of information and library sciences journals.

Table 1. Generalized distance correlations between networks

Table 3. Association indexes between communities

Table 2. Main features of networks and communities
i.e. journals with no common editors with other considered journals). In every case, the number of communities detected in the IE networks is always bigger than the number of communities detected in the other networks.