Who are the acknowledgees? An analysis of gender and academic status

Acknowledgements found in scholarly papers allow for credit attribution of nonauthor contributors. As such, they are associated with a different kind of recognition than authorship. While several studies have shown that social factors affect authorship and citation practices, few analyses have been performed on acknowledgements. Based on 878,250 acknowledgees mentioned in 291,167 papers published between 2015 and 2017, this study analyzes the gender and academic status of individuals named in the acknowledgements of scientific papers. Our results show that gender disparities generally found in authorship can be extended to acknowledgements, and that women are even more underrepresented in acknowledgement sections than in authors’ lists. Our findings also show that women acknowledge proportionally more women than men do. Regarding academic status, our results show that acknowledgees who have already published tend to have a higher position in the academic hierarchy compared with all Web of Science (WoS) authors. Taken together, these findings suggest that acknowledgement practices might be associated with academic status and gender.


INTRODUCTION
Acknowledgements found in scientific papers are a public testimony of authors' gratitude and recognition that can reveal contributions of varied nature made by individuals, institutions, and organizations. As such, acknowledgements have been positioned alongside authorship and citations as a form of scientific recognition in the "reward triangle" (Cronin, 1995). They also allow for the division of credit among authors and other contributors named in the acknowledgements section. In this sense, acknowledgements can illuminate "sub-authorship collaboration" (Patel, 1973, p. 81). In the reward system of science (Merton, 1973), where authorship constitutes the main means to accumulate "symbolic capital" (Bourdieu, 1975), a mention in the acknowledgements is not associated with the same kind of recognition as authorship. Moreover, given the hierarchical structure of the scientific community, it can be difficult to discern the reason justifying one's presence (or absence) in the authors' list, because credit attribution can be difficult to disentangle from one's status within the hierarchy (Heffner, 1979). Credit attribution does not exclusively rely on the nature of contributions made, and numerous other factors come into play, such as disciplinary context, structure of the project (Jabbehdari & Walsh, 2017), and one's position in the academic hierarchy (Cole & Cole, 1973; Merton, 1973; Zuckerman, 1977). Among those factors, gender, seniority, and academic status have been shown to have an effect on inclusion in authorship lists (Haeussler & Sauermann, 2013; Larivière, Desrochers, et al., 2016; Lissoni, Montobbio, & Zirulia, 2013). A recent survey of 6,673 researchers provided evidence that discipline, academic rank, and gender all affected, to various degrees, authorship disagreements in research teams (Smith, Williams-Jones, et al., 2019).

Nonauthor Collaborators
Authorship criteria have been the subject of numerous discussions in the last decades (e.g., Marušić, Bošnjak, & Jerončić, 2011; Sismondo, 2009; Wager, 2009; Wislar, Flanagin, et al., 2011). However, collaborators who are not authors have received less attention. Shapin's seminal work (1989) has shown that the essential role played by technicians in the scientific development of the 17th century has been completely obliterated from the history of science, as their contributions were not recorded anywhere, reflecting their invisible status at the time.
The professionalization of science has transformed the organization of scientific work, yet technicians' contributions remain invisible in many ways. Heffner (1979) was one of the first to investigate credit allocation in science using acknowledgements as recognition for contributions. Based on a questionnaire completed by 207 individuals named in acknowledgements of scientific papers (acknowledgees) from social and natural sciences, Heffner found that publication credit is not always attributed on the basis of universalistic principles, and that 12% of respondents reported having been excluded from the authors' list when they felt their contribution warranted authorship. Female PhDs were twice as likely as any other group in his sample (male and non-PhDs) to report exclusion from the authorship list when they believed they should have been named as an author. Laband and Tollison (2000), Laudel (2002), Ponomariov and Boardman (2016), and Bozeman and Youtie (2016) analyzed collaboration beyond lists of authors. Laband and Tollison (2000) compared the number of coauthors (formal collaboration) and the number of individuals mentioned in the acknowledgements (informal collaboration) in economics and biology. Although formal coauthorship was more frequent in biology, informal collaboration appeared more prominent in economics, demonstrating that disciplinary practices can affect collaboration in its forms and rewards. Based on interviews and publication analysis of 101 researchers, Laudel (2002) found that one third of all collaborators were nonauthors, and were only mentioned in the acknowledgements. Moreover, about half of the contributions were not publicly credited, and therefore remained invisible in formal communication channels. Ponomariov and Boardman (2016) surveyed 1,581 academic researchers and asked about their relationship with their closest collaborators. They showed that there are numerous dimensions to coauthorship and that collaboration often does not lead to coauthored papers. The authors conclude that the "fluid content and boundaries of collaborations" (p. 1959) call for data that go beyond traditional coauthorship lists. Bozeman and Youtie (2016) interviewed US researchers and analyzed their online posts on factors relating to perceived unwarranted exclusion from and inclusion in authors' lists. Their analysis shows that a few interacting variables can explain the perceived exclusion from authorship: the geographic separation of collaborators (especially the relocation of less-experienced individuals), differentials in power and experience, disagreements about the value of technical contributions, and gender dynamics.
More recently, Jabbehdari and Walsh (2017), and Paul-Hus, Mongeon, et al. (2017a) investigated the presence of nonauthor collaborators (i.e., those who contributed to a project but do not appear as authors) across research fields. Based on a survey of 1,643 authors, Jabbehdari and Walsh (2017) found that nonauthor collaborators are not rare and that their presence varies by discipline. The highest rates of nonauthor collaborators occurred in engineering and in agricultural sciences, while the lowest occurred in computer science and mathematics, and in physics and space science. Analyzing 362,767 scientific papers and their acknowledgements, Paul-Hus et al. (2017a) found that the mean numbers of acknowledgees (nonauthor collaborators) per paper were the highest in social sciences, biology, and earth and space, and the lowest in mathematics and chemistry. These findings show that traditional differences observed between disciplines in terms of team size are greatly reduced when acknowledgees are taken into account.

The Gender Gap in Acknowledgements
Few studies have looked at the gender of individuals named in the acknowledgements of scientific publications. Hoder-Salmon (1978), Lewis-Beck (1980), and Coates (1999) have discussed the gender issue of scientific credit distribution looking at the contributions of spouses through the analysis of scholarly books' acknowledgements. Moore (1984) investigated the effect of authors' gender on the content of their acknowledgements, and more specifically on the gender of those acknowledged. In a study based on 300 male-authored and 70 female-authored psychology books, Moore (1984) found that while men acknowledged mainly other men, women acknowledged the contributions of both genders. In another analysis, based on 684 psychology articles, Moore (1984) found a lower proportion of female acknowledgees, especially among male authors. Sugimoto and Cronin (2012) obtained the same results while analyzing the scholarly production of six important information scientists. The authors included in their sample were more likely to acknowledge individuals of the same sex, which led Sugimoto and Cronin to conclude that "scholars are more likely to seek (and acknowledge) collaboration, consultation, and guidance from same-sex colleagues" (p. 463).
Looking at the gender of authors and acknowledgees in women's studies, Cronin, Davenport and Martinson (1997) found, as expected given the field, that the vast majority of authors are women (93% of 1,504 authors). When looking at the gender of acknowledgees, they found that 66% of the individuals mentioned in the acknowledgments are women and 20% men, the remainder being either unidentified or unknown. The results also show a higher mean number of acknowledgees per paper in women's studies than in philosophy, history, psychology, and sociology. More recently, Dung, López, et al. (2019) highlighted women's hidden contributions to the field of theoretical population genetics by analyzing programmers acknowledged within articles published between 1970 and 1990 in the journal Theoretical Population Biology. The results (Dung et al., 2019) showed that women are significantly more present within the acknowledged programmers (43.2% of women) as compared to authors (7.4% of women).

Objectives
In focusing on acknowledgees as nonauthor collaborators, the objective of this paper is to better understand how gender and academic status may associate with credit attribution practices in the context of acknowledgements. More specifically, the first goal of this paper is to measure the percentage of acknowledgees who are women, and to assess whether this percentage varies across disciplines and as a function of the gender of the leading authors. The second goal of this paper is to characterize the academic status of acknowledgees who are also authors of other scientific publications (academic age, number of publications, citations, and leading role).

Data
This study is based on all acknowledgements extracted from articles and reviews published between 2015 and 2017, and indexed in the Science Citation Index Expanded (SCI-E) and Social Sciences Citation Index (SSCI) from Clarivate Analytics' Web of Science (WoS). Access to the WoS data in a relational database format was provided by the Observatoire des Sciences et des Technologies (http://www.ost.uqam.ca). The data set used in the present analysis was extracted from the full text of the acknowledgements sections of papers, and includes 1,045,131 acknowledgements from as many papers. The data set covers all disciplines included in the SCI-E and SSCI. The disciplines of papers were assigned using the NSF field classification of journals (National Science Foundation, 2006); the NSF classification assigns only one discipline specialty to each journal, thus preventing multiple counting of multidisciplinary papers.

Analysis
The extraction of individual names from acknowledgement texts was done using the Stanford Named Entity Recognizer (NER) (Finkel, Grenager, & Manning, 2005) module of the Natural Language Toolkit (NLTK) (Bird, Klein, & Loper, 2009). To obtain the number of acknowledgees per paper, the algorithm was applied to each string of acknowledgement text retrieved from WoS and all named entities tagged as "person" were selected. Several data cleaning procedures were undertaken in order to eliminate nonhuman entities from the list of extracted names. First, incomplete names were removed from the list (occurrences containing only a first or last name, or only initials), retaining only occurrences composed of a complete name (i.e., at least one initial and one last name). We then manually removed all remaining names that did not refer to individual persons, such as grant, foundation, organization, and institution names. Examples of such names removed by manual cleaning include Frederick Banting (grant), Marie Curie (grant and foundation), Boehringer Ingelheim (organization), and Instituto de Salud Carlos III (institution).
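The first cleaning step (discarding incomplete names) can be illustrated with a minimal sketch. It is not the authors' code: the function name `is_complete_name` and the initial-detection pattern are assumptions made for illustration, and the sketch only handles ASCII initials. It takes strings already tagged as "person" by an NER tool and keeps those with at least one given-name token (full or initial) plus a last name.

```python
import re

# Assumed pattern for an "initial": one capital letter, or capitals each
# followed by a period (e.g. "J", "J.", "J.K."). ASCII only in this sketch.
_INITIAL = re.compile(r"(?:[A-Z]\.)+|[A-Z]")


def is_complete_name(name: str) -> bool:
    """Return True when a name has at least one initial and one last name."""
    tokens = name.replace(",", " ").split()
    if len(tokens) < 2:
        return False  # only a first name or only a last name
    if all(_INITIAL.fullmatch(token) for token in tokens):
        return False  # only initials, no last name
    return True


# Hypothetical NER output for one acknowledgement text.
extracted = ["J. Zhang", "Zhang", "J. K.", "Marie Curie"]
complete = [name for name in extracted if is_complete_name(name)]
```

A filter of this kind keeps "Marie Curie", which is why the manual pass described above is still needed to remove grant, foundation, organization, and institution names.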
Because acknowledgements often contain the name(s) of the author(s) signing the paper from which the acknowledgements are retrieved, a final step of cleaning was necessary. When the name(s) extracted from the acknowledgements of a paper X matched the name of one of the authors appearing in the byline of that paper (using the first initial and the last name), this name was removed from the acknowledgees list for that specific paper, such as in the example below:
Paper X
Authors: J. Zhang, X. Feng and Y. Xu
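The byline-matching step can be sketched as follows. This is a hedged illustration rather than the procedure actually implemented: the helper names `name_key` and `drop_byline_authors` are hypothetical, names are assumed to be written in "given name(s) last name" order, and the acknowledgee names in the usage lines are invented for the example.

```python
def name_key(name: str) -> tuple[str, str]:
    """Reduce a name to (first initial, last name), both lowercased."""
    tokens = name.replace(".", " ").split()
    return tokens[0][0].lower(), tokens[-1].lower()


def drop_byline_authors(acknowledgees: list[str], authors: list[str]) -> list[str]:
    """Remove acknowledgees whose first initial and last name match an author."""
    author_keys = {name_key(author) for author in authors}
    return [name for name in acknowledgees if name_key(name) not in author_keys]


# Byline of Paper X from the example above; acknowledgee names are hypothetical.
authors = ["J. Zhang", "X. Feng", "Y. Xu"]
acknowledgees = ["Jing Zhang", "A. Smith", "Y. Xu"]
remaining = drop_byline_authors(acknowledgees, authors)
```

Matching on first initial and last name is deliberately loose: it also removes a genuine acknowledgee who happens to share both with an author of the same paper.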

Determining the Gender of Authors and Acknowledgees
For the purpose of our analysis, we consider first and corresponding authors as lead authors of a paper, because first authors are generally associated with the highest proportion of tasks performed in a paper (Larivière et al., 2016), and the corresponding author role is assigned to the author responsible for correspondence and is often associated with the initial conception and supervision of the research project (Mattsson, Sundberg, & Laget, 2011). If both the first and corresponding authors are women, the paper is considered female-led; if both are men, the paper is considered male-led; and if the first and corresponding authors are of different genders, the paper is considered mixed.
The gender assignation of personal names (authors and acknowledgees) was done using the Wiki-Gendersort algorithm (Bérubé, Sainte-Marie, et al., 2020). By using Wikipedia pages to get gender information, this algorithm increases the reliability of gender assignation by examining the first names of individuals covered by Wikipedia and counting the number of masculine and feminine pronouns in the introduction section of the first 20 pages. Gender is assigned to a first name when the same gender was attributed to 75% of Wikipedia pages. No gender is assigned when this threshold is not met. As shown in Table 1, using the Wiki-Gendersort algorithm we were able to identify the gender of 67% of all personal names mentioned in the acknowledgements of our data set, and of 70% of the authors. The remainder are classified as unknown gender, which includes unisex names. Our analysis of the gender variable uses occurrences of individual names for which a gender could be assigned (Female or Male).
Determining whether an acknowledgee is the author of at least one WoS-indexed article or review is not an easy task, given that we only have acknowledgees' names, and no institutional or disciplinary affiliation (except that of the acknowledging paper). First, all author names from the WoS database were disambiguated using the Caron and van Eck algorithm (2014). Then, for each acknowledgee name, we found all unique disambiguated authors with the same name. We considered an author-acknowledgee match to be valid when there was only one author-acknowledgee pair found in the same discipline or with the same institutional affiliation as the acknowledging paper. We thus focus on precision over recall, as individuals with very common names are almost systematically excluded. As shown in Table 2, following this procedure, 520,932 distinct acknowledgees with at least one WoS publication (article or review) were found.
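The unique-match rule for linking acknowledgees to disambiguated authors can be sketched as below. This is one possible reading of the rule, not the authors' implementation: the `AuthorRecord` structure, its field names, and the exact-string name comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorRecord:
    """A disambiguated author (hypothetical structure)."""
    author_id: str
    name: str
    disciplines: frozenset
    affiliations: frozenset


def match_acknowledgee(name, paper_discipline, paper_affiliations, authors):
    """Return the author_id of the single plausible match, or None.

    A candidate shares the acknowledgee's name and either the discipline or
    an institutional affiliation of the acknowledging paper. The match is
    accepted only when exactly one such candidate exists.
    """
    candidates = [
        author for author in authors
        if author.name == name
        and (paper_discipline in author.disciplines
             or author.affiliations & paper_affiliations)
    ]
    return candidates[0].author_id if len(candidates) == 1 else None
```

Returning `None` whenever several candidates remain is what trades recall for precision: a very common name will usually have more than one candidate in the same discipline and is therefore left unmatched.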
For each acknowledgee identified as an author, we use the following indicators as proxies for their academic status:
• academic age (2017 minus the publication year of the first paper),
• number of publications (all publications published until 2017),
• total field-normalized citations (based on the aforementioned NSF classification, the total was calculated as the sum of field-normalized citation scores), and
• share of the acknowledgee's publications for which he or she has a leading role (first or corresponding author).
These indicators are used to measure acknowledgees' position within the academic hierarchy. The results for these indicators are presented as a distribution of values. In order to compare the results for acknowledgees who are also authors, we use the distributions of all authors who had published at least one article in WoS between 2015 and 2017, each author being assigned to the discipline in which he or she has the highest number of publications throughout his or her career. In the event of a tie, one of the tied disciplines was chosen randomly.
Figure 1 compares the percentage of women among all authors, acknowledgees, and the subset of acknowledgees who are also authors, by discipline. It shows that the well-known gender gap found in authorship (Larivière, Ni, et al., 2013; West, Jacquet, et al., 2013) is also present in acknowledgements. Women represent less than half of both authors and acknowledgees in all disciplines, with the only exception of Health, where women account for the majority of authors, acknowledgees, and acknowledgees who are also authors. Despite some disciplinary variations, proportions of female acknowledgees and female authors are quite similar (differences ranging from 4.5% between all acknowledgees and authors in Clinical Medicine to −3.1% between all acknowledgees and authors in Social Sciences and Mathematics).
All disciplines taken together, women represent 28.4% of all authors, 29.7% of all acknowledgees, and 28.3% of the subset of acknowledgees who are also authors. Table 3 presents the percentage of female acknowledgees as a function of leading author gender. For each discipline, the proportion of female acknowledgees is higher in female-led papers (women as both first and corresponding authors) than in the mixed-led papers or male-led papers. The difference in the proportion of female acknowledgees between female-led papers and male-led papers ranges from 23.4% in Health to 11.0% in Biology, with a difference of 18.0% when all disciplines are taken together.

Academic Status
Figure 2 compares the distributions of WoS authors to the subset of acknowledgees who are also authors, as a function of each of the four indicators. It shows that, for all indicators, the distribution of acknowledgees is less concentrated than that of all WoS authors. Moreover, the acknowledgees' distributions have longer tails, with a smaller share of the distributions toward the lowest values for the four indicators.
In terms of number of publications, 80% of all disambiguated WoS authors have fewer than seven publications, while only 30% of acknowledgees have fewer than seven publications (80% of the acknowledgees have fewer than 80 publications). When considering the total field-normalized citations, a similar pattern is observed: 80% of WoS authors have fewer than seven field-normalized citations, while only 27% of acknowledgees have fewer than seven field-normalized citations (80% of acknowledgees have fewer than 145 field-normalized citations). As for academic age, 80% of WoS authors have an academic age of six years or less, while 22% of acknowledgees have an academic age of six years or less (80% of acknowledgees have an academic age of less than 24 years). Both distributions of leading authorships show similar patterns, with important peaks at 0% (no leading authorship), 50% (leading position in half of the authored publications), and 100% (leading position in all authored publications), which is due to the high proportion of researchers having one or two papers. However, the distribution of acknowledgees is once again less concentrated than the WoS distribution, with 20% of the acknowledgees having less than 1% of leading authorships, while 54% of WoS authors have less than 1% of leading authorships. Taken altogether, these results show that acknowledgees who are also authors tend to have a higher position in the academic hierarchy when compared to all WoS authors.
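The four status indicators, and the kind of cumulative share quoted in this section, can be computed along the following lines. This is a minimal sketch, not the authors' pipeline: the `Publication` structure and the function names are hypothetical, field-normalized citation scores are assumed to be precomputed per paper, and each researcher is assumed to have at least one publication up to the reference year.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """One paper of a researcher (hypothetical structure)."""
    year: int
    is_lead: bool                 # first or corresponding author on this paper
    normalized_citations: float   # assumed precomputed field-normalized score


def status_indicators(pubs, reference_year=2017):
    """Compute the four academic-status proxies for one researcher."""
    pubs = [p for p in pubs if p.year <= reference_year]
    n = len(pubs)
    return {
        "academic_age": reference_year - min(p.year for p in pubs),
        "n_publications": n,
        "total_normalized_citations": sum(p.normalized_citations for p in pubs),
        "lead_share": sum(p.is_lead for p in pubs) / n,
    }


def share_below(values, threshold):
    """Fraction of researchers whose indicator is strictly below a threshold."""
    values = list(values)
    return sum(v < threshold for v in values) / len(values)
```

Applying `share_below` to the publication counts of each group with a threshold of seven would yield the type of figure reported above for WoS authors and for acknowledgees.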

DISCUSSION
Our analyses of the academic status of acknowledgees, measured using numbers of publications, total field-normalized citations, academic age, and share of leading authorships, show that the subset of acknowledgees who are also authors tend to have a higher position in the academic hierarchy compared to all WoS authors. These findings suggest that academic status may be associated with credit attribution practices, because acknowledgees appear to be rather senior researchers according to our four indicators, at least when considering the subset of acknowledgees who have already published (as defined by having at least one WoS-indexed publication).
Our findings demonstrate that acknowledgements are not limited to research assistants and less-experienced researchers whose contributions to research (often technical) cannot justify authorship, but also extend to researchers of higher academic status, according to our four indicators. Given the higher position in the academic hierarchy of the subset of acknowledgees who have already published, we may consider acknowledgements not only as a form of subauthorship, as has often been conceived (Costas & van Leeuwen, 2012; Díaz-Faes & Bordons, 2014; Patel, 1973), but also as a genuine form of credit for informal collaboration with experienced colleagues. Our analysis does not allow us to associate academic status with the nature of the acknowledged contribution. However, it has been shown that senior researchers, when authors of a paper, are more frequently associated with conceptual tasks and resources contributions, while younger researchers are more likely to contribute to experimentation (Larivière et al., 2016). This division of labor might be mirrored in the acknowledgements. Our findings could thus be explained by the fact that manuscript revision and resource allocation (contributions frequently mentioned in the acknowledgements, Paul-Hus, Díaz-Faes, et al., 2017b) are reserved for researchers with higher seniority. In this sense, acknowledgements to researchers of higher academic status might reveal the "invisible college" of close-but-distant collaborators who contribute in informal ways to a research project (Price & Beaver, 1966).
Figure 2. Distribution of researchers (all WoS authors, and acknowledgees who are also authors) by number of publications, total field-normalized citations, academic age, and percentage of leading authorships. Note: WoS = All authors who published at least one article or review in WoS between 2015 and 2017; Acknowledgees = Acknowledgees who published at least one article or review in WoS.
Furthermore, our results may be another manifestation of the Matthew Effect (Merton, 1968), as researchers of higher academic status who already have recognition and visibility in the scientific community tend to be overrepresented among the acknowledgees. When two researchers, one junior and one senior, contribute to a research project without meeting authorship criteria, we might be more inclined to acknowledge the contribution of a senior researcher, precisely because of their seniority.
Regarding the gender of individuals named in the acknowledgements of scientific papers, our analyses have shown that gender disparities generally found in authorship extend to acknowledgements. Globally, women are underrepresented in both authorships and acknowledgements of scientific papers. Furthermore, as found by Moore (1984) and by Sugimoto and Cronin (2012), our findings clearly confirm that women acknowledge proportionally more women than men do. Our results are in line with the gender homophily pattern in team composition and social networks, which refers to the tendency to associate more frequently with same-sex individuals (Ibarra, 1992; McPherson, Smith-Lovin, & Cook, 2001). Women thus appear to be less homophilic than men in their acknowledgement practices. This finding is consistent with previous analyses of gender homophily in scientific collaborations (Araújo et al., 2017; Bozeman & Corley, 2004), showing that men are more likely to collaborate with other men while women are more "egalitarian." However, our results also show important differences between disciplines. These differences in the percentages of women among acknowledgees could also be due to the gender composition in each discipline and the broad categorization of disciplines used in our analyses. In fact, the disciplines in which we observed the highest levels of gender homophily, Health and Professional Fields, both contain research areas generally considered to be highly feminized (Witz, 2013), such as Nursing and Education, as well as male-health oriented areas. In this context, observed gender homophily could be a second-order effect of the gender composition of a discipline if researchers of a given research area acknowledge individuals from their own area.
Moreover, observed gender differences and more generally the greater proportion of male acknowledgees within our data set could also be a second-order effect, explained by the overrepresentation of researchers of higher academic status among the acknowledgees. Given the well-known overrepresentation of men in positions of higher rank in the scientific community (Charles, 2003; Etzkowitz, Kemelgor, & Uzzi, 2000), the observed overrepresentation of acknowledgees with higher academic status implies a greater proportion of men among acknowledgees. The gender differences we found could thus be due, at least in part, to second-order effects, without necessarily being a direct reflection of gender-biased acknowledgement practices.

Limitations
Some limitations relating to our data source and methods must be considered when interpreting our results. First, acknowledgement data are limited to funded research, because they are collected with the intended objective of tracking funded research (Web of Science, 2009). Acknowledgements are thus collected and indexed by WoS only if they include some kind of funding information. These indexation criteria could induce a bias toward funded research and funding-related aspects of acknowledgements. Second, the gender assignation algorithm we used, Wiki-Gendersort (Bérubé et al., 2020), presents a limitation that is common to most gender assignment tools: lower reliability for names of Asian origin, and more specifically Chinese names. It is thus safe to suppose that Chinese names are overrepresented in the unknown gender category (Santamaría & Mihaljević, 2018). Finally, our results concerning the academic status of acknowledgees are restricted to individuals who have already published. It is thus reasonable to assume that this subset of acknowledgees might be characterized by a higher academic status than the rest of the acknowledgees, who have not published and are either less-experienced researchers or technicians and assistants. As a consequence, our conclusions might not apply to acknowledgees who have not published.

CONCLUSION
Scientific collaboration is often treated as synonymous with coauthorship, despite the fact that coauthorship remains only a partial indicator of collaboration (Katz & Martin, 1997). Collaborators are not always authors of the papers to which they have contributed, and acknowledgements can help reveal not only the contribution of these nonauthor collaborators but also their sociodemographic characteristics, making it possible to draw new insights on the social structure of science as well as on practices of collaboration, division of labor, and credit attribution.
Regarding the academic status of acknowledgees, our results show that acknowledgees who have already published tend to have a higher position in the academic hierarchy compared to all WoS authors. These findings indicate that acknowledgements are not limited to less-experienced researchers whose contributions cannot justify authorship but also extend to more experienced researchers.
Our results also show that women are underrepresented in acknowledgements. In a broader context, academic stereotypes have been shown to act as gatekeepers by steering women away from certain fields (Cheryan, Master, & Meltzoff, 2015). Perceived underrepresentation of women, whether considering authorships or acknowledged contributions, can thus contribute to academic gendered stereotypes and exacerbate gender disparities in both local and global scientific communities (Dung et al., 2019; Larivière et al., 2013; West et al., 2013).

FUNDING INFORMATION
This research was supported by the Social Sciences and Humanities Research Council of Canada: Joseph-Armand Bombardier CGS Doctoral Scholarships (Paul-Hus); Insight Development Grant (Larivière).

DATA AVAILABILITY
Restrictions apply to the availability of the bibliometric data, which is used under license from Clarivate Analytics. Readers can contact Clarivate Analytics at the following URL: http://clarivate.com/scientific-and-academic-research/research-discovery/web-of-science/.

APPENDIX
Descriptive statistics for the number of publications, total field-normalized citations, academic age, and percentage of leading authorships, by discipline
Note: Ackn = Acknowledgees who are also authors; WoS = All authors who published at least one article or review in WoS between 2015 and 2017.