Article processing charges: Mirroring the citation impact or legacy of the subscription-based model?

Abstract With the ongoing open-access transformation, article processing charges (APCs) are gaining importance as one of the main business models for open-access publishing in scientific journals. This paper analyzes how much of APC pricing can be attributed to journal-related factors. With UK data from OpenAPC (which aggregates fees paid for open-access articles by universities, funders, and research institutions), APCs are explained by the following variables: (a) the “source normalized impact per paper” (SNIP), (b) whether the journal is open access or hybrid, (c) the publisher of the journal, (d) the subject area of the journal, and (e) the year. The results of the multivariate linear regression show that the journal’s impact and hybrid status are the most important factors for the level of APCs. However, the relationship between APC and SNIP is different for open-access journals and hybrid journals. APCs paid to open-access journals were found to be strongly increasing in conjunction with higher journal citation impact, whereas this relationship was observed to be much looser for articles in hybrid journals. This paper goes beyond simple statistics, which have been discussed so far in the literature, by using control variables and applying statistical inference.


INTRODUCTION
The open-access transformation of scientific journals is under way. An increasing number of libraries and library consortia enter into transformative agreements with publishers (offsetting, read-and-publish, publish-and-read agreements) that make individual articles open access. Several models aim at flipping entire journals to open access, such as SCOAP3. New open-access journals have been founded by both native open-access publishers and subscription-based publishers. Research funders have tightened the open-access mandates of researchers (e.g., Plan S), which puts pressure on publishers to transform journals. Moreover, they provide funds for paying article processing charges (APCs).
The main motivation behind the open-access transformation is to make publicly funded research results more accessible, which could enhance further research and technological advance. The secondary motivation is to save public money and resolve the serial crisis. However, uncertainty about whether the open-access transformation is financially viable is an obstacle to concrete action, which some policy-oriented reports address.
paper" (SNIP) was first performed by the University of California Libraries (2016), although the regression did not control for any other factors and the statistical significance was not reported. Moreover, the study provides an economic model to explain the rationale for why the perceived quality of a journal is positively related to its APC.
However, all previous literature failed to examine the interdependence between the above-discussed factors. For example, the finding that APCs for publications in hybrid journals are on average more expensive than APCs in open-access journals could be explained by the citation impact. Publishers could argue that hybrid journals have on average more citation impact than open-access journals, which are mostly market newcomers, and that hybrid journals are therefore more valuable. A further problem with the previous literature is that readers less familiar with statistics could infer causality from correlations, which need not be the case. Therefore, it is of utmost importance to use multivariate regression analysis and statistical inference to improve our understanding of APC levels.
Throughout this study and consistent with the literature, I define an article processing charge as the fee for the publication of an open-access article in an open-access or hybrid journal. Usually, either the author or the author's institution is invoiced. Other fees that may be associated with publishing (e.g., submission, page, or color fees) are not considered part of APCs. APCs are charged to publish scientific articles in open access, which means "free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, […] without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself" (BOAI, 2002). Open-access articles may be published either in open-access journals, where the complete content is open access, or in hybrid journals, where only some parts are open access and other parts are closed access and available only via a paid subscription. Journals with completely closed-access content are called subscription-based journals. The phrase open-access transformation refers to the conversion of the publication system from closed access to open access; for the purpose of this study, it is restricted to scientific peer-reviewed journals.
This paper investigates how much of APC pricing can be attributed to journal-related factors. With data from OpenAPC, which is part of the INTACT project at the Bielefeld University Library, Germany, the APCs actually paid (in contrast to catalog prices) are explained by the following variables: (a) the SNIP of the CWTS Journal Indicators capturing the citation impact of a journal, (b) whether the journal is open access or hybrid, (c) the publisher of the journal, (d) the subject area of the journal, and (e) the year.
I performed a multivariate linear regression on the total OpenAPC data set as well as on a subsample of British data from 2014 to 2017 to circumvent the problem of sample-selection bias.
The study design is limited to journals that cover all their costs associated with publishing articles via an author-facing APC. This paper does not aim at explaining why some journals charge an APC and others do not; neither does it aim to explore the total costs of publishing or the operating costs of publishers. In particular, it does not consider free-of-charge journals, which are issued by universities or research institutions and run via in-kind support. The use of OpenAPC, which does not record any free-of-charge publications, is therefore consistent with this design.
The paper is organized as follows. The OpenAPC data set and the CWTS Journal Indicator SNIP are explained and descriptive statistics are presented in Section 2. Section 3 first describes how to circumvent the issue of sample selection bias and then outlines the statistical model. Section 4 presents the results, the consequences of which are discussed in depth in Section 5. Section 6 concludes.
OpenAPC is a unique data set on APCs actually paid. OpenAPC is part of the INTACT project, which was funded by the Deutsche Forschungsgemeinschaft (German Research Foundation) and, since October 2018, by the Bundesministerium für Bildung und Forschung (Federal Ministry of Education and Research), Germany. OpenAPC is located at the Bielefeld University Library with contributors from Europe and North America. It aggregates fees paid for open-access articles by universities, funders, and research institutions (see Broschinski and Pieper (2018) for more information on OpenAPC). In addition to data from numerous German, Swedish, and Norwegian universities and research institutions, OpenAPC aggregates data from the Austrian Fund for Scientific Research (FWF) and the British Wellcome Trust, as well as the Jisc Collections. In version 3.50.3 from January 25, 2019 (Jahn & Broschinski, 2019), which is used in this study, the OpenAPC data set comprises 72,975 observations in total. [2]
The OpenAPC data set is a sample of APCs paid by researchers from 13 countries. However, for most countries, the sample is small (particularly the United States) and clearly not representative of the APCs paid by their researchers (for example, Germany). [3] Nevertheless, it is the most comprehensive data collection on APCs actually paid that is publicly available. [4] Most importantly, OpenAPC offers a rather good representation of APCs that were paid in the UK.
For the research question and the method, it is not important that the data at hand cover all or most of the APCs actually paid from a country, but that the sample is not biased or skewed toward factors related to both the APC level and the explanatory variables (SNIP, hybrid or open-access journal, publisher, subject area, period). To my knowledge, there is no such bias for the UK (see Section 3.2 for an in-depth discussion).
For the purpose of this study, the following indicators were used from the OpenAPC data set:
• Top-level organization which covered the fee (institution)
• Year of payment (period)
• APC amount paid, including taxes, discounts etc.; excluding submission fees or page/color charges (euro)
• A Boolean indicator (is_hybrid) on whether the journal is hybrid (true) or gold open access (false)
• Publisher (publisher)
• Journal title (journal_full_title)
Information on the International Standard Serial Number (issn) as well as the linking ISSN (issn_l) is used for merging the OpenAPC data set with the CWTS Journal Indicators. An institutional mapping table provided by the OpenAPC project is used to retrieve the country of the institution that covered the fee. More information on the OpenAPC data is given in Section 2.2.
[2] However, there is one reported APC that is out of realistic scope (above €10,000) and most probably the result of a typing error (misplaced decimal point). Therefore, this observation was removed at the outset.
[3] A quick glance at the Web of Science Core Collection reveals that there are almost one million gold open-access articles, reviews, and proceedings that were published between 2013 and 2017 in journals by North American and European researchers (including about 150,000 publications of British authors). However, no information is available as to whether these publications were associated with an APC. Although most open-access journals do not charge an APC (as reported by the Directory of Open Access Journals), the most publication-intensive journals, journals that belong to big publishers, and hybrid journals largely demand APCs.
[4] Publishers may have better data, but these are presumably confidential.

The CWTS journal indicators
Within the research community, the number of published articles as well as the reputation and quality perception of journals in which the articles were published play a major role for career promotion. Journal citation metrics capture or at least try to capture some aspect of a journal's reputation and quality. Publishers emphasize impact factors of their journals to underline their relevance within the research field. In turn, authors frequently use impact factors to decide where to submit a manuscript. It is not the purpose of this paper to analyze or discuss whether journal citation metrics are suitable for research evaluation, career promotion, or subscription to journals. Moreover, I do not answer the question of whether, from a normative point of view, a subscription or publication fee should be linked to the journal's citation impact. I recognize that citation impact obviously plays a role in scientific publishing. The focus of this study is on whether and how the journal's impact is linked to APCs charged, among other journal-related factors.
The indicator of journal citation impact that is used in this study is the "source normalized impact per paper" (SNIP) (CWTS, 2018). It is regularly compiled by the Centre for Science and Technology Studies (CWTS) at Leiden University. The indicator was introduced by Moed (2010) and further developed by Waltman et al. (2013). The SNIP is based on Elsevier's bibliographic database Scopus and uses a source-normalized approach to correct for differences in citation practices between scientific fields. This is the main difference between the SNIP and the best-known indicator, the "Journal Impact Factor" (JIF) of Clarivate Analytics. The JIF is based on the Web of Science and is published in the Journal Citation Reports (JCR). Because of disciplinary differences in citation behaviors, it is not appropriate to compare the JIFs of journals between different research fields. The SNIP indicator addresses this problem by taking into account the citation characteristics of the journal's subject field (i.e., frequency that authors cite other papers; rapidity of maturing citation impact; extent to which the database used for the assessment covers the field's literature); see Moed (2010). For this reason, the SNIP, rather than the JIF, is applied within this study. The SNIP score ranges from zero to about 79 points. However, only a few journals reach SNIP scores above three or four. By definition, the average SNIP value of the cited journals in a field (weighted by its number of publications) equals one (see Waltman et al., 2013).
The CWTS Journal Indicators were accessed in November 2018, with coverage up to 2017. The variables Source.title, Source.type, Print.ISSN, Electronic.ISSN, ASJC.field.IDs (to retrieve the subject area of the source), Year, and SNIP were used for further analysis. The analysis is limited to journals only. The CWTS Journal Indicators were merged with OpenAPC by using the ISSNs delivered by the respective data set and the ISSN-to-ISSN-L file from July 1, 2018 that contains a table matching all assigned ISSNs with their corresponding linking ISSN (CIEPS, 2018). This procedure delivered the highest match between both data sets. Data points without any ISSN could not be processed further. By merging both data sets, the OpenAPC data was enriched with the SNIP and the subject area of the respective journals.
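The merge described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually used: the function names, the choice to match on the pair (linking ISSN, year), and the ISSNs in the toy data are hypothetical, while the field names are those listed for the two data sets.

```python
def to_linking_issn(issn, issn_to_issn_l):
    """Return the linking ISSN for an ISSN, or None if it is unknown."""
    if not issn:
        return None
    return issn_to_issn_l.get(issn.strip().upper())


def enrich_with_snip(apc_records, cwts_records, issn_to_issn_l):
    """Attach SNIP and subject area to APC records via (ISSN-L, year)."""
    # Index the journal indicators by linking ISSN and year (assumed key).
    indicators = {}
    for row in cwts_records:
        for issn in (row.get("Print.ISSN"), row.get("Electronic.ISSN")):
            issn_l = to_linking_issn(issn, issn_to_issn_l)
            if issn_l is not None:
                indicators[(issn_l, row["Year"])] = row

    enriched = []
    for rec in apc_records:
        issn_l = rec.get("issn_l") or to_linking_issn(rec.get("issn"), issn_to_issn_l)
        if issn_l is None:
            continue  # data points without any ISSN cannot be processed
        match = indicators.get((issn_l, rec["period"]))
        enriched.append({
            **rec,
            "issn_l": issn_l,
            "SNIP": match["SNIP"] if match else None,
            "ASJC.field.IDs": match["ASJC.field.IDs"] if match else None,
        })
    return enriched


# Toy data with made-up ISSNs; both ISSNs share one linking ISSN.
issn_map = {"1111-1111": "1111-1111", "2222-2222": "1111-1111"}
cwts = [{"Source.title": "Journal A", "Print.ISSN": "1111-1111",
         "Electronic.ISSN": "2222-2222", "ASJC.field.IDs": "2700",
         "Year": 2017, "SNIP": 1.2}]
apc = [
    {"institution": "University X", "period": 2017, "euro": 1500.0,
     "is_hybrid": False, "issn": "2222-2222", "issn_l": None},
    {"institution": "University Y", "period": 2017, "euro": 2000.0,
     "is_hybrid": True, "issn": None, "issn_l": None},
]
merged = enrich_with_snip(apc, cwts, issn_map)
```

In this toy example the second payment carries no ISSN and is dropped, while the first is matched through its electronic ISSN. Records that match no journal indicator are kept with missing SNIP, which corresponds to the missing-data issue discussed in Section 3.2.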

Descriptive Statistics of the Enriched OpenAPC Data Set
In this section, statistics describing the enriched OpenAPC data set are presented. From this we will learn who mostly paid reported APCs and which publishers and journals received most APC payments. Moreover, we will see how the observations are distributed over the journals' citation impact, subject area, and years. Table 1 provides summary statistics for the discrete variables. Large British universities as well as research funding and research organizations contributed most APC payments to OpenAPC. The last completed reporting year was 2017. In this year, 12,239 APC-funded articles were registered from the UK. The number of observations is rising each year because an increasing number of institutions record APC payments and report them to OpenAPC. The reports from 2018 were incomplete at that time and therefore disregarded in the regression analysis.
In the OpenAPC UK sample, most APC-funded articles were published by Elsevier, Springer Nature, and Wiley-Blackwell, all of them traditional subscription-based publishers. OpenAPC data suggest that large, traditionally subscription-based publishers dominate the market for open-access publications. Only the Public Library of Science (PLoS) might have a noteworthy market share. In total, OpenAPC reports APC payments to 211 publishers from the UK.
However, APC-funded and reported articles were mostly published in the pure open-access megajournal PLOS ONE (about 4% of all articles), followed by Scientific Reports, which belongs to Springer Nature. The journals' subject areas confirm the practical experience that social sciences and humanities play a minor role in APC-based open-access publishing. About two thirds of the APCs were paid to publish an article in a hybrid journal and one third for publication in an open-access journal. Table 2 summarizes the continuous variables APC in euros and SNIP. About half of the articles were published in journals that have SNIP values between 1.1 and 1.8. The average citation impact is about 1.6, which is above the standardized SNIP mean of 1, the impact of an average journal in a specific field. Very few articles were published in high-impact journals (see Figure 1). The most prestigious reported journal is The Lancet.
We now turn to a detailed description of the APCs in euros. The mean APC is about €2,300 and the median about €2,200. As one can see in Figure 2, the distribution is right-skewed. There are many observations at the lower range, but some observations with high values raise the average APC. Fifty percent of the APC payments range from €1,599 to €2,863. There are few observations above €6,000 (39 of 39,089 payments).
Summary statistics and histograms are also provided for the total sample and the German sample in the supporting information. Most APC payments are reported from the UK, followed by Germany a long way behind. This proportion reflects different reporting behaviors rather than the true size of all APCs paid in these countries. In addition, Austria, Sweden, and Norway reported actively to OpenAPC, although the number of contributed observations has remained low, most probably due to the size of these countries. The increase from 2013 to 2014 was mainly driven by British data.
There are remarkable differences between the British, German, and total sample for some indicators in the OpenAPC data set. However, it should be kept in mind that comparisons between countries might be misleading as, for example, the German sample is definitely not representative of the population. The average APC is higher in the UK and much lower in Germany and the other countries. German APCs are barely above €2,000, most probably due to the APC-funding rules (price cap). In the total sample, almost half of reported APCs stem from publications in hybrid journals, but only 1% in the German sample. On average, British authors published in journals with more citation impact than German authors did. These differences largely reflect different APC-funding rules in the countries. APC funding in Germany is much more restrictive than in the UK. My interest is in explaining the APC-pricing behavior of publishers in general, not (yet) the influence of funding policies. Therefore, the regression analysis in Section 4.1 builds upon the UK sample.
In the scatter plot of APC against SNIP, each point represents an article with its combination of APC and SNIP. The line shows the correlation between the two variables. Although the positive correlation seems to be weak, it is statistically highly significant (test statistic not reported here). Hence, articles in higher impact journals tend to be charged more than articles in lower impact journals.
There are wide differences in APC levels between the publishers, as one can see in the box plots of Figure 7. The median as well as the upper and lower quartiles of APC payments are the highest for Elsevier, followed by Oxford University Press. This means that these two publishers often charge expensive APCs. APCs are relatively low at PLoS, and they do not vary as much as at the other big publishers.

Statistical Model
In this section, we will further investigate the factors behind the different APC levels. A multivariate linear regression analysis is performed, where the independent variables SNIP, Hybrid, Big_publisher, Subject_area, and the year effect γ_t explain the dependent variable APC:

$$APC_{it} = \beta_1\, SNIP_{it} + \beta_2\, Hybrid_{it} + \beta_3\, (SNIP_{it} \times Hybrid_{it}) + \beta_4'\, Big\_publisher_{it} + \beta_5'\, Subject\_area_{it} + \gamma_t + \alpha_i + \varepsilon_{it} \qquad (1)$$

The variable Big_publisher is a column vector of dummy variables indicating the six largest publishers according to the respective sample of OpenAPC. The base group contains all other publishers. Likewise, Subject_area is a column vector of dummy variables for the four subject areas to which each journal is assigned, where health sciences is the base group. β_4 and β_5 are the corresponding vectors of coefficients, α_i is the individual-specific effect, and ε_it is the disturbance term. The subscripts i and t denote the ith observation at the tth period. Moreover, I expect that the explanatory power of SNIP is different for hybrid and open-access journals. That is why the estimation equation contains an interaction term between SNIP and Hybrid. [7]
To illustrate the interpretation of the coefficient of Hybrid and its interaction term with SNIP, conditional expectations of Eq. (1) are presented (conditioning on the remaining regressors is left implicit). For open-access journals, the conditional expectation is

$$E[APC_{it} \mid Hybrid_{it} = 0] = \beta_1\, SNIP_{it} + \beta_4'\, Big\_publisher_{it} + \beta_5'\, Subject\_area_{it} + \gamma_t + \alpha_i \qquad (2)$$

For hybrid journals, the conditional expectation of Eq. (1) is

$$E[APC_{it} \mid Hybrid_{it} = 1] = \beta_2 + (\beta_1 + \beta_3)\, SNIP_{it} + \beta_4'\, Big\_publisher_{it} + \beta_5'\, Subject\_area_{it} + \gamma_t + \alpha_i \qquad (3)$$

Hence, β_2 induces an intercept shift and β_3 induces a slope shift.
The OpenAPC data set is not a panel but a repeated cross-section. That means that data are obtained by a sequence of independent samples, where the unit of each sample is the article. I performed a static linear regression with random and time effects based on T successive cross-sections. Therefore, heteroskedasticity had to be taken into account and robust standard errors were calculated for hypothesis tests. [8] Eq. (1) is estimated by pooled ordinary least squares (OLS) and the results are reported in Section 4. [9]
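To make the role of the interaction term concrete, the following minimal sketch fits a reduced version of Eq. (1) with only SNIP, Hybrid, and their interaction by pooled OLS. It is an illustration under assumptions, not the estimation code of this study (which was run in R): the data are synthetic and noiseless, the constant of 500 is hypothetical and stands in for the effects not modelled here, and the three slope values are the rounded Model 3 coefficients reported in Section 4. Robust standard errors are not computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Pooled OLS via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(snip, hybrid):
    """Regressors: constant, SNIP, Hybrid dummy, SNIP x Hybrid interaction."""
    return [1.0, snip, float(hybrid), snip * float(hybrid)]


# Synthetic, noiseless data generated from assumed coefficients.
TRUE_COEFFICIENTS = [500.0, 888.0, 1550.0, -679.0]
rows, y = [], []
for snip in (0.5, 1.0, 1.5, 2.0, 3.0):
    for hybrid in (0, 1):
        x = design_row(snip, hybrid)
        rows.append(x)
        y.append(sum(b * v for b, v in zip(TRUE_COEFFICIENTS, x)))

const, b_snip, b_hybrid, b_interaction = ols(rows, y)
slope_open_access = b_snip                # slope for Hybrid = 0
slope_hybrid = b_snip + b_interaction     # slope for Hybrid = 1
```

Because the data contain no noise, the fit recovers the assumed coefficients, and the hybrid slope is the sum of the SNIP coefficient and the interaction coefficient, which is the slope shift described for Eq. (3).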

Sample Selection
Two issues arise that could lead to biased coefficient estimates: sample selection and missing data. The first issue arises if the sample at hand is not representative of the population. This would render OLS parameter estimates inconsistent (Cameron & Trivedi, 2006, p. 529). In our case, we observe a sample of APCs, but for some countries, the sample is not randomly drawn from the population, as high APCs are systematically underreported to the OpenAPC project. In Germany, the Deutsche Forschungsgemeinschaft (DFG), a funding organization, supports publication funds at some universities. If a member of the university is the submitting or corresponding author of an article in an open-access journal, the publication fund can take over the obligation to pay the APC up to €2,000. The APC must not be above this limit to be covered by the DFG-supported publication fund. Otherwise, the author has to pay the APC out of department, third-party, or private funds. Publication funds systematically report to the OpenAPC project, whereas there are almost no ways to report otherwise-funded APCs. To make things worse, authors could choose not to publish in expensive open-access journals at all but to publish in subscription-based journals. Having this in mind, it is possible to infer the determinants for APCs up to €2,000 but not above. The sample selection could be more or less severe depending on the national conditions for APC funding. The stricter the conditions (e.g., a price cap), the less representative the sample is likely to be. To my knowledge, the conditions for APC funding are least restrictive in the UK. There are no price caps for APCs, and APCs are funded for publications in both open-access and hybrid journals. Fortunately, the OpenAPC data set contains plenty of UK data from 2014 to 2017, so that I can base the entire analysis on the UK sample (see Section 4.1), largely avoiding the problem of sample selection and inconsistent estimates.
[7] I also considered nonlinear relationships between APC and SNIP. However, it turned out that a nonlinear specification is not necessary.
[8] See Cameron and Trivedi (2006, p. 47 and pp. 770-771) for a discussion on repeated cross-sections.
[9] The results were obtained using R 3.4.3 (R Core Team, 2017) with the packages lmtest 0.9-35 (Zeileis & Hothorn, 2002), sandwich 2.4-0 (Zeileis, 2004), car 2.1-6 (Fox & Weisberg, 2011), texreg 1.36.23 (Leifeld, 2013) and xtable 1.8-2 (Dahl, 2016).
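The effect of a price cap on the estimates can be illustrated with a stylised example. The sketch below is not based on OpenAPC data: the population is hypothetical, with APC = 1000 + 800 × SNIP plus a symmetric shock, and the cap of 2,000 mimics the DFG limit. It compares the OLS slope in the full population with the slope in the subsample that would be reported under the cap.

```python
def ols_slope(points):
    """Slope of a bivariate OLS regression of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical population: three payments per SNIP level, shocks sum to zero.
population = [
    (snip, 1000.0 + 800.0 * snip + shock)
    for snip in (0.5, 1.0, 1.5, 2.0, 2.5)
    for shock in (-400.0, 0.0, 400.0)
]

PRICE_CAP = 2000.0  # only APCs up to the cap are reported
reported = [(snip, apc) for snip, apc in population if apc <= PRICE_CAP]

slope_full = ols_slope(population)   # slope in the full population
slope_capped = ols_slope(reported)   # slope in the truncated sample
```

In this constructed case the slope in the truncated sample is only about half of the population slope, because at higher SNIP levels only payments with negative shocks remain below the cap. This is the mechanism behind the inconsistency discussed above; its size in real data depends on the funding rules of each country.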
The second issue that could lead to biased coefficient estimates arises if the data set has missing observations. In the UK sample, approximately 3% of observations lack the citation impact (SNIP) and the subject area (e.g., life sciences). I assessed the direction and range of the potential bias on the estimation results due to missing data and report them in the supporting information. Missing data turned out to be a minor problem.

Results of the UK Sample
Turning now to the estimates of Eq. (1), Table 3 shows the results of four models based on the UK sample from 2014 to 2017. [10] The first model is a bivariate regression of APC on SNIP, which already explains 11% of the total variance. In the second model, APC levels are explained by whether the article was published in a hybrid or open-access journal. Indeed, APCs in hybrid journals are more expensive. This variable explains 11% of the total variance in a bivariate regression. Combining both variables (including their interaction term) represents Model 3, where 24% of the total variance is explained and all coefficients are statistically significant. The coefficient of SNIP is about €888, which means that, on average, an open-access journal with a SNIP value of 2 charges about €888 more than an open-access journal with a SNIP value of 1 (other things being equal). Likewise, a hybrid journal is estimated to charge, on average, about €1,550 more than an open-access journal (again, other things being equal). However, a hybrid journal is less sensitive to its impact. For each additional SNIP score, it charges just about €209 (≈ 888 − 679) more. To sum up, hybrid journals tend to be more expensive and less sensitive to their citation impact than open-access journals. In Model 4, the total set of variables is included to explain APC levels. The dummy variables indicating a big publisher, the subject area, and the year do not add as much to the adjusted R². Nevertheless, most coefficients are statistically significant and economically substantial. Publishing is most expensive in Elsevier journals (on top of the fact that most Elsevier journals are hybrid) and least expensive in PLoS journals. Publishers might follow different price-setting strategies, or some reputation is associated with a publisher label, which is not reflected in the SNIP. Publications in life sciences are much costlier than in social sciences and humanities.
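The Model 3 interpretation can be written out as a short worked equation. The block below uses only the rounded coefficients quoted in the text; the constant c is not reported here and is therefore left symbolic, and the break-even value is an arithmetic consequence of these rounded figures rather than a result reported in Table 3.

```latex
\begin{align*}
\widehat{APC} &= c + 888\,SNIP + 1550\,Hybrid - 679\,(SNIP \times Hybrid) \\
\text{open access } (Hybrid = 0):\quad \widehat{APC} &= c + 888\,SNIP \\
\text{hybrid } (Hybrid = 1):\quad \widehat{APC} &= (c + 1550) + (888 - 679)\,SNIP = (c + 1550) + 209\,SNIP \\
\text{hybrid premium}:\quad \Delta(SNIP) &= 1550 - 679\,SNIP, \qquad \Delta = 0 \iff SNIP = \tfrac{1550}{679} \approx 2.28
\end{align*}
```

Under these rounded estimates, the predicted hybrid premium is positive for journals with a SNIP below roughly 2.3 and shrinks as the citation impact rises.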
Furthermore, the results seem to indicate a price increase from 2014 to 2015. There are several potential explanations for this finding, which cannot be identified from this research design. First, it might be that at least one major publisher or many publishers increased APCs in 2015. Second, exchange rates evolved unfavorably for the euro (i.e., the euro depreciated against the pound sterling, the US dollar, or other currencies that publishers use for billing purposes). Third, it cannot be ruled out that the currency conversion made by OpenAPC is slightly inaccurate, as annual average spot rates are used if no information on the day or month of payment is available, and therefore information is lost. However, this would only result in a problem if APC payments were unequally distributed over a year. Of course, the APC development over the years indicated by the estimates could have resulted from a combination of these three potential explanations, either reinforcing or offsetting one another. Without doubt, exchange rate shifts have a substantial effect on the level of APC payments if APC bills are denominated in foreign currencies.
[10] Inspecting residuals (not reported here) shows no serious problems with outliers. However, for economic reasons, I decided to disregard the lowest and highest 1% of APCs from the UK sample as outliers. On the one hand, the lowest 1% of APCs (below €304) are likely not stand-alone APCs because the minimum cost for publishing an article in a reliable journal is well above this amount. For articles in reliable journals, such APCs could have been subsidized by organizations or discounted because of waivers or personal membership in scientific communities or learned societies. On the other hand, the highest 1% of APCs (above €5,349) are most probably the result of typing errors. List-price APCs do not exceed this amount, even at the most expensive journals.
Because price increases and exchange rate movements are interesting and important topics in their own right, I will further investigate them in a follow-up paper. In this study, the period dummies are mainly used for control purposes. This ensures that the results concerning the journal-related factors (SNIP, subject area, etc.) are not biased by period-specific effects or trends.
To make the results clearer, estimated APC equations are presented for two publishers (representing two opposite extremes), [11] which both produced journals in life sciences in 2017. Eq. (4) predicts an APC for an open-access article at PLoS depending on the impact of the respective journal:

$$\widehat{APC}_{it} = 566 + 844\, SNIP_{it} \qquad (4)$$

Eq. (5) predicts an APC for an open-access article in an Elsevier hybrid journal, other things being equal:

$$\widehat{APC}_{it} = 2708 + 203\, SNIP_{it} \qquad (5)$$

On the one hand, the APC's component that is not related to the journal's impact is almost five times higher for publications in Elsevier hybrid journals (€2,708) than at PLoS (€566). On the other hand, Elsevier is estimated to charge just €203 for each SNIP score, compared to €844 by PLoS. In the end, it depends on the journal's impact whether an article published by PLoS or by an Elsevier hybrid journal is predicted to be more expensive. Assume a SNIP score of one, which is the citation impact of an average journal in a specific field by definition. For example, the journals PLOS ONE and Molecular and Cellular Endocrinology had a SNIP of approximately one in 2017, both located in life sciences. Then, we can derive the following estimated APCs:
• PLOS ONE article: \widehat{APC}_{it} = 566 + 844 = €1,410
• Molecular and Cellular Endocrinology article: \widehat{APC}_{it} = 2708 + 203 = €2,911
These are examples of in-sample predictions. In Table 4, predicted APCs are presented for PLoS journals and Elsevier hybrid journals with varying levels of citation impact. A SNIP value of one corresponds approximately to the first quartile of the UK sample as well as the total OpenAPC data set. The median of the UK sample is 1.35, and its third quartile is 1.78. A SNIP value of 15 is about the highest impact a journal has in the OpenAPC data set (The Lancet).
[11] After controlling for other effects, Elsevier appears as the most and PLoS the least expensive publisher. These publishers have the highest and the lowest estimated coefficient, respectively.
However, no gold open-access journal has a comparable citation impact. The predicted APCs vary greatly for PLoS journals along the citation impact but only slightly for Elsevier hybrid journals. Eighty percent of the reported articles from the UK appeared in journals with a SNIP score below 2. For them, APCs in hybrid journals are predicted to be much costlier than in the open-access counterparts.
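The two estimated equations can be evaluated directly. The short sketch below uses the intercepts and slopes quoted above for PLoS open-access journals and Elsevier hybrid journals in life sciences in 2017; the function and variable names are illustrative, and because the quoted coefficients are rounded, the output need not match Table 4 to the euro.

```python
def predict_apc(intercept, slope, snip):
    """Predicted APC in euros for a journal with the given SNIP."""
    return intercept + slope * snip


PLOS_OPEN_ACCESS = (566.0, 844.0)    # (intercept, slope) as in Eq. (4)
ELSEVIER_HYBRID = (2708.0, 203.0)    # (intercept, slope) as in Eq. (5)

# SNIP levels mentioned in the text: field average, UK median,
# UK third quartile, and roughly the highest value in the data set.
snip_levels = (1.0, 1.35, 1.78, 15.0)
predictions = {
    snip: (predict_apc(*PLOS_OPEN_ACCESS, snip),
           predict_apc(*ELSEVIER_HYBRID, snip))
    for snip in snip_levels
}

# SNIP at which both equations predict the same APC.
break_even_snip = (2708.0 - 566.0) / (844.0 - 203.0)

for snip, (plos, elsevier) in predictions.items():
    print(f"SNIP {snip:>5}: PLoS {plos:9.2f}  Elsevier hybrid {elsevier:9.2f}")
print(f"Break-even SNIP: {break_even_snip:.2f}")
```

With these rounded coefficients the two lines cross at a SNIP of roughly 3.3, so for the large majority of reported UK articles, which appeared in journals with a SNIP below 2, the Elsevier hybrid prediction lies above the PLoS prediction.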
To conclude, the journal's impact mirrors APCs in open-access journals, especially at open-access publishers, far better than in hybrid journals, particularly those that are published by the big, traditionally subscription-based publishers.
Table 5 presents the regression results of two models based on the total sample. In Model 2, country dummy variables are added to account for country-specific effects (Austria is the baseline country), but their interpretation can be questioned due to the sample-selection problem. The overall findings are the same, but the magnitudes of the coefficients differ somewhat. Because of the sample-selection problem, my conclusions are drawn from the UK sample (Model 4 in Table 3).

DISCUSSION
Note: The in-sample APC prediction for an open-access journal with a SNIP score of 15 is a rather hypothetical consideration, as no open-access journal has comparable impact.
The purpose of this paper was to identify publishers' APC-pricing behavior according to some characteristics of their journals. The results provide evidence that the journal's citation impact […] Solomon and Björk (2012), Björk and Solomon (2014), and the University of California Libraries (2016), my analysis shows that there is a positive and statistically significant relationship between the citation impact and the requested APC, for both open-access and hybrid journals. In fact, two pricing patterns emerge. The journal's impact greatly influences APC levels in open-access journals, whereas it only slightly alters APCs in hybrid journals. In open-access journals, each additional SNIP score is associated with a €845 higher APC but only with €203 more in hybrid journals. In this respect, my regression analysis confirms the insights from descriptive statistics of Romeu et al. (2014), who found that APCs are much more strongly correlated with the JIF in open-access journals than in hybrid journals. The University of California Libraries (2016) were the first to perform a regression analysis, albeit on a small sample, without any controls and without reported significance levels. Their finding that each additional SNIP point is associated with an approximately $710 higher APC in open-access journals fits surprisingly well with the results of my analysis. Björk and Solomon (2014), Jahn and Tullney (2016), and the University of California Libraries (2016) argue that APCs in hybrid journals are on average higher than in open-access journals. However, they did not control for the journal citation impact.
In fact, I present convincing evidence that the fraction of the APC that is not related to the citation impact is much higher for publications in hybrid journals than in open-access journals (an additional €1,530 for the base group). Moreover, my data suggest that the native open-access publisher PLoS tends to charge less than traditional subscription-based publishers (Elsevier and Springer Nature) for comparable journals, which is in line with the conclusions of the literature (Björk & Solomon, 2014; Jahn & Tullney, 2016). In addition, I can confirm the influence of the scientific discipline on APCs found by Solomon and Björk (2012) and the University of California Libraries (2016). APCs for publications in life and health sciences are more expensive than in physical sciences and least expensive in social sciences and humanities, even when controlled for other journal-related factors. To sum up, hybrid journals tend to be more expensive and are less sensitive to their citation impact than open-access journals. With reference to the title of this paper, the evidence suggests that APCs mirror the citation impact in open-access journals, especially at native open-access publishers, but are a legacy of the subscription-based model in hybrid journals, often at Elsevier, Springer Nature, and similar publishers.
Overall, this paper largely confirms the previous knowledge obtained from descriptive statistics on the relations between APCs and journal attributes. This paper's main contribution is to control for interdependencies between the above-discussed factors. To isolate the marginal effect of one variable (e.g., citation impact) on APCs, it is necessary to take into account the other relationships that might influence APCs. This was done in the regression analysis. Moreover, with the help of statistical inference, it is possible to calculate confidence intervals and perform significance tests on the observed relationships. Provided the APC equation is correctly specified, this paper
• demonstrates that the relationship between APCs and the other variables is not random, and
• shows the magnitude (in euros) of the marginal effect of each variable on the APC level.
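To make the notion of a marginal effect concrete, the two pricing patterns can be written in a stylized form. This is only a sketch: it assumes a single interaction between SNIP and hybrid status, collects all remaining controls (publisher, subject area, year) in a vector x_i, and uses symbols (β, γ) that are illustrative rather than taken from the estimated model, which may contain further terms.

```latex
\begin{align}
\mathrm{APC}_i &= \beta_0 + \beta_1\,\mathrm{SNIP}_i + \beta_2\,\mathrm{Hybrid}_i
  + \beta_3\,(\mathrm{SNIP}_i \times \mathrm{Hybrid}_i) + \gamma^{\top} x_i + \varepsilon_i \\
\frac{\partial\,\mathrm{APC}_i}{\partial\,\mathrm{SNIP}_i} &= \beta_1 + \beta_3\,\mathrm{Hybrid}_i
\end{align}
% Point estimates quoted in the discussion (base group, euros):
%   \beta_1 \approx 845               (slope in open-access journals)
%   \beta_1 + \beta_3 \approx 203     (slope in hybrid journals), so \beta_3 \approx -642
%   \beta_2 \approx 1530              (hybrid premium unrelated to citation impact)
% Under these assumptions, the two price lines would cross at
%   \mathrm{SNIP}^{*} = -\beta_2 / \beta_3 \approx 1530 / 642 \approx 2.4
```

Read this way, a hybrid journal in the base group would be predicted to charge more than an otherwise identical open-access journal as long as its SNIP stays below roughly 2.4, and less above that value. The crossing point is derived arithmetic under the stated simplification, not a result reported in the tables.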
The estimated equation could be used to predict APCs (in euros) for currently closed-access journals or for journals for which we lack APC information. Moreover, the estimated equation can help to answer two questions relevant for policy design and for making strategic decisions in libraries. The first is how much hybrid journals would charge if they flipped to open access and adopted the open-access price-setting behavior. The second is how much open-access journals would charge if they adopted the hybrid price-setting behavior.
To get an idea of what the two pricing patterns imply for the financial aspects of the open-access transformation, I calculated two hypothetical scenarios. What would have been the total APC amount if all articles recorded in OpenAPC had been charged as if they were published in open-access journals? And what would the sum have been if they had all been published in hybrid journals (leaving other journal characteristics unchanged)? Table 6 presents the hypothetical amounts in euros for the UK sample from 2014 to 2017 and for the total sample and compares them with the actual sums. The calculations show that the UK higher education and research system would have saved more than €11 million on OpenAPC-recorded articles if all journals had charged according to the open-access pricing pattern. In contrast, all countries would have spent about €25 million more on APCs if all articles recorded in OpenAPC had been charged according to the hybrid pattern. Across all APCs paid from these countries, the effects would have been even larger.
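The logic of such a scenario calculation can be sketched in a few lines of Python. The sketch rests on assumptions that are not taken from the paper: it re-prices each article by shifting its actual APC by the predicted gap between the two patterns (the paper may instead have summed fitted values), it uses only the base-group point estimates quoted above, and the three article records are invented purely for illustration. The function and field names are hypothetical.

```python
# Point estimates quoted in the discussion (euros, base group).
SLOPE_OA = 845.0        # APC increase per SNIP point, open-access journals
SLOPE_HYBRID = 203.0    # APC increase per SNIP point, hybrid journals
HYBRID_PREMIUM = 1530.0 # hybrid surcharge unrelated to citation impact


def hybrid_minus_oa(snip):
    """Predicted APC under the hybrid pattern minus the prediction under
    the open-access pattern, for a journal with the given SNIP and
    otherwise identical characteristics."""
    return HYBRID_PREMIUM + (SLOPE_HYBRID - SLOPE_OA) * snip


def scenario_totals(articles):
    """Return (actual, all_open_access, all_hybrid) APC totals.

    Assumption: an article is re-priced by shifting its actual APC by the
    predicted gap between the two pricing patterns."""
    actual = all_oa = all_hybrid = 0.0
    for art in articles:
        gap = hybrid_minus_oa(art["snip"])
        actual += art["apc"]
        if art["hybrid"]:
            all_oa += art["apc"] - gap      # hybrid article priced as open access
            all_hybrid += art["apc"]
        else:
            all_oa += art["apc"]
            all_hybrid += art["apc"] + gap  # open-access article priced as hybrid
    return actual, all_oa, all_hybrid


# Invented records for illustration only (not OpenAPC data).
articles = [
    {"apc": 2500.0, "snip": 1.0, "hybrid": True},
    {"apc": 1400.0, "snip": 1.0, "hybrid": False},
    {"apc": 3000.0, "snip": 2.0, "hybrid": True},
]

actual, all_oa, all_hybrid = scenario_totals(articles)
print(f"actual: {actual:.0f}  all open access: {all_oa:.0f}  all hybrid: {all_hybrid:.0f}")
```

For these toy records, the all-open-access total falls below the actual total and the all-hybrid total rises above it, because every invented journal has a SNIP below the point at which the two price lines cross.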
Which pricing behavior will dominate once journals have been fully flipped is crucial. If the pricing behavior of the traditional, subscription-based publishers prevails, the open-access transformation will come at a much higher cost for libraries and for higher education and research institutions than is expected today. Therefore, provisions to introduce competition between publishers and journals are of utmost importance.
The rationale for linking APCs to the journal citation impact is clearly research evaluation. Currently, the evaluation of individual researchers and even entire higher education and research institutions depends heavily on journal citation metrics such as the SNIP or the JIF. Consequently, researchers tend to pursue SNIP/JIF-maximizing publishing and will probably pay any APC that they can afford or that their funder covers. If research funders and higher education and research institutions could shift toward assessment based on researchers' own achievements rather than on the journal in which the research is published, APCs in journals would likely be more level and competitive than observed today.

CONCLUSION
APCs are gaining importance as one of the main business models for open-access publishing in journals. By investigating the journal-related factors influencing APC levels, this paper presented key findings that could be used to assess whether the open-access transformation of journals is financially viable for individual higher education and research institutions as well as for entire countries.
The results show that the journal's impact and the hybrid status are the most important factors for the level of an APC. However, the journal's impact has little effect on APCs for publications in hybrid journals, whereas it is crucial for the level of APCs in open-access journals. The journal's subject area and publisher also affect APCs. Moreover, the year of payment influences APCs, although this paper cannot identify whether this is due to price increases or exchange rate movements. To date, it remains an open question how (country-specific) conditions for research and open-access funding interact with APCs.

ACKNOWLEDGMENTS
The author gratefully acknowledges the constructive comments of two anonymous referees, which helped to greatly improve this final version.