In the midst of a transition towards an Open Access model in which publishing costs are covered by Article Processing Charges (APCs), the question of how to comprehensively collect information about APC payments in institutional research information management systems (either CRISs or institutional repositories) remains unsolved. Progress is being made, mostly in the UK, but the practice is still far from widespread.
There are two main issues with this information collection process at institutions. The first is the lack of institutional policies requiring this information to be collected centrally. As a result of this gap in the reporting requirements, Open Access experts at universities estimate that only around 30% of the APC payments made by institutional researchers are currently 'seen' by the institution, usually by the library (this was the estimate for the Netherlands about a year ago, while for the UK, where there is intensive activity to involve institutions in the process, it was 80%). This is also due to a strong culture among researchers of dealing with publishers independently when making these APC payments. Whether this cultural issue arises from the lack of institutional workflows to collect the information, or from researchers' perception that involving their institution will slow down the publishing process by making it more bureaucratic, needs further investigation. The fact remains that this very relevant information about Open Access costs is often invisible to institutions. This prevents them from collecting accurate figures on an important part of their Open Access expenses, and makes it harder to obtain reliable data for offsetting subscription against APC expenses.
The other big issue is the difficulty that research information management systems (both CRISs and repositories) currently have in adequately collecting information on APC payments. This is mainly because an APC payment record has two parts: the bibliographic information about the publication and the economic information about the payment. While the publication metadata can be managed effectively by both repositories and CRISs, the economic information will usually be coded into the institutional Finance Module, which as a rule does not talk to the other systems. There have been efforts to screen all payments coded in the institutional Finance Module as a means of identifying APC payments, but even this time-consuming exercise fails because payment data is not regularly coded in a standardised manner (in strong contrast to, e.g., publication metadata). The Finance Module is completely detached from the institutional services devoted to scholarly communication, and as a result little harmonisation has reached the workflows.
It could well be that the above-mentioned gap in the collection of APC payment data at institutions is a direct result of the inability of current research information management (RIM) systems to handle this information collection process. Early initiatives for reporting on APC payments in the most advanced countries in the area (UK, Germany, Norway) have used customised Excel sheets or Access databases to collect this data. This is far from optimal and hardly sustainable, and it can hinder progress in cross-institutional reporting, even though UK institutions have already agreed on a common metadata standard for collectively reporting APC payments to funders, one that is gradually spreading to other countries.
One potentially useful strategy for gradually tackling this research information management gap at institutions would be to model the metadata structure for APC payment reporting in CERIF. This would give CRIS systems, both those developed in-house and those provided by commercial vendors, the opportunity to gradually become able to deal with this data. It would in turn allow institutional administrators to require researchers to report these payments systematically into the CRIS (researchers could still keep dealing with publishers independently if they wished).
The result of this discussion is the sketch below for modelling an APC payment in CERIF. It is in fact a rather simple data model enhancement that does not require new entities to be put in place, since CERIF already contains all the required contextual metadata elements.
Recording the APC of an article in CERIF is not technically difficult. CERIF has all the building blocks in place, so they only need to be used to the desired effect.
Assume we want to record the APC of an article. That article is represented as a cfResultPublication instance. Now we insert a cfFunding instance that:
- records the amount (cfAmount) and the currency (cfCurrencyCode) of the APC;
- is typed as “Dissemination Costs” (using a unary classification) – cfFunding_Classification;
- is linked to the article (with a role of “APC”) – cfResultPublication_Funding.
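The three steps above can be sketched in a few lines of Python. This is only an illustrative in-memory sketch, not an implementation of the CERIF XML or relational schema: the class names mirror the entity names used in this post, while the field names, the `record_apc` helper, the identifiers and the sample amount are all hypothetical and chosen for readability.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class CfResultPublication:
    """The article whose APC is being recorded."""
    id: str
    title: str


@dataclass
class CfFunding:
    """The APC itself: an amount plus a currency."""
    id: str
    amount: Decimal       # cfAmount
    currency_code: str    # cfCurrencyCode


@dataclass
class CfFundingClassification:
    """Link entity that types a funding record (cfFunding_Classification)."""
    funding_id: str
    classification: str


@dataclass
class CfResultPublicationFunding:
    """Link entity tying an article to a funding record, with a role."""
    publication_id: str
    funding_id: str
    role: str


def record_apc(publication, funding_id, amount, currency_code):
    """Hypothetical helper: build the three records described in the list above."""
    funding = CfFunding(funding_id, Decimal(amount), currency_code)
    kind = CfFundingClassification(funding_id, "Dissemination Costs")
    link = CfResultPublicationFunding(publication.id, funding_id, "APC")
    return funding, kind, link


# Made-up sample data, for illustration only.
article = CfResultPublication(id="pub-001", title="An example article")
funding, kind, link = record_apc(article, "fund-001", "1500.00", "EUR")
```

The amount is held as a `Decimal` rather than a float so that monetary values are not subject to binary rounding. Note that the article record itself is left untouched: everything about the APC lives in the funding record and the two link records.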
So we know how much the APC was. Another interesting question is – who covered it?
In the end it’s going to be a funder, possibly several of them:
This way the APC can be conveniently tracked in CERIF. CERIF’s flexibility allows for an even greater level of detail. One could, for instance, track the APC invoices from the publishers to the paying institutions. We’ll elaborate on this in a later blog post, if there is interest.
Keywords: CERIF, data model, gold open access, article processing charges, research information management