Earlier this year, we announced our participation in the Data Citation Implementation Pilot and our commitment to implementing proper data citation for our journals in the coming year. Since February, we have made enormous progress and are proud that we recently made data citation available for our journals. Authors publishing with Elsevier are now encouraged to cite the datasets underlying their work in a consistent and persistent way.
As with article, book and web citations, the dataset will be cited at the relevant place in the text of the article, and the reference will appear in the reference list, formatted in the same way as other references. Where possible, the reference will provide a direct link to the stored dataset, making it even easier for both reviewers and readers to access relevant datasets.
Enabling and encouraging data citation
Fuelled in part by funding agencies and research institutions, there is an increasing focus on opening up research data to make science more transparent and reproducible. The challenge in making this happen has been two-fold: firstly, making sure the infrastructure is in place to physically support sharing of research data, and secondly, getting researchers to actually start sharing their data. With this in mind, being able to give researchers credit for sharing their data is a critical driver for increasing the availability of research data. This has always been one of the principles underlying Elsevier’s research data policy.
Crediting the creators of data through proper data citations is one of the reasons FORCE11, a community of scholars, librarians, archivists, publishers and research funders, drafted the Joint Declaration of Data Citation Principles in 2014. These principles offer guidance to make research data an integral part of the scholarly record, properly preserved and easily accessible. When data is recognized as an important outcome of research and when proper data citation is acknowledged as good research practice, researchers can get credit for their data, just as they get credit through citations to their articles. The principles cover the purpose and function of data citation and also describe how authors can cite data in their articles. Elsevier contributed to drafting these principles, which were subsequently endorsed by a large contingent of STM publishers, data repositories and research institutions.
What is needed for data citation?
The support across the community demonstrated agreement around the importance of data citation, but as a publisher, we soon realized that encouraging authors to cite data was only one part of the story. We had to think about the actions needed to actually implement a new type of citation into the scholarly record – and about the infrastructure needed to support this in an author-friendly way.
Data references consist of the elements Author, Title, Year, Data Repository, Version and Persistent Identifier, with the order depending on the reference style. Some of these elements are also present in an article reference, but others are not. These new fields needed to be added in the production system so data citations can be processed correctly when authors submit their manuscripts. In addition, a “tag” to distinguish data references had to be created so data citations can be recognized, which is required to properly display these references or process them in other systems. This wasn’t just operational; it was also about engaging our colleagues, such as typesetters, to understand what data citations are and update our many procedures. The steps were clear, but with over 2,000 journals using different workflows and reference styles, their implementation presented a challenge.
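To make the structure of a data reference more concrete, here is a minimal Python sketch of the six elements listed above, a tag that marks the reference as a dataset, and element ordering that varies by reference style. It is an illustration only and does not describe Elsevier's production system: the class name, the `ref_type` tag value, the two style names and their element orders, and the example DOI are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataReference:
    """The six elements of a data reference named in the text."""
    authors: str
    title: str
    year: int
    repository: str
    version: str
    identifier: str  # persistent identifier, for example a DOI
    # Hypothetical tag value used to tell data references apart
    # from article, book and web references.
    ref_type: str = "dataset"


# Hypothetical reference styles: same elements, different order.
STYLE_ORDER = {
    "author_year": ("authors", "year", "title",
                    "repository", "version", "identifier"),
    "year_last": ("authors", "title", "repository",
                  "version", "year", "identifier"),
}


def format_reference(ref: DataReference, style: str) -> str:
    """Join the elements in the order the chosen style prescribes."""
    parts = [str(getattr(ref, name)) for name in STYLE_ORDER[style]]
    return ", ".join(parts) + "."


def is_data_reference(ref: object) -> bool:
    """Downstream systems can use the tag to recognize data citations."""
    return getattr(ref, "ref_type", None) == "dataset"


# Made-up example values, not a real dataset.
example = DataReference(
    authors="Doe, J.",
    title="Example measurements",
    year=2016,
    repository="Example Repository",
    version="v1",
    identifier="https://doi.org/10.0000/example",
)

print(format_reference(example, "author_year"))
print(format_reference(example, "year_last"))
```

Keeping the elements separate from the style table mirrors the point made above: the fields stay the same across journals, while each reference style decides only the order in which they appear.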
Working towards implementation and roll-out
In 2016, two important projects around data citation started. In the context of FORCE11, an NIH-funded project on the implementation of the Joint Declaration of Data Citation Principles kicked off. Several expert groups started working on practical guidance, including a publishers’ expert group co-chaired by Elsevier. This group was tasked with developing a roadmap for publishers implementing the data citation principles. Meanwhile, Elsevier started an internal project around transparency and openness, of which data citation is an important part. Because these projects ran in parallel, Elsevier was able to discuss the best approach for implementation with other publishers, while also putting it into practice in a real-life environment.
At this point, we have rolled out data citation to over 1,800 of our journals. To communicate this to authors, we have updated our guides for authors and are also providing information and education on data citation through several communication channels, including the Elsevier Publishing Campus. Based on the data tags that were implemented, we will be able to provide valuable feedback to the data community on the progress of data citations.
Benefits for authors
Authors publishing in Elsevier journals are now encouraged to cite datasets that support the conclusions of their articles, and they can do this in an easy and familiar way. Such datasets could be generated and made available by the article authors themselves as part of the research presented. They could also be pre-existing datasets that were re-used for or compared with the research presented.
This development has a number of benefits. Firstly, data citations give users a direct link to the stored dataset, making it even easier for both reviewers and readers to access and reuse relevant datasets. Secondly, it will encourage authors to share their data because there is now a way to formally recognize when datasets have been used in the published article.
Data citation will open up possibilities to develop data-specific metrics to enable users to compare performance among datasets in similar disciplines. These metrics would sit alongside others, such as social sharing, deposition rate, views and downloads. This means that in the not too distant future, we can start contributing towards a reward structure for researchers sharing research data.
Data citation webinar
In a webinar on Tuesday, December 6, researchers can learn more about the principles and practice of data citation. Dr. Timothy Clark, who started the Data Citation Implementation Pilot for FORCE11, will explain the Joint Declaration of Data Citation Principles, why data citation is important, and how data can be made citable. Dr. Helena Cousijn, co-chair of the publisher early adopters group, will explain how you can start citing data today. Learn more and register.
If you still have questions after the webinar, please join our Mendeley Data Citation group, where Timothy Clark, Helena Cousijn and other experts in the field will be happy to answer your questions and discuss the future of data citation.