Data citation is becoming real with FORCE11 and Elsevier
At a Boston workshop, Elsevier detailed its efforts to credit researchers who make their data available by supporting research data citation
By Mike Taylor Posted on 15 February 2016
At Elsevier, one of our key activities involves creating new ways to support the posting, publishing and citation of research data. This is crucial to encouraging re-use of research data and enabling the reproducibility of published research. Publishing data alongside the traditional article publications means a new kind of citation is needed: a citation to the research data set.
What is research data, and why make it available?
Research data is usually considered the primary data source that supports scholarly research, and access to the data is necessary to validate research findings and results. All digital and non-digital content has the potential to become research data. The data might be derived from any channel: it might come from observations and experiments, it might be publicly accessible, or it might be data that’s already been used in previous research. The community understands that making data available is a time-consuming activity, and providing for its formal publishing and citation is a key stage in rewarding the researchers for making the data available.
By partnering with organizations like FORCE11, a community of scholars, librarians, archivists, publishers and research funders, we’ve been able to lead the way in this very important space.
Publication and citation are accepted by the scholarly community as the principal way of acknowledging the value and impact of a researcher’s contributions to the development of knowledge in their field. The practice of selecting items for publication and subsequently for citation is immensely valuable and trusted by researchers globally, but it has been difficult until now to apply it to non-traditional research output.
Elsevier is committed to promoting the validation and publication of a broader spectrum of research outputs, such as data sets and software, and crediting their contributors, and we are working with many community organizations to enable the publication and citation of data across our journals and research intelligence products. We have journals for peer-reviewed data, such as Data in Brief, and we support the posting and sharing of non-peer-reviewed data on data.mendeley.com.
Citation will open up the possibilities of developing data-specific metrics to enable users to compare performance among datasets in similar disciplines. These metrics will sit alongside others, such as social sharing, deposition rate, usage and downloads.
FORCE11 and Data Citation
At the FORCE11 Data Citation Workshop on February 2 in Boston, Dr. William Gunn, Director of Scholarly Communication for Mendeley, and I were joined by individuals representing publishing, data repositories, research and the broader scholarly communication community. The workshop was designed to stimulate and promote data citation and co-ordinate pilot projects. Members heard from invited experts working on implementation, including Joan Starr, Senior Project Manager at California Digital Library (CDL); Dr. Patricia Cruse, Executive Director of DataCite; Dr. Mercè Crosas, Chief Data Science & Technology Officer at the Institute for Quantitative Social Science (IQSS) at Harvard; and Dr. Maryann Martone, professor and Co-Director of the National Center for Microscopy and Imaging Research at the University of California, San Diego.
“Elsevier is having an important impact on the movement to promote sound, reproducible scholarship,” said Dr. Tim Clark, Assistant Professor at Harvard Medical School and leader of the workshop. “Their contribution towards implementing data citation and data metrics is particularly welcomed.”
Elsevier has been among the first organisations to embrace research data as a crucial research output and thus also to contribute to the development of community-supported data citation principles, having first endorsed the Joint Declaration of Data Citation Principles in 2015.
Elsevier will also be among the first to complete the implementation of those principles: we are proud to announce that the work needed to implement proper data citation throughout our journals and principal products is underway, with an ambition of achieving full implementation in 2017.
Progress towards research data deposition and citation has been swift and largely orchestrated by FORCE11, a community of researchers, funders, publishers and others interested in shaping the future of scholarly communication. In February 2014, FORCE11 published their Data Citation Principles, and in May 2015, FORCE11 members contributed to a significant paper that drilled into some of the steps needed to facilitate the identification and discovery of data as a citable object. (Starr et al.: “Achieving human and machine accessibility of cited data in scholarly publications.”)
Research Elements: Elsevier’s data and code journals
To make it easier for researchers to get credit for the work they’ve done preparing and conducting their experiments, Elsevier launched a series of peer-reviewed journal titles grouped under the umbrella of Research Elements that allow researchers to publish their data, software, materials and methods and other elements of the research cycle in a brief article format.
On publication, all Research Elements are assigned persistent identifiers for direct citation and easy discoverability. Persistent identifiers are also used for interlinking Research Elements and relevant research papers published in traditional journals.
“Research Data is of the utmost importance for researchers in their advancement of knowledge,” said Dr. Philippe Terheggen, Managing Director, Elsevier Journals. “Therefore, Elsevier fully supports the proper discovery and citation of such data, and we will offer it to researchers wherever possible.”
Implementing data citation
Implementing research data as a new type of publication and enabling data citation from traditional publications to the datasets is no trivial task. From the ongoing work to implement data policies in our 2,500+ journals via our editorial systems of typesetting, tagging and quality management, to the way a piece of research data or citation to it appears on a publication’s page with the proper tags and links, there are many external and internal stakeholders who have to agree on each step. Having documents such as the Data Citation Principles and associated work provides a valuable blueprint for this work, and we’re proud to play a part in this community endeavour.
Research data metrics
With proper citation come useful metrics to fully credit the creators of the research data output cited. I’m also a co-chair of a NISO working group, which will be making recommendations in the data metrics space, bringing together work from multiple community organizations, including the Research Data Alliance (RDA), CASRAI, COUNTER, FORCE11 and others. Elsevier is planning to implement a set of research data metrics as soon as research data becomes available in sufficient quantity; our metrics will derive not only from citation activity but also from usage data (links and downloads) and sharing data (Mendeley, Twitter, news, blogs and other sources).
Elsevier’s implementation of research data publishing, viewing and citation, with accompanying metrics so that researchers can be properly credited for their efforts, will be complex, and we will be working on the launch into 2017. In the meantime, we are fully committed to working with the community through FORCE11’s data citation expert groups, several of which were set up at this pilot workshop. We hope these expert groups will become a valuable resource for repositories, publishers and research data creators to exchange information.
Disclaimer: Mike Taylor is listed as an author on Starr et al.
Elsevier Connect Contributor
Mike Taylor (@herrison) has been part of Elsevier for almost 20 years, and has been involved in altmetrics for almost five years. Currently, he is Senior Product Manager in the Research Metrics group. He co-chairs the NISO Working Group on metrics for non-traditional research outputs, and is one of the founders and organizers of the Altmetrics Conference series.