Can data be peer-reviewed?

Two journals at Elsevier take a stab at giving data a seal of approval

Peer review has been at the core of scientific publication since the first scientific journals came into being. Scientists validate conclusions from one another’s research for the benefit of the greater scientific community.

As more disciplines and funding bodies begin to mandate data sharing as a standard component of scientific dissemination, publishers are eager to provide an avenue for scientific validation of data. Unfortunately, rules for how research data should be evaluated across scientific disciplines are not as straightforward as one might hope.

While funders and scientists agree that sharing data is valuable for the advancement of science, exactly how to share and review the data remains unclear. Two Elsevier data journals are exploring these issues along with members of the research community.

We recently took part in a discussion of how to address the sharing and reuse of data within the European proteomics community. While we could pinpoint ways that tables of protein hits (the core data that emerges from most proteomic studies) could be shared in a standardized way, we kept reminding ourselves that proteomic studies also produce other kinds of data, such as images, graphs and the original raw data.

Scientists across all disciplines create a wealth of data in many formats. Unfortunately, as reflected in the data publication pyramid in Figure 1, only a small percentage of this data is actually disseminated to the scientific community via the publication of peer-reviewed research articles.

Figure 1. Different types of data represented in a data pyramid adapted from the Opportunities for Data Exchange (ODE) Report on Integration of data and publications, 2011

An increasing number of data repositories enable researchers to share their data with the scientific community. These data can be accessed directly through the data repositories and in some cases also via the article, for example through Elsevier’s database linking program. While these data have been publicly posted by authors, in many cases they have not been externally validated by peer review, the standard requirement for the publication of other scientific output, such as research articles. There should be a clear distinction between posted data and peer-reviewed, published data.

Data journals

Scientists need a clear framework to submit data for review and publication, and the research community is increasingly aware of this need. As a respondent to one of our recent author surveys said: “I can imagine that extensive datasets collected throughout long and mundane procedures, which sit on the margin of truly creative research (experimental or computational), might turn out to be very useful for other researchers.”

With Genomics Data, a journal focused on data sharing within the genomics community, and Data in Brief, a broad data journal across all scientific disciplines that only publishes data articles, we provide authors with the opportunity to get credit for their data with a publication. Data articles are articles that solely describe data, and all data they describe must be made publicly available either through a public data repository or with the article itself. These data articles give data that would previously have been left unpublished (Figure 1) a validated dissemination venue.

Authors for both journals are required to follow a journal-standardized data article template that aims to highlight metadata in the Specifications Table, specify the location and value of the data, and provide materials and methods background and analysis. As a result, reviewers can more easily scan the article to see whether it does a sufficient job of explaining the associated data. Published examples of these data articles can be found in Genomics Data and Data in Brief.

Reviewing data articles

Reviewing data and data articles is very different from reviewing research articles because the data is judged not on its significance but on its utility and potential for reuse. Dr. Laurel Larsen, Assistant Professor in the Department of Geography at UC Berkeley, appreciates the spirit behind data journals and says that “as scientists, we need to put more effort into making our data transparent.”

Data reviewers at our data journals are experts in the field who evaluate whether or not a dataset and its corresponding description in a data article match. The criteria we currently use for data review are based on panel discussions held during the 2013 Now and Future of Data Publishing Symposium. Various publishers and representatives of data repositories gathered to discuss what the main principles and best practices for data review/curation should be. Reviewers use those criteria, listed below, when reviewing data and data articles for the Data in Brief journal.

Criteria for evaluating data

  • Do the description and data make sense?
  • Do the authors adequately explain the data’s utility to the community?
  • Are the protocol/references for generating data adequate?
  • Data format: is it standard for the field? Potentially re-usable?
  • Does the article follow the required data article template?
  • Is the data well documented?
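For readers who think in code, the checklist above can be sketched as a small data structure. This is only an illustration, not a tool used by either journal: the short keys, the `ReviewResult` class and the `summarize_review` function are all hypothetical names invented for this example, and the questions are lightly paraphrased from the list.

```python
from dataclasses import dataclass

# The six review questions, paraphrased from the criteria listed above.
# The short keys are hypothetical labels chosen for this sketch.
CRITERIA = {
    "coherent": "Do the description and data make sense?",
    "utility": "Do the authors adequately explain the data's utility to the community?",
    "protocol": "Are the protocol/references for generating data adequate?",
    "format": "Is the data format standard for the field and potentially reusable?",
    "template": "Does the article follow the required data article template?",
    "documented": "Is the data well documented?",
}


@dataclass
class ReviewResult:
    unmet: list[str]  # keys of criteria the reviewer did not answer "yes" to

    @property
    def needs_revision(self) -> bool:
        return bool(self.unmet)


def summarize_review(answers: dict[str, bool]) -> ReviewResult:
    """Collect every criterion answered "no" or left unanswered."""
    unmet = [key for key in CRITERIA if not answers.get(key, False)]
    return ReviewResult(unmet=unmet)


# Example: a reviewer is satisfied on every point except the data format.
answers = {key: True for key in CRITERIA}
answers["format"] = False
result = summarize_review(answers)
print(result.unmet)           # ['format']
print(result.needs_revision)  # True
```

Treating an unanswered question as unmet is a design choice of this sketch; it simply keeps an incomplete review from passing silently.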

We are constantly evaluating these criteria and aim to adopt the best data peer-review practices developed by the scientific community. In general, data peer-review is fast and transparent. If the data is solid and useful, some minor revisions may be needed, but it is important to note that authors are never requested to re-run their experiments or generate a new dataset and then resubmit. Null/negative and intermediate results (or in this case the data underlying those) are as important for these journals as groundbreaking findings.

Bringing data to the forefront

Authors also often publish portions of their datasets as supplementary material. Unfortunately, though the supplementary data is given a stamp of approval as part of the research article during peer review, the structure and the visibility of this supplementary data are highly variable. More than 50 Elsevier journals give authors the option to convert supporting data into a data article that they submit alongside their research article. If the research article is accepted for publication, we send the associated data article directly to Data in Brief for a final editorial review. The Data in Brief editors ask the same questions listed above to review these “transferred” data articles.

We hope that Genomics Data and Data in Brief prove to be good starting points in trying to tackle the peer review of data. We expect the criteria for data peer review through data articles will evolve as we receive feedback from researchers and learn from other publishers about how they address data peer review in their journals.

How to submit your own data article for peer review

The fee for data articles in both journals is $500. Authors who submit in 2015 receive a reduced open access fee of $250.

Experimenting with peer review

Data is not the only tricky research output for peer review. Several other journals at Elsevier are tackling the peer review of fine-tuned protocols and software.

SoftwareX focuses on the publication of original software. Reviewers are asked a series of questions and must be convinced that the software runs smoothly and can be used in the way the author claims. If possible, the reviewer will run the software, but often the software is part of a larger framework or needs supercomputing, in which case the author must demonstrate that the software works. We ask the author to submit videos or AudioSlides showing the software in action.

MethodsX focuses on methodological details, which requires manuscripts to be evaluated from that particular angle. Reviewers are asked to answer simple questions related to the technical procedure, such as “Are the procedures suggested by the authors plausible?” or “Are the methods clear and logical to follow, so that someone else could reproduce them easily?” This has helped keep the reviews consistent and focused, requiring minimal time investment from the reviewers while providing meaningful feedback to authors.

Elsevier Connect Contributors

Dr. Paige Shaklee (@p_shaklee) made her way from studying physics at Colorado School of Mines to nanoscience at TU Delft to biophysics at Leiden University, where she received her PhD. After doing postdoctoral research in Biochemistry at Stanford University, she joined Cell Press in 2011 as the Editor of Trends in Biotechnology. Last year, she joined Elsevier's biochemistry publishing team as a Publisher for the Genomics portfolio. She is based in Cambridge, Massachusetts.

Dr. Helena Cousijn obtained a PhD in neuroscience from the University of Oxford, where she developed a strong interest in research data. Having worked with various kinds of data and on several data-related challenges, she is now the Product Manager for Research Data at Elsevier. In this role, she is responsible for finding solutions to help researchers store, share, discover and use data. Helena is based in Amsterdam.
