“Datasets are an important output in their own right”: How OpenAIRE, euroCRIS and Elsevier are stimulating open science

An increased metadata exchange between Elsevier's research solutions and OpenAIRE will make datasets more available and reusable

If you’re a researcher with a Horizon 2020 grant, you are obliged to publish your research under an open access model. What’s more, for the past few years, in most cases you have also had to make your data output publicly accessible. The same is true for many other grants.

The OpenAIRE (Open Access Infrastructure for Research in Europe) platform supports this policy and boosts open science in Europe and across the world. To help researchers comply with regulations and make datasets more discoverable and (re)usable, Elsevier’s core platforms automatically send information about publications and research data to OpenAIRE, which also acquires links via other trusted sources, such as DataCite, Crossref Event Data, and several data repositories worldwide.

Now, an increased metadata exchange between Elsevier's research solutions and OpenAIRE will make datasets more available and (re)usable. This development will make tracking the impact of research easier, relieve researchers of the burden of providing the same metadata multiple times, and favor a change in the way research is evaluated.

Through OpenAIRE, researchers, repository managers and research administrators can share and retrieve information about research funded by the European Union. This data includes researchers, their institutional affiliations, the journal a study was published in and related research data and software. This makes it possible to see, for example, which grant a particular study was funded by or which publications and research data resulted from a specific grant. Anyone can browse this information through OpenAIRE Explore.

Like Elsevier’s other platforms, including ScienceDirect, Scopus and Mendeley Data, Elsevier’s research information management system Pure has supported OpenAIRE for several years. Pure collects, manages and visualizes information about institutions’ research outputs, activities and impact, pulling in and aggregating research information from numerous internal and external sources of an institute. Now, Pure is increasing the metadata exchange with OpenAIRE using the open standard CERIF. That will make the process of complying with regulations even more effective and efficient for researchers.

Anna Clements, Assistant Director of Digital Research at the library of the University of St. Andrews in Scotland, was a driving force behind the integration. “The team from Elsevier was very keen to work with Pure customers and euroCRIS to set up the data feed,” she said.

Scientific reward

Being able to connect publications with datasets and grants and other information benefits different stakeholders in the world of research. For example, a few clicks can show you all the publications and datasets pertaining to one particular grant. This can be useful for monitoring and reporting purposes.

Another advantage is that the links are made automatically. This means that the quality will likely be higher than when academics must upload this information themselves.

“The quality of metadata is generally very bad because scientists don’t always spend enough time on it,” Clements said. “The automation removes the burden from researchers to upload this information and saves them time that they could spend on their actual research.”

“It benefits academia in general,” Clements added.

To identify and collect scholarly metadata, OpenAIRE uses CERIF, a format developed by the European organization euroCRIS. CERIF goes beyond the formats of a traditional repository because it includes information about funding projects, datasets, people and their organizations and even equipment.

“The OpenAIRE integration helps researchers make their datasets more available and more usable,” Clements said. “What’s more, because you can easily see all publications that resulted from a certain dataset, this could impact the way that research is evaluated.”

Scientific reward is an important element to consider given that the focus on article publications as the main product for the valorization of science has led to the “publish or perish” culture that pervades academia. “OpenAIRE gives more opportunity for the datasets to be seen as an important output in their own right,” Clements said. This is one of the key ambitions of the open science movement.

The service fits in Elsevier’s vision of an open, agnostic and interoperable system supporting research, in which tools from different providers work together to provide a seamless experience for the user.

Standardized boxes

“Previously, Elsevier worked with individual publishers and data centers to arrange links between data, but this was becoming more and more cumbersome,” said Paolo Manghi, Technical Director of OpenAIRE. “Together, we defined best practices on how to exchange information about links between articles and datasets. As trivial as this may seem, it is not.”

For example, describing datasets can be quite a challenge because each field has its own interpretation of what research data is. In the social sciences, datasets are usually simple files, but in other fields, datasets are time series that are stored in databases.

“It took us quite a while because we used a participatory approach that included data centers, DataCite, Crossref, Elsevier and OpenAIRE,” Manghi continued. Repositories can call the OpenAIRE service ScholeXplorer to get back all the datasets that are connected to a given article and show them on the article’s page, for example on Scopus.
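The lookup described above, going from one article to the datasets linked to it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Link record, the datasets_for_article function and the identifiers are all made up for the example, the links are held in memory, and nothing here reflects the actual OpenAIRE service or its interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One article-to-dataset link (hypothetical structure for this sketch)."""
    article_id: str  # identifier of the article (made-up values below)
    dataset_id: str  # identifier of the linked dataset
    provider: str    # which source reported the link


def datasets_for_article(links, article_id):
    """Return the distinct dataset IDs linked to one article, sorted."""
    return sorted({link.dataset_id for link in links if link.article_id == article_id})


# Made-up links; the same link can arrive from more than one source.
LINKS = [
    Link("article-1", "dataset-a", "publisher"),
    Link("article-1", "dataset-b", "data repository"),
    Link("article-1", "dataset-a", "data repository"),
    Link("article-2", "dataset-c", "publisher"),
]

print(datasets_for_article(LINKS, "article-1"))
```

Because the same link may be reported by several sources, the sketch collects dataset IDs in a set before sorting, so each dataset appears once on the article's page.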

Crucial in this process is that all the data is modelled in the same way. Picture this: you want to fill a truck to the brim with boxes. Then it is in your best interest that all these boxes are the same so you can make the best use of the space. Similarly, if information from different sources is stored in a central location, it should be organized according to the same principles.

The CERIF standard developed by euroCRIS is currently the easiest way to achieve this.

“By using CERIF, we put metadata from systems like Pure in a standardized box and deliver them to the recipient, in this case OpenAIRE,” said euroCRIS President Dr. Ed Simons. Each type of metadata gets a specific tag, and the relationships between metadata are also defined and stored. This creates a nearly endless stream of information about who is doing which research where and how.
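The idea of tagged metadata plus stored relationships can be illustrated with a small Python sketch. It is not the CERIF schema: the Record and MetadataBox classes, the tag names ("Project", "Publication", "Dataset", "Person") and the role names are assumptions invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """One piece of metadata with a type tag (hypothetical structure)."""
    record_id: str
    kind: str   # the tag, e.g. "Project", "Publication", "Dataset", "Person"
    title: str


@dataclass
class MetadataBox:
    """Holds tagged records and the relationships between them."""
    records: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)  # (source_id, role, target_id)

    def add(self, record):
        self.records[record.record_id] = record

    def relate(self, source_id, role, target_id):
        # A relationship is only stored if both ends are known records.
        if source_id not in self.records or target_id not in self.records:
            raise KeyError("both ends of a relationship must be known records")
        self.relations.append((source_id, role, target_id))

    def sources_of(self, target_id, kind):
        """Records with the given tag that point at target_id."""
        return [
            self.records[source]
            for source, _role, target in self.relations
            if target == target_id and self.records[source].kind == kind
        ]


# Made-up example data.
box = MetadataBox()
box.add(Record("proj-1", "Project", "Example funded project"))
box.add(Record("pub-1", "Publication", "Example article"))
box.add(Record("data-1", "Dataset", "Example dataset"))
box.add(Record("pers-1", "Person", "Example researcher"))
box.relate("pub-1", "funded_by", "proj-1")
box.relate("data-1", "funded_by", "proj-1")
box.relate("pub-1", "based_on", "data-1")
box.relate("pers-1", "author_of", "pub-1")

# Which publications resulted from this project? Which used this dataset?
print([r.title for r in box.sources_of("proj-1", "Publication")])
print([r.title for r in box.sources_of("data-1", "Publication")])
```

Because every record carries a tag and every relationship is stored explicitly, questions such as "which publications and datasets resulted from this grant" become simple lookups, whichever system supplied the records.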

“The fact that Pure, a market leader, is implementing our standard is a substantial step towards the standardization of metadata exchanges,” he said. “The more organizations use this standard, the greater the benefit for scholarly communication.”

Clements added: “I am extremely keen that Pure remains an interoperable software solution – I’m always going on about this.”

The information system supporting research

While the technology revolution has brought significant advancements by making an unprecedented amount of data available to researchers, this abundance also presents challenges. With more and more datasets, and more knowledge extracted from that data, it is nearly impossible to decide what is useful for any particular piece of research.

We think of the universe of the tools researchers have at their disposal to execute these tasks as the “information system supporting research.” Today’s challenge is that the existing information system is both outdated and fragmented across many applications and resources, often burdening researchers instead of supporting them.

We invite you to visit our resource center to learn more and explore how we might collectively improve the information system so it meets the needs of researchers.
