Ready access to data is one of the cornerstones of success in modern-day research. The information underpinning articles offers value to other researchers, and many are optimistic it could provide the solution to the reproducibility problem currently facing science.
To encourage data sharing, funders increasingly attach data conditions to the grants they award. However, with the concept of data sharing still in its infancy, there is a long road to travel before the right infrastructure, guidelines, processes and even incentives are in place to support researchers and their institutions.
To help unlock the full potential of research data, Elsevier developed Mendeley Data, a research data management (RDM) solution for institutions that is designed to fully support researchers, librarians and research administrators with integrated workflow tools for the full data lifecycle.
Last year, the Mendeley Data product team partnered with four universities keen to improve RDM practices and adoption at their institutions.
- Monash University: a public research university based in Melbourne, Australia
- Nanyang Technological University (NTU): a research-intensive public university in Singapore
- Rensselaer Polytechnic Institute (RPI): a private university and the oldest engineering college in the US
- University of Hull: a public research university in Yorkshire, England
The partnership resulted in a multi-stranded pilot project to evaluate and improve Data Monitor – a new Mendeley Data module designed to help institutions gather, organize, track and understand the research data generated by their researchers. More than 500 researchers participated across the four institutions, and more than 1,000 research articles were checked for data availability. Phase one recently drew to a close with the Data Monitor Workshop in Amsterdam in December 2018, and further phases are planned throughout this year.
6 key lessons learned so far
The 6 key lessons identified in phase one were:
1. There is no single data sharing solution that suits everybody.
Data sharing needs vary between universities and even within departments, which became very clear early in the project.
For example, Monash University, Nanyang Technological University and the University of Hull have their own institutional data repositories. However, Rensselaer Polytechnic Institute (RPI) has chosen another path. Dr. Andrew C. White, Director of Library Information Services at RPI, explained:
We see the library playing a different role – one that focuses on aspects such as tracking archival practices of raw research data sets, assigning appropriate ontologies/metadata, providing appropriate linking and checking that the data has been stored in a secure location.
Differences in data sharing practices influenced the focus of the pilots at the different universities.
Thus, RPI and the University of Hull decided to focus initially on the data inventory aspects. “The intent of our pilot was not to dictate where a researcher should place research data,” Dr. White said.
Monash decided to go one step further by not only asking researchers where their data is located but also advising how not yet publicly available research data can be shared and linked with their research papers. As Research Infrastructure Librarian Andrew Harrison explained: “We picked data sharing and data linking because this is the behavior we want to strongly encourage our research community to adopt.”
Finally, Nanyang Technological University chose publication of data articles in an open access peer-reviewed data journal as the main focus of their pilot. “This is due to the potential impact on citations,” said Prof. Michael Khor, Director of Talent Recruitment & Career Support at NTU.
An increase in citations is one of the stronger RDM propositions, and one that resonates well with researchers across all universities.
2. Engaging with researchers regarding research data sharing is a challenge across all universities.
This is especially true for larger universities like Monash and NTU, which have many more researchers to reach out to and educate about best RDM practices.
“Monash has a small team of three professional staff dedicated to research data management and four metropolitan campuses with nearly 5,000 research staff and another 5,000 post-graduate students to reach,” Harrison said. “Knowing where our researchers are publishing their work is important to us, allowing us to target groups that are not publishing anywhere, as well as allowing us to better understand our researchers’ behaviors and drivers in this regard. … We hope that the pilot will provide a useful tool to reach researchers just at the time that they have some data to share.”
3. Researchers must be able to take ownership of their data and to know what will happen to it if they change institution or if their storage solution is discontinued.
If researchers are to embrace data sharing, they need to be able to benefit from it. Among other things, that means being able to understand and control where and how their data is stored as they shape their careers.
According to Prof. Khor: “Brain circulation is on the rise globally as competition for top research talents escalates. It is crucial that research data remains an entity that is secured amidst the changes in a researcher’s career. Having secured research data is essential for a researcher to assuredly build upon his research wherever he may be.”
4. Many researchers are set in their ways when it comes to managing their research data, and they find it difficult to navigate the numerous solutions available to meet their RDM needs. In addition, they aren’t keen on having more administrative tasks.
Dr. White of RPI explained:
While conducting the pilot, we learned that there is a broad spectrum of research data management (RDM) practices among our researchers. Some of the variation is associated with the multitude of data types that are the results of instrument-specific scientific output. Another factor contributing to RDM methods is a range of experimentation types and processes dependent upon discipline-specific practices. There was a consensus among researchers that practicing RDM added to an already lengthy and complex list of to-dos in research and that there has been little to no follow-up or accountability check from grant funders who are asking for data management plans (DMPs) to be included in grant proposal applications.
5. Senior researchers tend to be less responsive to requests to share data, so future project efforts will focus on earlier career-stage researchers. There was also a desire for fewer emails per researcher, so work will be done to optimize the number of communications.
This topic was addressed during the Data Monitor workshop in Amsterdam because, in all three pilots so far, mainly senior researchers have been targeted.
“We were surprised by the complete lack of response from the research staff contacted, and we are considering the best way to learn from this and reapply the tool in a new pilot with a different audience,” Harrison said. “Experienced middle-career researchers appear not to be a good category for this type of broad email campaign. We are looking to target the early career community that is perhaps hungry to showcase their work to a wider network of peers, who are also more comfortable with the whole concept of being openly online.”
NTU is investing heavily in its early-career researchers and supports the proposal to focus on them more in the upcoming pilot activities. As Prof. Khor explained:
We place great emphasis on attracting top young research talents to spearhead research fronts. In 2018, the President of NTU, Prof. Subra Suresh, launched the Presidential Post-Doctoral Fellowship (PPF) scheme. The aim is to attract top young post-doctoral fellows to embark on independent research at NTU with the aim of preparing them for a more senior academic position in the near term. Data forms a significant part of our strategy because the nature of research requires researchers to share data and to use data in more than one context. In NTU, we recognize that data is fast becoming an integral part of research integrity and research collaboration due to the increasing multidisciplinary nature of new challenges tackled by researchers.
6. Improving RDM practices at institutions involves not only providing a tool that can be integrated with any existing research data ecosystem, but also a program to improve data literacy and establish best RDM practices.
These RDM best practices include educating researchers on how to prepare and package their data, what metadata should be included, and how to deal with big and temporal data and proper versioning.
According to Harrison, phase one of the pilot not only provided Monash with a valuable opportunity to understand researcher choices, behaviors and drivers, but was also an important first step toward building that data literacy:
Monash researchers must comply with grant funder mandates for data management, sharing and reporting and we encourage them to use data management planning at the early stages of their research lifecycle. We are not yet required to report compliance with grant funders’ mandates, but we believe it is important to be ready to do so in the future.
For RPI, a key next step will be working out what help needs to be put in place for the institution’s researchers. Dr. White explained: “Our objective with the pilot was to gauge current research data management attitudes and strategies across the university so we can develop institute-wide recommendations to support them.”
How the pilot works*
- We check Scopus to see if there are new articles published by researchers included in the email campaign.
- Then we check Scholix to see if there are any datasets associated with each article.
- If no relevant datasets are found, researchers receive email recommendations about data sharing options.
- The system monitors progress and generates follow-up emails with further guidance.
- The metadata collection for all automatically found and new datasets is integrated with DataCite.
- All collected information is presented via the Data Monitor dashboards.
*For a visual, see the chart at the top of this story.
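The steps above can be sketched in code. The following is only an illustrative outline, not the actual Data Monitor implementation: the names `Article`, `MonitorState` and `run_cycle` are hypothetical, the Scopus and Scholix checks are replaced by plain in-memory inputs, and the DataCite metadata step is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    doi: str
    author_email: str


@dataclass
class MonitorState:
    # DOIs for which a first recommendation email has already been sent.
    contacted: set = field(default_factory=set)
    # (doi, status) records that a dashboard could later display.
    dashboard: list = field(default_factory=list)


def run_cycle(new_articles, dataset_links, state):
    """Run one monitoring cycle and return the emails to send.

    new_articles: articles by researchers in the campaign
                  (hypothetical stand-in for the Scopus check).
    dataset_links: dict mapping an article DOI to a list of dataset IDs
                   (hypothetical stand-in for the Scholix check).
    """
    outbox = []
    for article in new_articles:
        datasets = dataset_links.get(article.doi, [])
        if datasets:
            # A linked dataset exists, so no email is needed.
            state.dashboard.append((article.doi, "data_found"))
            continue
        # First contact gets a recommendation; later cycles get a follow-up.
        kind = "follow_up" if article.doi in state.contacted else "recommendation"
        state.contacted.add(article.doi)
        outbox.append((article.author_email, article.doi, kind))
        state.dashboard.append((article.doi, "no_data_" + kind))
    return outbox
```

Calling `run_cycle` repeatedly with the same `MonitorState` models the monitoring step: an article that still has no linked dataset moves from a first recommendation to a follow-up.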
The pilot offers three data-sharing options; the universities involved could choose the option(s) best suited to their RDM needs.
1. Data inventory request
Articles by researchers at the institution are automatically checked to see if the data associated with them has already been shared. If no data is found, researchers receive an email asking them whether data is available. If they answer yes, they are asked to complete a short questionnaire stating where it is stored. If they answer no, their response is shared with the institution’s librarians.
2. Data sharing and linking requests
Articles by researchers at the institution are automatically checked to see if the data associated with them has already been shared. If no data is found, researchers receive emails encouraging them to openly share their research data and link it to their research article to increase discoverability.
3. Publish a data article request
Researchers receive emails encouraging them to consider publishing information about their openly shared data (e.g., how it was acquired and reused) as a data article in an open access peer-reviewed data journal.
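The decision logic behind the three options can be summarized in a short sketch. This is an illustration under stated assumptions rather than the product's actual behavior: the enum, function names and returned labels are all hypothetical, and because the description of the data article request does not mention a prior data check, none is modelled for that option.

```python
from enum import Enum


class PilotOption(Enum):
    DATA_INVENTORY = "data_inventory"
    SHARE_AND_LINK = "share_and_link"
    DATA_ARTICLE = "data_article"


def initial_email(option, data_already_shared):
    """Return a label for the first email to send, or None if no email is needed."""
    if option is PilotOption.DATA_ARTICLE:
        # Assumption: sent without a prior data check, as none is described.
        return "consider_publishing_data_article"
    if data_already_shared:
        # Options 1 and 2 only contact researchers when no data is found.
        return None
    if option is PilotOption.DATA_INVENTORY:
        return "ask_whether_data_is_available"
    return "encourage_sharing_and_linking"


def handle_inventory_reply(data_available):
    """Return the next step after a researcher answers the inventory email."""
    if data_available:
        return "send_storage_questionnaire"
    return "share_response_with_librarians"
```

For example, `initial_email(PilotOption.DATA_INVENTORY, False)` yields the inventory question, and a "no" reply passed to `handle_inventory_reply(False)` routes the response to the librarians.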
The Data Monitor project is due to be completed in December. Results of the pilots have been used to inform the development of Mendeley Data Monitor, a new Mendeley Data module that aims to help research institutions track the location of research data generated by their researchers and educate their research personnel about best RDM practices. This tool can automatically inventory where datasets end up, contact researchers to ask them how their research data relates to published peer-reviewed articles, track various data points (descriptions, file formats, ownership) of these research datasets, and link an article’s bibliographic information in Scopus to relevant datasets to facilitate data discoverability, re-use and citations.
“Everything that we developed so far is thanks to our four partners, who shared great ideas and provided their valuable feedback,” said Dr. Elena Zudilova-Seinstra of the Mendeley Data product team. “We expect Mendeley Data Monitor to be released at the end of 2019 and hope that it will become a useful tool for university libraries and research offices.”