How can you optimize research data management (RDM) at your institution?
28 September 2023
By Linda Willems
In a recent series of webinars, librarians and data specialists shared best practice tips for building a robust RDM program
There is a growing sense of urgency around research data management (RDM). In recent years, funders, governments and publishers have introduced policies designed to promote effective management and open sharing of research data. At the same time, we’ve seen a rapid rise in the volume of scholarly publications with research data attached to them.
But if your institution is at the start of its RDM journey, what is the best path to take? And what are the potential pitfalls? In a recent Elsevier webinar series on RDM, librarians and data experts shared some of their learnings and key dos and don’ts. Topics they covered included:
Project managing the introduction of an institutional data repository
Compiling compelling RDM policy guidelines
The importance of tracking, preserving and showcasing research data
The difference between open and FAIR data
Finding (and capturing) research data hosted outside your institution
Engaging researchers in good RDM practice
In this article, we take a closer look at just two of the points they touched upon.
1. Good metadata is the cornerstone of effective RDM
Comprehensive schemas can improve research linking
Metadata is a set of data that describes other data and enriches it with information that makes it easier to find, use and manage. For example, in the case of a research data record, the metadata typically includes fields such as the creator’s name, the year of publication and the publisher.
That’s why the non-profit DataCite, which works with more than 2,700 repositories worldwide to give research outputs DOIs (digital object identifiers), asks users to complete a comprehensive metadata schema. As Paul Vierkant, the organization’s Outreach Manager, explains: “Although it might seem quite a dry topic, metadata is really important. Researchers and institutions need to provide as much contextual information as possible so that people who find that data output in the future aren’t left with any questions.”
According to Paul, when used in conjunction with persistent identifiers like DOIs, metadata builds and exposes connections. “They enable you to see all the links that are important to create this contextual information so that you can find out more about the publication and the context of the publication and dataset, but also about the people and institutions behind them.”
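The kind of contextual record Paul describes can be pictured as a small set of linked fields. Below is a minimal sketch, loosely modeled on the sort of properties the DataCite metadata schema captures; the field names and values are simplified for illustration and are not the exact schema:

```python
# An illustrative metadata record for a research dataset, loosely modeled
# on the kinds of fields DataCite asks for. Names and values are examples.
dataset_metadata = {
    "identifier": "10.1234/example-doi",  # persistent identifier (DOI)
    "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    "title": "Survey responses on open data practices",
    "publisher": "Example University Repository",
    "publication_year": 2023,
    "resource_type": "Dataset",
    # Links that supply the context Paul describes: related articles,
    # people and institutions behind the dataset.
    "related_identifiers": [
        {"relation": "IsSupplementTo", "identifier": "10.1234/related-article"}
    ],
}

def find_links(record):
    """Return the identifiers this record connects to, starting with its own DOI."""
    links = [record["identifier"]]
    links += [r["identifier"] for r in record.get("related_identifiers", [])]
    return links
```

A record like this is what lets a future reader follow the chain from dataset to article to creator without being "left with any questions."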
Metadata increases the visibility of researchers and their institutions
Paul believes that to improve the quality of metadata captured, institutions need to highlight the benefits of good metadata practice to researchers. “We should showcase more what is in it for them. Put a focus on the fact that good metadata is linked to academic search engine optimization. Tell them that there is higher visibility for their research, possibly resulting in higher impact.”
And he stresses that good metadata is crucial for institutions too, as it raises the visibility of their research outputs at a time when many national evaluation programs are asking universities to show that their research data is open and/or FAIR (Findable, Accessible, Interoperable and Reusable).
Getting the right tools and training in place is critical
Hester Kamstra and Wiebe van der Meer are CRIS (current research information system) metadata specialists at the University of Groningen Library in the Netherlands. With less than 10 percent of the datasets their researchers generate deposited in Groningen’s own institutional data repository, they rely on Elsevier’s Data Monitor to help them locate the remaining 90+ percent. Data Monitor harvests research data from 2,000+ general and domain-specific repositories and then cleans and enriches the metadata by adding publication, author and institution links.
Hester says: “Alongside publications, journal articles, book reviews, etc., we view datasets as a very important product of our researchers that deserves as much, if not more, visibility.”
Before they started working with Data Monitor, they considered making an API connection between the general repository Dataverse NL and Groningen’s CRIS (they use Elsevier’s Pure). She explains: “Dataverse is the main external repository we advise our researchers to use, largely because it has good metadata quality.”
But Hester and Wiebe knew that some researchers (and indeed funders) favor domain-specific repositories “because that’s where their colleagues hang out and share data. So, just making an API for Dataverse wasn’t a feasible option.” However, not every external repository takes the same approach to metadata, and the ability of the Groningen Library and Data Monitor to find external datasets is shaped by the quality of the metadata captured. To address this, Hester and Wiebe are now working with the university’s data stewards – who are recruited from within the various faculties – on an education program for researchers. Hester explains: “Completeness of data and data quality vary a lot between repositories so the quality of input by our researchers and data stewards is important there. If they input high quality data into a ‘messier’ repository, we can still find their datasets.”
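Because completeness varies so much between repositories, a simple completeness check is one way to picture what a data steward or harvesting tool might look for. The sketch below is hypothetical: the required-field list is an illustrative assumption, not Data Monitor’s actual criteria:

```python
# Hypothetical completeness check for harvested dataset metadata.
# REQUIRED_FIELDS is an illustrative choice, not Data Monitor's real logic.
REQUIRED_FIELDS = ["identifier", "creators", "title", "publisher", "publication_year"]

def completeness(record: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    present = sum(1 for field in REQUIRED_FIELDS if record.get(field))
    return present / len(REQUIRED_FIELDS)

# Example record with one missing field, as might come from a 'messier' repository.
record = {
    "identifier": "10.1234/example-doi",
    "creators": [{"name": "Doe, Jane"}],
    "title": "Survey responses on open data practices",
    "publisher": "",  # empty: lowers the completeness score
    "publication_year": 2023,
}
score = completeness(record)  # 4 of 5 required fields filled
```

Even a crude score like this shows why researcher and data-steward input matters: a record with a DOI and creator names can still be found and linked, while one missing those fields may never surface at all.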
2. RDM is a true community effort
Identify partners and define terms
For Nynke de Groot, cross-department consultation was the natural first step when her university decided to introduce a new institutional data repository. Nynke, who is the Research Data Management Specialist at Erasmus University Rotterdam in the Netherlands, explains: “You need a multidisciplinary team in place from the outset – if your building blocks aren’t straight, you can’t build a tower.” She recommends connecting relevant departments or disciplines, then “creating a shared vision and strategy that combines your organizational and user needs. With vision, you want to think about how you want to profile the university and your researchers – or how they can profile themselves. And you should be thinking about these things before determining what kind of tools or software you need.”
But working with other departments is not without its challenges, as Nynke shares: “You have to be very clear about the scope and set the definition for the terms used. That last point might seem straightforward, but we had a really interesting discussion with our IT department about what a back-up is – from a data archive perspective, we want to preserve for all time a copy of the original files put in our archive, and perhaps different versions when alterations were made. It doesn’t matter whether that original version was put in yesterday or 20 years ago. For the IT department, a back-up means everything is backed up from five days ago, so bye-bye, first or different versions!”
For Nynke, involving other teams also helps to get their buy-in. “Everybody needs to feel responsible for the outcome. And you can make prominent users advocates for innovation. We have open science champions who advocate among their peers and we see that they are very good at reaching their colleagues.”
Look to other institutions for inspiration and support
Nynke also urges institutions to consider contacting other universities: “They have the same struggles so don’t feel you are the odd one out – together you know more than alone.”
It’s a view echoed by Bill Ayres, Strategic Lead for Research Data Management at the University of Manchester Library in the UK. “When we implemented our research data repository and looked at the options for integration with our Pure CRIS, I don’t think we were aware that there were already institutions working in these areas that could help us,” he recalls. “The Library has these amazing network and collegiate connections across so many areas and that’s something we really try to lean into now.” He adds: “Looking back, I would probably have asked more questions and gone to more of these types of webinars. Luckily, we did attend an Elsevier product update which talked about Data Monitor and that gave us the connection with Elsevier to understand how Data Monitor could help us.”
For DataCite’s Paul, it’s not only departments and institutions that need to work together, it’s the entire research community. “To make research outputs and resources discoverable and usable in the future it will take all of us: organizations like DataCite, publishers, institutions and funders. We need to work as a community to make the care of research data the norm.”
It’s not too late to catch Elsevier’s RDM webinar series
The three sessions are now available on demand.
The webinar content also forms the basis of a factsheet for librarians, now available to download.