Research data has been called a “first-class citizen” of research output, and some say that it’s as important as literature publications. It could be key to addressing reproducibility issues in science and could help researchers expand the insights they glean from each other’s work. In short, research data has a major role to play in accelerating research, and research data management (RDM) can help ensure that the power of the data behind research articles is unlocked.
However, RDM is still finding its way as part of the research workflow, and its full benefits are yet to be realized. Below, members of Mendeley Data’s Research Data Management Advisory Board share their thoughts on what the future holds for RDM.
1. RDM made easy
Researchers are already under immense time pressure, and adding steps to the research process won’t do anything to alleviate that. For that reason, David Groenewegen, Director of Research at Monash University, argues that the future of RDM needs to be frictionless:
I’d like to see software that’s smart enough to understand the subtleties of where data is stored and to connect with other software and processes throughout the researcher lifecycle. This would really help to overcome the messiness caused by having information all over the place.
Bespoke software might appear to be the best solution, but often this won’t work fantastically well, as integrating new processes into existing workflows isn’t easy. RDM isn’t as simple as storing data in a repository. I’m seeing growing recognition of the need to curate data and package it up for later use, so that others can get a decent answer out of it. Most of the tools currently available don’t support this very well.
2. RDM needs a combined approach
Amy Neeser, Consulting and Outreach Lead at the University of California, Berkeley, focused on the idea that RDM is too big to be the sole responsibility of any one department: “I don’t think RDM can or should be ‘owned’ by one unit or department, such as the library,” she said. “It’s too big an area to be managed alone, and different players bring different expertise and experience. It calls for a combined effort.”
By way of example, Amy pointed to sensitive data, which can be an obstacle to researchers sharing data. “A lot of the questions that I get are in the active phase of the research lifecycle and often include sensitive data,” she explained. “IT can help with these issues, but also needs the library’s expertise around the beginning (planning, finding) and end (publishing, sharing, preserving) of the research lifecycle to provide researchers with a holistic approach to their scholarship. More researchers from across domains use data and computational resources, and I think IT must be closely aligned with the library and other important players on campus such as the office of research.”
Managing sensitive data is key to progress, she added:
In terms of practicality, I would love to see research data management really focus on sensitive data needs. Currently, this is managed at an institutional level, but it would make a huge impact if there was a nationwide or product-based solution that could address this.
3. RDM solutions need to be modular
As David said, RDM needs to be frictionless, and part of that means allowing researchers to pick and choose which elements of the workflow tools they use:
For me, the modular aspect of Mendeley Data is the most exciting part. You’re not locked into one solution; instead, you’re able to plug different Mendeley Data modules into your own workflows. It’s the way universities like ours want to work.
Amy agreed, saying “I like how the different modules and features available can easily interact with each other. And it’s practical, supporting the data management process.”
4. Metadata for better (and less) data
One of the key elements of successful data management is discoverability, and for that to be possible, high-quality metadata is a necessity. Rebecca Koskela, Executive Director of DataONE at the University of New Mexico, said:
I still see problems today around data discovery and the need for adequate documentation to reuse data. In 2010, we carried out a survey at DataONE which found that researchers had a limited understanding of metadata standards. Unfortunately, even with the emphasis on FAIR data, we still have a long way to go to highlight the significance of metadata. I do like the fact that you can manage different metadata standards with Mendeley Data, and I hope that in the future people will pay attention to the need for quality metadata.
David also highlighted the importance of metadata as a tool for deciding which data to keep when it all gets too much:
Long-term curation and management of shared data is a key area I’d like to see develop. What was considered a lot of data 10 years ago isn’t now, but it’s not feasible to continue buying more storage so that we can keep everything just in case. Improving metadata goes a long way towards addressing this, as it enables you to make quick decisions later on, but I’d like to see new processes developed that help us to identify when we no longer need to hold certain data.
5. We’ll need to manage information inequality
Prof. Deb Verhoeven, Canada 150 Research Chair in Gender and Cultural Informatics at the University of Alberta, highlighted that not all trends will be technical, and that RDM may give rise to information inequality:
Information inequality is a big trend we need to consider as part of research data management. Information inequality will occur in two phases. One is around concentration of production: there will be some people producing a lot of information and a lot of people not producing very much, and that’s a reflection of social inequalities. The other is the fragmentation of consumption.
That clash between production, consumption and inequality is going to be very profound and something that RDM will need to address. How it does that will depend on other environments, like regulatory environments, and those regulatory environments, particularly in relation to governments, will change. When they do, we will have two problems: one is around accountability and one is around accounting. Who’s accountable, and who will work out what the impact of that will be?