Investing in invisible infrastructure
The hidden projects and technologies Elsevier supports to improve services
By Alicia Wise, PhD, Holly J. Falk-Krzesinski, PhD, and David Tempest, MBA
Posted on 7 March 2016
An important role of a publisher is to ensure published content is available in perpetuity, across platforms, accessible to those who need it and easily identifiable. With millions of articles being published every year, this is a huge technical challenge.
To this end, there’s a hidden – but significant – part of this story: investing in infrastructure and open standards to ensure content can flow seamlessly around the web and across platforms. Like all kinds of infrastructure, it’s invisible if it’s working; you don’t think about pipes unless they leak, or roads unless you hit a pothole. At Elsevier, investing in this technical infrastructure and participating in open standards is a crucial part of being a publisher.
As a modern publisher, we don’t just print articles or slap them on the web as a PDF; we add value by making the articles more discoverable and usable so people can actually interact with the information instead of just reading it – and ultimately find answers to their questions. We need to maintain different versions of each article – PDF, HTML, XML – so they can be added to repositories and databases and made more discoverable and searchable. This means we need to ensure the metadata is accurate and detailed enough for people to use it effectively and for search engines to find it. We work with an array of standards bodies, including the National Information Standards Organization (NISO), a nonprofit association that develops technical standards for managing information like this. These standards help us support information retrieval, re-purposing, storage, metadata and preservation.
Collaborating within Elsevier
Building infrastructure takes expertise and ongoing efforts to understand the needs of users and other stakeholders. At Elsevier, our colleagues are working with the global research community on a wide array of initiatives. These employees come from teams across our organization, including:
- Access and Policy
- Cell Press
- Content Innovation
- Disruptive Technology
- Global Academic Relations
- Innovation & Product Development
- Research Applications & Platform
- Research Integrity
- Research Metrics
- STM Journals
- The Lancet
By collaborating in this way with other publishers and organizations around the world, Elsevier is contributing to building and maintaining this vital infrastructure so researchers and institutions can enjoy its benefits without even noticing it. This is one of the ways we reinvest in the community and incorporate the results into our solutions. We have recently mapped all the infrastructure and standards projects we’re involved in – here are some highlights.
Starting from scratch
Elsevier has been involved in some of these projects since the beginning as a founding partner. One is Crossref. The first task we worked on with Crossref was to develop the Digital Object Identifier (DOI) system: a way of assigning a unique code to every article published. It’s actually incredible when you think about it – a unique code that enables you to find any published article. This was one of the first systems that supported linking between scientific articles, not just from the same publisher but among publishers. Because of the DOI, we can find every article quickly and easily. It’s powerful and empowering for researchers, but it’s nearly invisible – especially how it works “behind the scenes.”
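To make the idea of "a unique code that enables you to find any published article" a little more concrete, here is a minimal Python sketch of how a DOI can be turned into a link. It relies on two things that are not spelled out in this article: a DOI consists of a prefix beginning with "10." and a suffix, separated by a slash, and the public resolver at https://doi.org/ redirects a DOI to wherever the article currently lives. The pattern below is a simplified check rather than the full DOI syntax, and the DOI in the usage note is made up for illustration.

```python
import re
from urllib.parse import quote

# Assumption: a simplified shape check (prefix "10.<registrant>", a slash,
# then a non-empty suffix). The real DOI syntax is more permissive than this.
DOI_PATTERN = re.compile(r"^(10\.[0-9]{4,9})/(\S+)$")


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix), or raise ValueError if it does not fit."""
    match = DOI_PATTERN.match(doi.strip())
    if match is None:
        raise ValueError(f"not a DOI: {doi!r}")
    return match.group(1), match.group(2)


def resolver_url(doi: str) -> str:
    """Build the resolver link for a DOI, percent-encoding unsafe characters."""
    prefix, suffix = split_doi(doi)
    return f"https://doi.org/{prefix}/{quote(suffix, safe='/')}"
```

For the hypothetical DOI "10.1234/example.2016.001", `resolver_url` returns "https://doi.org/10.1234/example.2016.001". The link keeps working even if the article later moves to a different platform, because only the resolver's record has to change.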
We have since worked on other projects through Crossref, including tackling plagiarism, something publishers are very concerned about. Several publishers worked together to launch CrossCheck, an anti-plagiarism tool. It enables editors to scan submitted documents to make sure they haven’t appeared elsewhere in the literature. This helps publishers make sure we’re protecting against unethical practices, and it also helps editors maintain the standards of the journals they work on.
Then there’s CrossMark, another output of Crossref, which shows readers whether they’re seeing the most recent version of an article or if there have been updates since it was posted. It acts as a verification tool for the research and ensures researchers are using the right information as input for their own work. Elsevier activated CrossMark across all journals in 2015.
Collaborating for individuality
Articles need to be easily identifiable, but so do people. There are thousands of names that are the same or similar, and it can be difficult for researchers to distinguish themselves from others. This can be problematic: researchers may identify the wrong potential collaborators or could miss an opportunity because someone approached their namesake. ORCID addresses this problem: it enables researchers to get a unique, persistent number (an ORCID iD) they can use to distinguish themselves in the world of research and connect their various research activities.
We worked with a group of publishers and other stakeholders to develop ORCID; because it’s a cross-publisher initiative, researchers can use the same ORCID iD everywhere. Our partnership produced practical results from the start: on the day ORCID was launched, Elsevier launched the Scopus to ORCID Wizard, enabling researchers to set up their ORCID iD if they didn’t already have one, match it to their Scopus record and search for data using the iD. ORCID is a major breakthrough; it solves many issues that have troubled researchers for years.
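One detail that helps an ORCID iD work reliably everywhere is that its last character is a check character, so a mistyped iD can usually be caught before it is linked to the wrong person. The Python sketch below shows that check, assuming the ISO 7064 MOD 11-2 scheme described in ORCID's own documentation (it is not covered in this article). The function names are ours, chosen for illustration.

```python
import re

# An ORCID iD is 16 characters once hyphens are removed: 15 digits followed
# by a check character that is a digit or "X" (which stands for the value 10).
COMPACT_ORCID = re.compile(r"^[0-9]{15}[0-9X]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid_id: str) -> bool:
    """Return True if the iD has the right shape and a matching check character."""
    compact = orcid_id.replace("-", "").upper()
    if COMPACT_ORCID.match(compact) is None:
        return False
    return orcid_check_character(compact[:15]) == compact[15]
```

With the sample iD used in ORCID's documentation, `is_valid_orcid("0000-0002-1825-0097")` returns True, while changing the final 7 to an 8 makes it return False. The check only confirms that an iD is well formed; it does not confirm that the iD has been registered to anyone.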
Unique codes aren’t only helpful for articles and researchers – they can be useful for organizations, too. With the growth of open access, funding bodies have become even more important stakeholders. Most major funding bodies have open access policies and mandates in place, and many pay for the research they fund to be published open access. Researchers have to identify the source of their funding when they publish, so identifying the right funding body is important.
Elsevier also spearheaded the creation of the Crossref Open Funder Registry, a taxonomy of international funder names, IDs, abbreviations, alternate names and countries. The funder IDs can be used to track how many articles are being published as a result of research funded by particular organizations, for example, or to track publications resulting from publicly funded research. Elsevier created the original funder list for one of its commercial products and then donated it to Crossref so they could share it with the community via a CC0 license. Elsevier continues to curate the list for the initiative.
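As a rough illustration of the kind of tracking a shared funder ID makes possible, the Python sketch below counts articles per funder. Everything in it is hypothetical: the record layout, the field name `funder_ids` and the sample values are invented for this example, and the only real detail is that Funder Registry IDs are themselves DOIs under the 10.13039 prefix.

```python
from collections import Counter

# Hypothetical article metadata. The DOIs and the funder ID suffixes
# ("EXAMPLE-A", "EXAMPLE-B") are invented for illustration only.
articles = [
    {"doi": "10.1234/example.001", "funder_ids": ["10.13039/EXAMPLE-A"]},
    {"doi": "10.1234/example.002", "funder_ids": ["10.13039/EXAMPLE-A", "10.13039/EXAMPLE-B"]},
    {"doi": "10.1234/example.003", "funder_ids": []},
    {"doi": "10.1234/example.004", "funder_ids": ["10.13039/EXAMPLE-A", "10.13039/EXAMPLE-A"]},
]


def count_articles_by_funder(records):
    """Count how many articles acknowledge each funder ID.

    A funder listed twice on the same article is counted once for that article.
    """
    counts = Counter()
    for record in records:
        for funder_id in set(record.get("funder_ids", [])):
            counts[funder_id] += 1
    return counts


funder_counts = count_articles_by_funder(articles)
```

Because every record refers to a funder by the same ID rather than by a free-text name, variants such as abbreviations or alternate spellings do not split the count.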
Commitment to open science
Open access is having other impacts too, a major one being that it’s opening up research on the Internet, helping to make content more accessible and discoverable. With OA publishing, the author normally retains the copyright of the article; when that’s the case, they need a way of telling the people who use their article exactly what they can do with it. Instead of reinventing the wheel, Elsevier has adopted a licensing framework to give people options: Creative Commons licensing. The system has been used for music, images and publishing for years, so many people are already familiar with the way it works.
Creative Commons licensing gives authors a choice so they can decide how much freedom they want to give people to use their work. As a publisher, we are responsible for making sure the options are clear, so we put together a summary of what each license means for the author’s work.
Open access is just one aspect of the open science movement: open data, open metrics and open standards all contribute to making science more accessible. This – and indeed all the work we do with industry bodies to develop open standards and infrastructure – goes hand in hand with transparency.
We need to be transparent, especially in the case of metrics being used for ranking and determining impact. Researchers, the majority of whom work at academic institutions, are subject to performance and tenure processes that rely on metrics and impact measurements. There has to be a transparency that mirrors that of the research itself in order for us to support these processes.
As we move forward, technology will play an increasingly central role in what we do as a publisher. Some of the newer initiatives Elsevier is involved in are pushing the technological boundary to help take research to the next phase.
One such example is text and data mining. It’s impossible for a researcher to find and read every single relevant piece of published scholarly content to inform their work; new content is being generated too fast for that to be a feasible approach to literature research today. Elsevier has published an introductory video about text and data mining.
Machines can now do in a few hours what would take a person years. As a publisher, it’s important for Elsevier to support researchers who want to mine huge volumes of data for the information they’re looking for, saving them years of research time. But we have to do it in a way that ensures readers on different sites don’t notice any disruption. That’s where the invisible technology comes in. At Elsevier, we have a “self-service portal” that lets researchers log in and mine all our published content, following a license-based approach. In addition, we are part of the Crossref Text and Data Mining Service, a multi-publisher partnership that makes content across publishers available for text and data mining (TDM).
One of the advantages Elsevier has is its size and expertise. We don’t just publish journals, we provide solutions that span the breadth of the research process, from the generation of ideas to the measurement of citations and impact. This puts us in a unique position to share our expertise in these industry partnerships and also trial the proposed standards and infrastructure.
One example is Project CRediT. Led by the Consortia Advancing Standards in Research Administration Information (CASRAI), CRediT is a standard taxonomy for recognizing researchers’ contributions to research outputs, such as published articles. It helps make researchers’ roles clearer, giving more transparent input into the impact measurement of their work.
As well as working with CASRAI to develop the taxonomy, Elsevier is piloting CRediT in several Cell Press journals and inviting authors to provide feedback. When an author submits a manuscript, they can choose whether to use the new schema in addition to the traditional authorship format. We ask authors who chose to use CRediT why they did so, what the advantages were and whether they would suggest any changes. We ask authors who chose not to use it why. This sort of direct feedback from researchers helps us develop the most useful new infrastructure and standards and make decisions that benefit researchers without creating unnecessary burden.
There will undoubtedly be many more projects in the future, and if they’re successful, you’re unlikely to even notice them. Invisible infrastructure is a big investment, but it means we can make life easier for authors and readers, helping them find what – and who – they need and saving them time that’s better spent on their research.
Elsevier Connect Contributors
As Director of Access and Policy for Elsevier, Dr. Alicia Wise (@wisealic) is responsible for delivering Elsevier's vision for universal access to high-quality scientific publications. She leads strategy and policy in areas such as open access, philanthropic access programs, content accessibility, and access technologies. Based in Oxford, she has a PhD in anthropology from the University of North Carolina at Chapel Hill.
Dr. Holly J. Falk-Krzesinski (@hfalk14) is VP of Global Academic Relations for Elsevier, where she focuses on how insight from data and analytics guides strategic planning for the research enterprise. Her engagement activities emphasize building new relationships and strategic alliances around important issues for research and research training, such as those related to research analytics and strategic planning; economic development; early career researcher development; scholarly communication and open access/open data; research and faculty information management; expertise discovery and collaboration; and research metrics and impact.
Through her leadership with the Annual International Science of Team Science Conference, Dr. Falk-Krzesinski has been instrumental in developing a strong community of practice for team science and interdisciplinary research. She is also involved in broadly promoting women in STEM and gender in research.
David Tempest is Director of Access Relations at Elsevier in the UK. His role involves working with funding organisations and academic institutes on the development of relationships surrounding access initiatives. David also works on the strategy and implementation of Elsevier’s open access programme. He has worked at Elsevier for over 20 years, having previously been in editorial, marketing and market research positions.
Tempest is a frequent presenter at events around the world. He speaks mainly about the development of open access initiatives and technologies, as well as publishing matters in general. He has a BSc in pharmacology from the University of Sunderland and an MBA with distinction from Oxford Brookes University.