VOICE: erasing silos to multiply discovery
June 2, 2026 | 5 min read
By Ann-Marie Roche
By creating a shared data infrastructure, Elsevier’s VOICE project seeks to better capture the complex relationships that define modern scientific knowledge.
George Georghiou doesn’t believe in silos – or filters.
When he was still an Elsevier customer working at Novartis’s corporate library, he took pride in being brutally honest about any shortcomings from Elsevier or other vendors, such as data that wasn’t as clean and interoperable as the sales team claimed.
So, it was a surprise when he was hired two years ago as Senior Knowledge Strategy Manager for Data Science in Elsevier's Life Science team. “I came in with a whole list of complaints. And 90% of them were well-received. So that made me happy,” George laughs.
Meanwhile, his job has been about breaking down silos as the lead for a project called ‘Vision for Ontological Interoperability & Content Enhancement’ – or VOICE.
George Georghiou, Senior Knowledge Strategy Manager for Data Science in Elsevier's Life Science team
From silos to networks
VOICE is a major internal effort to streamline taxonomy creation across Elsevier’s products, ensuring all databases use consistent ontologies and can communicate effectively with each other, with the ultimate goal of integrating with public-domain databases.
For George, it’s one step closer to building his dream: a FAIR and integrated world of data. “This has actually been a dream of mine for a long time,” says George. “The more you can connect, the more chance you have to find the right connected facts that researchers can turn into real insights that push their work forward.”
This interview has been edited for length and clarity.
Let’s start with your background. How did you transition from biochemistry to knowledge management?
I originally started as a lab scientist, completed my PhD, and then began my postdoc. After being in denial, I realized that the lab wasn’t for me and that I needed a change. I wasn’t sure what it was yet, but I knew I had to do something different. So, I looked for the first job I could find. The first to email me back was UniProt, a protein database well-known among biochemists that I had used throughout my PhD and postdoc. They were looking for a data curator, and since I had done some data curation during my PhD, I figured I’d fake it until I made it [laughter].
That job introduced me to the Gene Ontology annotation project, which fascinated me. You’re essentially trying to explain to a computer how biology works by creating hierarchies – starting from broad terms like “biological process” and narrowing down to specific processes like oxidation, then defining what oxidation means and its parts. You have to communicate this in ways that both humans and computers can understand, enabling machine learning and AI to use knowledge graphs and make inferences about how things may be connected.
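To make the hierarchy idea concrete, here is a minimal Python sketch. It is an editorial illustration, not code from the interview or from the Gene Ontology project: the three terms and the single-parent "is-a" links are simplified assumptions (real ontologies allow several parents per term and carry far richer relations), and the helper names `ancestors` and `is_kind_of` are hypothetical.

```python
# Toy "is-a" hierarchy. The terms and links are illustrative only and are
# not taken from the real Gene Ontology.
IS_A = {
    "oxidation": "metabolic process",
    "metabolic process": "biological process",
    "biological process": None,  # root term: nothing broader above it
}


def ancestors(term, is_a=IS_A):
    """Return every broader term reachable from `term`, nearest first."""
    chain = []
    parent = is_a.get(term)
    while parent is not None:
        chain.append(parent)
        parent = is_a.get(parent)
    return chain


def is_kind_of(term, broader, is_a=IS_A):
    """Infer whether `term` falls under `broader`, even if never stated directly."""
    return broader in ancestors(term, is_a)
```

Nothing in the table says directly that oxidation is a biological process, yet `is_kind_of("oxidation", "biological process")` can derive it by walking the links. That is the simplest form of the inference George describes.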
Later, at Novartis, I introduced the idea of combining text mining, data analytics and data science to uncover content using ontologies. I could show people: “Here's the information you've been looking for... Let me demonstrate how I can integrate different elements... For instance, this is what your competitors have been doing...”
Can you give an example of a real-world use case?
My favorite was a proof-of-concept dashboard that processed 10 years of biomedical literature abstracts from PubMed. It extracted authors, therapeutic targets, diseases, drugs and associated companies, and tracked how companies’ publication patterns in specific therapeutic areas correlated with actual market performance and sales data. It was fascinating to see everything line up. It was probably too ambitious for its time, but it was still a fun experiment to show, “Hey, people, by the way, have you seen the potential here?”
How did you end up at Elsevier, especially given your previous relationship as a customer at Novartis?
As I mentioned, my relationship with Elsevier wasn’t very positive at first [laughter]. The real problem was that I was dealing with marketing and salespeople. And while these people were perfectly nice, when it comes to data, you want to talk to those who generate it since they speak the same language.
Anyway, when I announced I was leaving Novartis, I reached out to my network. I knew people at SciBite – many of us had worked together at UniProt, and I even played bass in the company band. It’s a very incestuous world. People could leave UniProt and literally walk across the street and start working at SciBite. I flew to Cambridge, called them, went for a drink, and told them I needed a job. They connected me with their parent company, Elsevier, and surprisingly, they were happy to bring me on board and hear what I had to say.
And based on that experience, I now discreetly reach out to my network and ask, “Hey, we're working on this. Do you have any complaints you don’t feel comfortable sharing with marketing or sales but want to bring directly to the operations or product teams?” Such feedback has truly helped improve my taxonomy work for VOICE.
The power of networks! Now, can you explain the VOICE project and why it was needed?
The main issue is that taxonomies are how we capture and retrieve data, but they are mostly manually curated and maintained by subject matter experts.
This manual process leads to redundant efforts across multiple taxonomy teams at Elsevier. Like many large organizations, we faced common challenges: breaking down data silos, supporting taxonomy growth and improving data access. We had editors adding the same concept across six different taxonomies … “Why are we doing this?”
By consolidating all of Elsevier's taxonomies into a single management platform, we demonstrate that we practice what we preach: we do eat our own dog food. And SciBite's CENtree Ontology Manager was an ideal solution, an enterprise platform designed specifically for life sciences with flexible support for various content types, making it easy to combine sources. CENtree helps us eliminate data silos by mapping product taxonomies, and it enables us to compare how public taxonomies versus our internal ones model disease branches or how concepts may relate to each other.
The real power lies in application ontologies – when you build from a parent taxonomy, any updates to the parent with new synonyms or hierarchy changes automatically carry over to the application. This greatly simplifies taxonomy management.
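One way to picture why parent updates carry over is an application taxonomy that stores only references to parent concepts rather than copies of them. The Python sketch below is an editorial illustration under that assumption. It does not describe how CENtree is implemented, and the concept IDs, labels and the `ApplicationTaxonomy` class are all hypothetical.

```python
# Hypothetical parent taxonomy: concept ID -> label and synonyms.
parent = {
    "EX:001": {"label": "neoplasm", "synonyms": {"tumor"}},
    "EX:002": {"label": "hypertension", "synonyms": set()},
    "EX:003": {"label": "oxidation", "synonyms": set()},
}


class ApplicationTaxonomy:
    """A product-specific subset that keeps concept IDs, not copies of concepts."""

    def __init__(self, parent, concept_ids):
        self.parent = parent
        self.concept_ids = set(concept_ids)

    def synonyms(self, concept_id):
        """Look up synonyms in the parent at read time, so edits there are visible here."""
        if concept_id not in self.concept_ids:
            raise KeyError(concept_id)
        return set(self.parent[concept_id]["synonyms"])


# The application taxonomy selects two of the three parent concepts.
app = ApplicationTaxonomy(parent, ["EX:001", "EX:002"])

# An editor adds a synonym to the parent only; the application is not touched.
parent["EX:001"]["synonyms"].add("tumour")
```

After the edit to the parent, `app.synonyms("EX:001")` includes the new synonym without anyone updating the application taxonomy. If each product kept its own copy instead, the same change would have to be repeated in every copy.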
Of course, another obvious reason for moving to CENtree was recognizing that it was wasteful to license ontology management systems when we own a company that builds them.
What have been the main challenges?
This is fundamentally a change management project. We’re transitioning Elsevier teams that have been embedded in their taxonomy workflows for as long as 20 years to a completely new system. Some still complain it doesn’t function exactly as they want, but we’re offering workarounds, and more improvements are on the way.
We’re also merging our entire taxonomy ecosystem into a single comprehensive taxonomy that we can then cut up as needed for different products. This approach has really resonated with the teams.
Overall, the team has been quite receptive. They're recognizing opportunities and showing creativity in finding solutions. The engineering team and taxonomy editors communicate daily, and newer team members in India and Greece are introducing automation ideas and improvements.
So, in a way, the project is also helping to break down not only data silos, but also departmental and regional ones. What's next for VOICE and taxonomy integration?
The next major step is mapping our concepts to public standards. We already map to MeSH from the National Institutes of Health, but we aim to connect to other standards used by different companies and hospitals. This enables customers to take content retrieved through our taxonomy and translate it into their preferred language.
From a technical perspective, this means that if someone incorporates our content in their database, they don't need to modify or realign it – a mapping is already in place. This is especially important for RAG systems where you’re combining multiple data sources labeled with the same concept.
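As a rough illustration of what "a mapping is already in place" can mean in practice, the Python sketch below groups records from two sources under one shared concept ID by means of a crosswalk table. It is an editorial sketch, not Elsevier's pipeline: the IDs (`INT:…`, `PUB:…`), the record layout and the function `group_by_public_concept` are hypothetical and do not correspond to real MeSH or Elsevier identifiers.

```python
from collections import defaultdict

# Hypothetical crosswalk: internal concept ID -> ID in a public standard.
CROSSWALK = {"INT:42": "PUB:A1", "INT:77": "PUB:B2"}

# Records tagged with internal IDs (one source) and public IDs (another source).
internal_docs = [
    {"doc": "internal-1", "concept": "INT:42"},
    {"doc": "internal-2", "concept": "INT:77"},
]
external_docs = [
    {"doc": "external-1", "concept": "PUB:A1"},
]


def group_by_public_concept(internal, external, crosswalk):
    """Group documents from both sources under the public concept ID."""
    grouped = defaultdict(list)
    for record in internal:
        public_id = crosswalk.get(record["concept"])
        if public_id is None:
            continue  # unmapped concept: skipped in this sketch
        grouped[public_id].append(record["doc"])
    for record in external:
        grouped[record["concept"]].append(record["doc"])
    return dict(grouped)


merged = group_by_public_concept(internal_docs, external_docs, CROSSWALK)
```

Here `merged["PUB:A1"]` holds documents from both sources, so a retrieval step can treat them as being about the same concept without relabelling either dataset.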
Meanwhile, we’ve already started internally by mapping between Embase, PharmaPendium and other Elsevier products that don’t speak the same language. Next, we’ll expand to external standards, working with the SciBite team to create more automated mapping processes.
If you had everyone’s attention, what advice would you give them?
Sometimes we need to return to basics. I compare this to building a house – a house built on a weak foundation will collapse quickly, but if you lay the foundations properly and understand the problem you’re addressing, you can make the right choices.
And, if I really did have everyone’s attention, I would also say, “Please, just stop building silos. I’m tired of cleaning up the mess.” [Laughter]
What's your vision for the future of knowledge management at Elsevier?
Ultimately, VOICE should result in comprehensive, up-to-date taxonomies with the quality assured by Elsevier’s subject matter experts. These taxonomies should be produced so efficiently that we can respond quickly to new use cases or customer requests. We want to create a unified vision where our products integrate seamlessly rather than exist as isolated silos.
Our goal is to make our taxonomies more FAIR – Findable, Accessible, Interoperable, and Reusable – for customers, allowing them to connect our concepts with public standards. This modular approach provides flexibility while preserving the expert curation that adds real value.
We’re not just managing data; we’re enabling discovery and generating insights to accelerate scientific research and drug development. Our main goal is to create tools that help researchers find what they need quickly and accurately. Instead of struggling with information retrieval, they can focus on producing insights that will address the world’s problems.
And that’s why I now truly see the value of what we do: we are freeing information that was once locked up and getting it to the people who can use it.
Watch George’s webinar ‘The VOICE Project: The vision for modular taxonomy production at Elsevier’.
