Filling in the blanks of knowledge graphs

25 May 2021

By Alison Bert, DMA

Researchers are finding innovative ways to make knowledge graphs more effective, in a collaboration between the Discovery Lab and the UCL Centre for AI

Caption: Knowledge graphs can be used to find information on any topic. These knowledge graph queries seek information on movies and geographies. They were created by Erik Arakelyan, Daniel Daza, Pasquale Minervini and Michael Cochez (pictured above from top to bottom) for their paper “Complex Query Answering with Neural Link Predictors,” ICLR 2021.

If you’re a researcher, you may use knowledge graphs to navigate the massive amount of research available in your field and beyond. These models represent relationships between objects in a way machines can process, yielding far more relevant information than you could find by searching the literature on your own with keywords.

But what happens if information is missing from the knowledge graph? Can you be as certain of the results you retrieve?

Because missing information is common, two members of the Discovery Lab in Amsterdam collaborated with two researchers at the UCL Centre for Artificial Intelligence to develop a framework that makes knowledge graphs more effective when information is missing. Their paper, “Complex Query Answering with Neural Link Predictors,” received an ICLR 2021 Outstanding Paper Award. They presented it on May 6 at ICLR 2021, the Ninth International Conference on Learning Representations.
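To make the missing-information problem concrete, here is a minimal sketch in Python. It is not taken from the paper: the triples, the relation names and the helper functions `objects` and `follow_path` are all assumptions written for illustration. It stores a tiny knowledge graph as (subject, relation, object) triples, answers a multi-hop question about a movie, and then shows that removing a single fact leaves the same question unanswered.

```python
# Toy knowledge graph stored as (subject, relation, object) triples.
# The triples and relation names are hand-written for illustration only.
TRIPLES = {
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Interstellar", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "born_in", "London"),
    ("London", "located_in", "United Kingdom"),
}


def objects(graph, subject, relation):
    """Return every object linked to `subject` by `relation`."""
    return {o for (s, r, o) in graph if s == subject and r == relation}


def follow_path(graph, start, relations):
    """Answer a multi-hop query by following `relations` in order from `start`."""
    frontier = {start}
    for relation in relations:
        frontier = {o for node in frontier for o in objects(graph, node, relation)}
    return frontier


# "In which country was the director of Inception born?"
QUERY = ["directed_by", "born_in", "located_in"]
complete_answer = follow_path(TRIPLES, "Inception", QUERY)

# Drop one fact to simulate an incomplete graph and ask the same question.
incomplete = TRIPLES - {("Christopher Nolan", "born_in", "London")}
incomplete_answer = follow_path(incomplete, "Inception", QUERY)

print(complete_answer)    # {'United Kingdom'}
print(incomplete_answer)  # set()
```

A lookup like this can only return what is explicitly stored, which is why one missing link is enough to lose the answer.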

While other studies have addressed the problem of gaps in knowledge graphs, this team has proposed a model designed to be more efficient and more useful to researchers. As co-author Daniel Daza, a PhD student studying machine learning at Vrije Universiteit (VU) Amsterdam and the University of Amsterdam, explained:

Previous methods require a large amount of training data and computing resources. With our method, we managed to reduce that by about 60 times. We are using 60 times less training data, and the model is much more efficient to train.

A second advantage of the new method is its enhanced interpretability. Daniel compared existing methods to entering your question into a “black box” that outputs the answer:

It’s hard to know the reason why it might have come to this answer, whereas here, we are proposing a method that enables us to look at specific parts of the query answering process – to basically inspect why the model is producing a particular answer. This might be useful for a researcher who is working with a model like this to answer questions on a specific domain.
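The idea of inspecting parts of the query answering process can be illustrated with a small sketch. It is a toy under stated assumptions, not the authors' implementation: the scores in `SCORES` are invented stand-ins for the output of a trained link predictor, `answer_two_hop` is a hypothetical helper, and multiplying the scores of the two steps is just one simple way to combine them. The point is that each answer keeps a record of the intermediate entity it was reached through, so a reader can see why it ranks where it does.

```python
# Hypothetical link-prediction scores in [0, 1] for (subject, relation, object).
# The numbers are invented; a real system would get them from a trained model.
SCORES = {
    ("Inception", "directed_by", "Christopher Nolan"): 0.95,
    ("Inception", "directed_by", "Ridley Scott"): 0.10,
    ("Christopher Nolan", "born_in", "London"): 0.80,
    ("Christopher Nolan", "born_in", "Chicago"): 0.30,
    ("Ridley Scott", "born_in", "South Shields"): 0.90,
}


def answer_two_hop(scores, start, first_rel, second_rel):
    """Score each final answer and remember the intermediate step behind it."""
    best = {}  # answer -> (combined score, intermediate entity)
    for (s1, r1, mid), p1 in scores.items():
        if s1 != start or r1 != first_rel:
            continue
        for (s2, r2, answer), p2 in scores.items():
            if s2 != mid or r2 != second_rel:
                continue
            combined = p1 * p2  # assumption: multiply the scores of the two hops
            if answer not in best or combined > best[answer][0]:
                best[answer] = (combined, mid)
    # Highest combined score first.
    return sorted(best.items(), key=lambda item: item[1][0], reverse=True)


# "Where was the director of Inception born?"
ranking = answer_two_hop(SCORES, "Inception", "directed_by", "born_in")
for answer, (score, via) in ranking:
    print(f"{answer}: {score:.2f} (via {via})")
```

With these invented scores, the top answer is London, reached via Christopher Nolan, and the "via" field exposes the intermediate step rather than hiding it inside a black box.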

“Complex Query Answering with Neural Link Predictors” was published as a conference paper at ICLR 2021. The authors are Erik Arakelyan, now a machine learning engineer at ARM in Cambridge, UK; Daniel Daza, a PhD student in machine learning at Vrije Universiteit (VU) Amsterdam and the University of Amsterdam (UvA); Dr Pasquale Minervini, Senior Research Fellow in the University College London NLP Group; and Dr Michael Cochez, Discovery Lab Manager and Assistant Professor in the Knowledge Representation and Reasoning group at VU Amsterdam.


Applying Discovery Lab innovations to research products

With a mission to “drive scientific discovery using machine intelligence,” the Discovery Lab is a collaboration between Elsevier, the Vrije Universiteit (VU) Amsterdam and the University of Amsterdam. Researchers are developing technology, infrastructure and methods to address challenges scientists face in a world where information is multiplying ever more rapidly. Often, as in this knowledge graph project, they collaborate with researchers at other institutions.

The lab’s goal is to integrate its work into Elsevier’s tools and platforms, helping researchers search the literature and interpret data more effectively.

As Dr Georgios Tsatsaronis, VP for Data Science and Research Content Operations at Elsevier, explained:

We created this lab to help build novel technologies we can apply to our data to help our users obtain better insights from the research literature and for the research they are doing.

As Lab Manager, Assistant Prof Michael Cochez works with two postdoctoral researchers and three PhD students. After doing the fundamental research, he said, they work with Elsevier colleagues, who apply their technologies to research products:

They take the basic research we do and then put the best in their production system. We work with the Elsevier engineers, and together set (the technology) up so that it can function there as well.

One product they have worked on together is Entellect — a life sciences data integration platform that brings together the researcher’s data with Elsevier’s data and data from researchers around the world.

Their projects fall into five workstreams:

A multi-modal recommendation service driven by reinforcement learning over structured multi-modal information.
A question-answering service to give actionable answers in the context of research tasks.
A proactive research hypothesis generator to drive promising new research.
A query-answering service that can answer natural language questions through knowledge-driven query construction.
A knowledge graph updating service that feeds in real time on a stream of scientific literature.

The ultimate goal is to develop a data platform that provides actionable knowledge to researchers and other professionals. To this end, they are developing intelligent services that enable a science knowledge graph to power an integrated research knowledge platform for Life Sciences, Health and other research.

The Discovery Lab is part of the Innovation Center for Artificial Intelligence (ICAI) — an initiative created by academic researchers to support AI innovation and talent development in the Netherlands. These innovation hubs represent a growing trend of collaboration between academia and the private sector.

The value of collaborating with academic researchers

As VP of Research Collaborations at Elsevier, Anita de Waard leads the development of innovative research product concepts by forging relationships with academic researchers, who work on projects with Elsevier’s computer and data scientists. Her passion for this approach stems from her background as a physics researcher and her PhD research in linguistics.

Anita said the formation of her Research Collaboration unit in 2018 reflects an important transformation in how Elsevier develops research products:

We work closely with researchers because they are our users and also the editors and authors of our content, as well as the people who develop new technologies to improve access to and integration of scholarly knowledge.
Working with the Discovery Lab is a win-win: they get access to ‘real-life’ data and users, and we are able to work with key researchers in knowledge representation to improve our products.

This close collaboration leads to a deep understanding of the challenges researchers encounter and what they need to be successful. Through this understanding, Anita said, meaningful innovation can take place:

Increasingly, our most important focus is to support innovation. And to me, that's the most interesting and the most exciting goal.
