Meet the Research Integrity Experts: Savvas Chamezopoulos
July 22, 2025 | 9 min read
By Liana Cafolla

Data Scientist Savvas Chamezopoulos found his niche in the Research Content Data Science team where technology and people work side by side.
When data scientist Savvas Chamezopoulos was still a student at the University of Amsterdam, he already knew the environment he wanted to work in: a place that combined data technology and teamwork. When he joined the Research Content Data Science team in 2022, he found exactly that.
“I wanted something that includes technical stuff like coding, building and testing, but also dealing with people,” Savvas said. “And data science – at least at Elsevier – really offers this.”
His workday usually involves a constant back-and-forth of consultations and experimentation. A typical project for Savvas’ team often starts with a request from a nontechnical colleague or team to solve a problem or improve a process. From there, ideas and conversations start to flow within the data science team and beyond.

Savvas Chamezopoulos, Elsevier Data Scientist
Initial discussions focus on aligning the team around the scope of the project and agreeing on an ideal course. Then the work becomes hands-on. “The first step in building, let’s say, an AI-powered tool is to set up a quick experiment and get a feel for its potential,” Savvas said.
After more discussions, testing and refining, the next steps could be an exploratory data analysis, a deep dive into a large language model, or fine-tuning an early model. Once the proof of concept has been achieved, the project is handed over to the engineers who will develop and build it.
“The most exciting part of the job is the diversity of the tasks and the skills that I need to use to get the job done,” Savvas said. “I mean, this is why I picked data science in the first place.”
Data: More Than Just Numbers
The data used by the team does not consist only of reams of numbers. Often, the most valuable data is the knowledge shared by colleagues.
“Sharing knowledge and sharing the experience is also very meaningful data, which can really shape the right course for a project,” Savvas said. “Some of the most important pieces of data, of information, that I’ve used in my work came straight from the publishing ethics experts, by them relaying their experience to me and sharing the patterns they found in their work. For example, we see that one very typical case of misconduct in the research integrity area is when authors are added after a paper was accepted. We wouldn’t be able to define this as a research integrity signal unless we had talked with the publishing ethics experts.”
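The authorship signal Savvas describes can be sketched as a simple rule. The record layout, field names and data below are illustrative assumptions for this article, not Elsevier's actual schema or tooling:

```python
# Hypothetical sketch of a rule-based research integrity signal:
# flag papers whose author list grew after acceptance.
# Field names ("accepted_authors", "published_authors") are assumptions.

def added_authors(authors_at_acceptance, authors_at_publication):
    """Authors present at publication but not at acceptance."""
    return set(authors_at_publication) - set(authors_at_acceptance)

def flag_author_additions(papers):
    """Map paper id -> authors added after acceptance, for flagged papers only."""
    flags = {}
    for paper in papers:
        added = added_authors(paper["accepted_authors"], paper["published_authors"])
        if added:
            flags[paper["id"]] = added
    return flags

# Made-up example records for illustration
papers = [
    {"id": "P1", "accepted_authors": ["A. Lee"], "published_authors": ["A. Lee"]},
    {"id": "P2", "accepted_authors": ["B. Chen"], "published_authors": ["B. Chen", "C. New"]},
]
flags = flag_author_additions(papers)  # only P2 is flagged
```

A flag like this is only a lead, as the article notes: the publishing ethics team would still review each flagged paper individually.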

The team uses data as the foundation for the tools they build to defend research integrity, but humans make the decisions and have the last word.
“The publishing ethics team makes all the decisions, but we provide them with tools that surface evidence, enabling them to make more informed decisions and to also discover cases that they might have missed,” Savvas said.
Selecting the Right Tool for the Job
Data is just one of many tools in the Elsevier toolbox. Savvas points to a growing array of complementary experts, programs and partners, including data scientists, engineers and subject matter experts.
While AI-powered tools offer the most powerful technology, that doesn’t mean they are always the most appropriate means of handling the task at hand.
“I don’t need a super strong drill, for example, to punch a hole in wood – sometimes a simple hammer and a nail is enough,” Savvas said. “We can build and we are building very, very strong tools that provide very, very concrete evidence on misconduct or lack of it by simply using data, simple aggregations and processes where no machine learning is involved. It’s very important to always choose the approach that does the job in the most efficient way.”
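The "hammer and nail" point can be illustrated with a plain aggregation that uses no machine learning at all. This is a minimal sketch under assumed data and an arbitrary threshold, not a description of Elsevier's actual checks:

```python
# Illustrative ML-free signal: a simple count of submissions per author,
# flagging unusually high volumes. Data and threshold are made up.
from collections import Counter

def high_volume_authors(submissions, threshold=3):
    """Return {author: count} for authors exceeding the submission threshold."""
    counts = Counter(
        author for paper in submissions for author in paper["authors"]
    )
    return {author: n for author, n in counts.items() if n > threshold}

# Toy data: author "X" has 5 submissions, "Y" has 2
subs = [{"authors": ["X"]}] * 5 + [{"authors": ["Y"]}] * 2
flagged = high_volume_authors(subs)
```

The point of the example is the design choice, not the rule itself: when a transparent aggregation answers the question, it is easier to explain to the ethics team than a model would be.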
Slow and Careful Processes
Investigations are conducted using a combination of advanced technology and the shared expertise and patient dedication of humans.
In many cases, the team is developing and refining a growing number of signals to detect potential misconduct. As cases become more complex, they are increasingly finding multiple indicators that warrant further investigation. “The tools we’ve built act like a lead generator, pointing our team in the right direction,” Savvas said. “But it’s the human validation and verification feedback loop that really enables us to continuously improve these signals and ensure we’re catching what matters most.”
Meticulous investigations mean that long processes are the norm. In one case, the process took almost a year to yield definitive results. “The ethics team have to follow the protocol of contacting the authors, gathering evidence and building a case,” Savvas said. “It’s not a binary decision. And especially when it comes to such volumes of papers, we have to look into each and every one individually. Some valid work might still be there.”
Savvas’ advice for researchers is to check the data they plan to use as thoroughly as possible:
“Always validate where you got the data from before putting it into a machine learning model. If the data is bad, then the result will be bad. Always ask your peers to cross-validate, to check if they know something about the data you may have missed. Also, more is not always more: Quality is more important than quantity.”
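The pre-modelling checks Savvas recommends can be sketched as a small validation pass over the raw records. The checks, field names and data here are illustrative assumptions, not a specific pipeline:

```python
# Minimal sketch of validating data before it reaches a model:
# report missing required fields and exact duplicate records.
# Field names and records are hypothetical.

def validate_records(records, required_fields):
    """Return a list of human-readable issues found in the records."""
    issues = []
    seen = set()
    for i, rec in enumerate(records):
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        if missing:
            issues.append(f"record {i}: missing {missing}")
        key = tuple(sorted(rec.items()))
        if key in seen:
            issues.append(f"record {i}: duplicate record")
        seen.add(key)
    return issues

# Toy records: one with an empty field, one exact duplicate
records = [
    {"id": "1", "title": "A"},
    {"id": "2", "title": ""},
    {"id": "1", "title": "A"},
]
issues = validate_records(records, ["id", "title"])
```

Checks like these catch only mechanical problems; the peer cross-validation Savvas mentions still has to come from people who know the data.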