A career where intuition meets technology

A pioneer of web search and machine learning brings industry know-how to the world of academic research

Antonio Gullí, PhD

As VP of Product Management for Researcher Operating System and Awareness Technologies at Elsevier, Dr. Antonio Gullí (@antoniogulli) brings his industry expertise to the world of academic research. He has 20 years of experience in web search, machine learning and big data, heading development teams for Microsoft Bing and Ask.com; he founded and sold his own company; and he’s filed more than 20 patents.

Here, Antonio writes about his work at Elsevier, the qualities needed to succeed in this kind of work – and why Andy Warhol’s image of Marilyn Monroe graces his blog and Twitter profile.

An image of Andy Warhol's <em>Marilyn</em>

Some years ago, I was in New York, and a friend and I went to MoMA. They had an Andy Warhol exhibit, and we saw his silkscreen of Marilyn Monroe with all the variations. I love this image.

At the time, we had been thinking about how to use machine learning to determine how different images are similar. This is not a trivial problem.

When you go on the Internet, there are many images that are very similar to each other. And sometimes that similarity is an indication of popularity. Think of all the similar images of President Obama, for example. Determining the popularity of people and things is actually an important aspect of search algorithms and machine learning.

So when we saw the painting of Marilyn, we had an idea of how to solve this problem. The idea worked really well, so we filed a patent for it.

Since then, I’ve loved this image; art and technology are completely different fields, but this image invites intuition to make the connection between them. We saw this image, and then we made the connection to what we had to do.

The work I do is about making connections, sometimes in ways that are not obvious. There is the mathematical side of data and technology, but we need creativity and intuition to make that data meaningful.

Antonio created his own version of Andy Warhol’s <em>Marilyn</em> for his Twitter profile.

I’ve filed many patents because I really like the process of creating things. That’s one of the reasons I joined Elsevier.

At Elsevier, people are trying to innovate, moving the needle. We have a lot of data – from articles, journals, books, patents, legal documents, all kinds of scientific publications. I find this really interesting. I already worked on web search, where you have a huge amount of data but not a lot of structure in the data. But with research articles and legal information, there is a lot of structure in the data, so I was curious to know what it would be like working with this data.

But ultimately, our work is not just about the data – it’s about the people who use the data.

The program I'm part of is called the Researcher Operating System (ROS). The idea is to put the researcher, the author, the reviewer – all the different roles our users are playing – at the center. What’s interesting is the social communities we can create for people in particular disciplines, like physics or genomics. Who are the most important authors and reviewers? Which topics are getting the most attention? So there is a network, and there are social activities between these communities.

My team’s role is to provide the big data infrastructure for supporting this type of data analysis. Our most important project is to collect information from many different types of sources. We have four different classes of information:

  1. Traditional content – articles, journals, legal information, books.
  2. Usage. What are the users actually doing with this content? How do they use it? For example, they can search this content, they can browse this content, they can interact with this content in different ways.
  3. The users themselves. Who are they? What is their professional profile? What are they doing with their research?
  4. The life of the article. When was this article reviewed? Who were the reviewers? When was it published?

Once you have this information, you can build services on top of it. For instance, you can enable users to filter their search according to the relevance of the article. You can generate article recommendations based on search queries. You can enable users to set up automatic alerts when new information on their topic becomes available. You can analyze queries to track emerging trends in science and medicine – such as when interest in the Ebola virus started to emerge.

We are also building metrics that allow universities and researchers to see how they are performing in specific areas of research – and to compare this performance to other institutions around the world. For example, you can understand how active Cambridge University is in genomics or computer science compared to other universities.

Also, we recently acquired Newsflo, and now these colleagues work on my team. They track media coverage and social media mentions of an institution or researcher to get a better idea of their popularity beyond traditional metrics like citations and Impact Factor.

We create all these services with machine learning. Machine learning is a kind of artificial intelligence – an automated process of learning about the characteristics of data. We don’t just program a computer to respond to a given prompt; we create algorithms that can “learn from experience” – like search engines that learn the popularity of terms from the data of repeated queries, or whether an article includes topics that are relevant to a particular journal.
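To make “learning from experience” concrete, here is a minimal sketch in Python of the simplest case mentioned above: estimating the popularity of terms from repeated queries. The query log and the function name `learn_term_popularity` are invented for illustration; this is a toy counting model under those assumptions, not a description of any production system.

```python
from collections import Counter


def learn_term_popularity(queries):
    """Count how often each term appears across a log of search queries."""
    counts = Counter()
    for query in queries:
        # Lowercase and split on whitespace: a deliberately naive tokenizer.
        counts.update(query.lower().split())
    return counts


# Hypothetical query log, made up for this example.
query_log = [
    "ebola virus",
    "ebola outbreak",
    "genomics sequencing",
    "ebola virus treatment",
]

popularity = learn_term_popularity(query_log)
top_terms = popularity.most_common(2)
print(top_terms)
```

The more queries the function sees, the better its picture of which terms are popular. Nothing about the terms themselves is programmed in advance.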

For this reason, the field of data science actually requires a great deal of creativity. To do well in this kind of work, the most important thing is curiosity: seeking out the patterns in all this data and working out how you can use them. You need to have an intuition for the data and figure out how to use the data to generate new types of services. Take the example of Newsflo, which uses machine learning to figure out whether researchers are popular by going beyond the Impact Factor to search news articles and social media. Our people are always coming up with new ideas on how to use the data we have.

So people who work on these projects have to have a technical background as well as an intuitive sense of how to make the most of the data.

Also, nowadays, you need a combination of research skills and a “just-get-it-done” attitude. I’ve never been a big fan of research just for the sake of doing research. And at the same time, I’m not a fan of doing things without a strong analytical basis. It’s always the combination of these two things, and this is another reason I was interested in Elsevier. More and more, Elsevier is taking a modern approach to solving problems – making decisions based on clear evidence.

For instance, every time we launch a new product, we test it by putting it in front of many, many users and observing how those users behave with several different versions of the product. This is the modern way of doing things. This way, we learn how the users are interacting with the product, and we iterate on it based on what we learn from the user testing. We adapt the product to the user, not vice versa.
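One common way to compare two versions of a product is a two-proportion z-test on how many users in each group completed some action. The Python sketch below assumes that setup; the user counts are invented, and this is a standard textbook check rather than a description of how any particular team runs its tests.

```python
import math


def two_proportion_z(successes_a, users_a, successes_b, users_b):
    """Return the z-score for the difference in success rates (B minus A)."""
    rate_a = successes_a / users_a
    rate_b = successes_b / users_b
    # Pooled rate under the assumption that both versions perform the same.
    pooled = (successes_a + successes_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    return (rate_b - rate_a) / std_err


# Hypothetical numbers: 1,000 users saw each version.
z = two_proportion_z(successes_a=100, users_a=1000,
                     successes_b=130, users_b=1000)

# |z| above roughly 1.96 corresponds to significance at the 5% level
# for a two-sided test.
is_significant = abs(z) > 1.96
print(round(z, 2), is_significant)
```

A check like this helps separate a real difference between versions from ordinary variation in user behavior before the product is changed.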

My life has always been about building great products, building teams and championing innovation to millions of users in multiple languages. Now, my career is about bringing this industry know-how to the world of research and academia.

Do you want to work at Elsevier?

We’re looking for data scientists and product managers with an interest in building data services for researchers. If you are interested in these kinds of positions at Elsevier, check out our Technology Careers page.

Elsevier Connect Contributor

Dr. Antonio Gullí (@antoniogulli) is VP of Product Management for Researcher Operating System and Awareness Technologies at Elsevier, where he brings his years of industry expertise to the world of academic research. He has 20 years of experience in web search, machine learning and big data.

Before joining Elsevier in Amsterdam last year, Antonio worked for Microsoft, where he led the Bing development team in London. He created algorithms to determine whether news articles were popular, suggest related articles, and enable users to refine their searches.

Previously, he served as CTO for Ask.com in Europe (now part of IAC), where he created a European Development Center, managing teams in the US and Europe. Before that, Antonio was the CEO of Ideare, one of the earliest search and pay-per-click advertising companies in Europe, which he co-founded and sold to Tiscali. Back in 1996, Antonio co-developed the first Italian search engine, Arianna, and he was the product owner of Web classification technologies at Fireball, the first German search engine.

Antonio earned his PhD in computer science from the University of Pisa, Italy. He has authored many articles for peer-reviewed journals and filed more than 20 patents.

Antonio blogs at Antonio Gullí’s Coding Playground.
