At Elsevier, we aim to help researchers make new discoveries, collaborate with their colleagues, and gain the knowledge they need to find funding. That means developing a deep understanding of researchers’ work so we can create tools that help them find what they need when they need it.
Our recent collaboration with Lancaster University shows how network analysis can help us understand how researchers work and what they need.
Web analytics – the measurement and analysis of the sequence of hyperlinks a user clicks as they navigate through a website – is an essential tool for any digital company looking to understand and optimize their user experience. Elsevier already uses this for improvement and personalization of features, and you see it at work in our personalized recommender systems on Mendeley and ScienceDirect.
There’s also a great opportunity in joining up our data sources across platforms. Since Elsevier offers services and tools that touch many points of the research lifecycle – from exploring funding opportunities in Mendeley, searching for collaborators in Scopus and accessing content through ScienceDirect to submitting manuscripts for publication via our editorial systems – we have built a strong understanding of researchers and their workflows.
Given the large number of researchers using Elsevier products, coupled with the vast corpus of content they are able to engage with (millions of articles from over 3,800 journals and more than 35,000 books), we apply data science to identify common patterns of user engagement. This enables us to find commonalities and differences in typical research practices across geographies, domains, institutions and researcher roles, with the aim of further personalizing user experience across our platforms. And it helps us provide researchers with the right tools at the right time.
One project that emerged from our work in this area is a PhD Studentship that began last year at STOR-i, an EPSRC Centre for Doctoral Training at Lancaster University focused on Statistics and Operational Research.
PhD student George Bolt is developing and applying methods from network analysis to help us make sense of our complex, high-dimensional usage data. George is supervised by Dr. Simón Lunagómez, Lecturer in Statistical Modelling for Networks and Structured Data, and Dr. Christopher Nemeth, Lecturer in Statistical Learning at STOR-i, alongside me and my colleague Jacek Szejda, Senior Data Scientist for Elsevier Research Products. In January, George presented his initial work at the STOR-i Annual Conference at Lancaster University.
At the event, I caught up with George and his supervisors to find out how the project is shaping up.
Harriet Muncey (HM): What excites you about this project?
Simón Lunagómez (SL): The core philosophy of the STOR-i CDT (Centre for Doctoral Training) is producing research excellence with impact. One of the most exciting prospects of this project with Elsevier is its potential to fulfill this goal. Not only is there an opportunity for the development of novel statistical approaches, but also the potential for impact via informative insights on platform usage which can contribute to the development of the user experience.
George Bolt (GB): The opportunity for me to develop novel approaches and apply these to real datasets, knowing that any interesting results have the potential to benefit Elsevier and contribute in some form to the improvement of their platforms.
HM: What value does an industry collaboration bring versus a standard PhD project?
Chris Nemeth (CN): The value of a standard PhD project lies in its intellectual contributions to the wider literature. While this is similarly true for a PhD with an industrial collaborator, it also comes with a few extra perks. On the industrial side of the partnership, there is potential for direct value from insights gained through the research. On the academic side, not only does the student get additional support from industry supervisors with domain expertise and gain valuable experience of conducting research with solid applications, but there is also the intellectual challenge brought to the academics by the industrial partner, which is invaluable in the process of asking new and meaningful methodological questions.
HM: What interested you about working with Elsevier?
SL: There seemed an eagerness at Elsevier to use the latest ideas and approaches in their pursuit of platform improvements. This desire to stay at the cutting edge makes academic collaboration not only natural but also highly enjoyable, with enthusiasm coming from both sides in equal measure. It was this congruence which made the prospect of collaboration particularly interesting.
GB: The prospect of working with members of an industrial data science team who are working every day on interesting problems.
George’s project so far has focused on how to define a model that represents researchers’ behavior using statistical network methods. Network-based algorithms are extremely useful for modelling complex interactions between entities and are a natural way to represent how users navigate between different web pages.
By fitting a network model to a user’s interactions, it is possible to estimate the model parameters and make statistical inferences about the probabilities of particular pages or nodes of the network. These inferences can then be compared using distance metrics in order to measure the similarity between different researcher behaviors.
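To make the idea concrete, here is a minimal sketch of what "fit a network model, then compare users with a distance metric" could look like. It is not the method used in the project, which the interview does not specify. It assumes a simple first-order Markov chain over page types, with add-one smoothing and a total variation distance. The page names, the sample sessions and the function names `transition_probs` and `behaviour_distance` are all hypothetical.

```python
from collections import Counter

# Hypothetical page types; a real platform would have many more nodes.
PAGES = ["search", "article", "export"]


def transition_probs(sessions, pages=PAGES, alpha=1.0):
    """Estimate P(next page | current page) from a user's sessions.

    Each session is a list of visited pages in order. Add-alpha smoothing
    (an assumption of this sketch) keeps unseen transitions non-zero.
    """
    counts = Counter()
    for session in sessions:
        # Count every consecutive (current, next) pair of pages.
        counts.update(zip(session, session[1:]))

    probs = {}
    for src in pages:
        row_total = sum(counts[(src, dst)] for dst in pages) + alpha * len(pages)
        probs[src] = {
            dst: (counts[(src, dst)] + alpha) / row_total for dst in pages
        }
    return probs


def behaviour_distance(p, q, pages=PAGES):
    """Mean total variation distance between matching rows, in [0, 1]."""
    per_row = [
        0.5 * sum(abs(p[src][dst] - q[src][dst]) for dst in pages)
        for src in pages
    ]
    return sum(per_row) / len(pages)


# Invented example sessions for three users.
user_a = [["search", "article", "export"], ["search", "article", "article"]]
user_b = [["search", "search", "search", "article"]]
user_c = [["search", "article", "export"], ["search", "article"]]

model_a = transition_probs(user_a)
model_b = transition_probs(user_b)
model_c = transition_probs(user_c)

print("a vs b:", round(behaviour_distance(model_a, model_b), 3))
print("a vs c:", round(behaviour_distance(model_a, model_c), 3))
```

In this toy data, users A and C follow a similar search-then-read pattern, so their fitted models end up closer to each other than either is to user B, who mostly repeats searches. A richer model would also need to account for time spent on each page and for uncertainty in the estimates when a user has few sessions.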
“This project is exciting,” Jacek said, “because it addresses the key problem we’re trying to crack in our team: how to effectively model the way attention of our users is distributed over products and time.”
This can roughly be translated as “how to effectively model the way researchers’ time is spent on different parts of the research workflow.” Building an interpretable model will mean gaining a much deeper insight into research activities, how they are alike and how they differ from group to group, and how we can better adapt, personalize and improve our products at Elsevier to support research across the world.