Modelling epidemics: the maths behind disease outbreaks


Epidemiology has come a long way since John Snow’s famed investigations into the causes of cholera in 19th-century London. It now forms the basis of modern public health and has progressed at a pace beyond that of many other fields. The incorporation of mathematical and computational methods into the study of disease processes is now routine.

This approach is particularly powerful when it comes to epidemics: infectious disease outbreaks that affect vast numbers of people and can spread rapidly. The past ten years alone have seen major outbreaks of swine flu, the Ebola and Zika viruses, and even a resurgence of plague in some regions.

Outbreaks such as these are likely to become more common with climate change, movement of people, and failing antibiotics. Mathematical modelling, which can predict disease progress and outcome as well as identify the potential causes of transmission and optimal interventions, is an increasingly important tool in modern epidemiology.

Despite this, the inner workings of these mathematical models remain a mystery to many, even those who rely on their results. Three recent studies published in the Elsevier open access journal Epidemics aim to change this and make the mechanics of disease modelling accessible to many more people.

The flu-tracking app

Now considered merely a nuisance to most, the flu can in fact be a deadly virus. In 1918, the Spanish flu pandemic, caused by a strain of the H1N1 influenza virus, killed an estimated 5% of the entire global population at the time.

To commemorate the 100th anniversary of the outbreak – one of the deadliest natural events in human history – the BBC commissioned a documentary and citizen science experiment to simulate the outbreak of a flu pandemic in 2018.

The documentary, Contagion! The BBC Four Pandemic, was accompanied by a smartphone app through which viewers could upload data on their movements and contact with other people throughout the day – what epidemiologists call ‘mixing patterns’.

The maths team behind the app, based at the University of Cambridge and the London School of Hygiene and Tropical Medicine, used this data to build and run a model of how a pandemic would spread across the country. They estimated that, in a worst-case scenario, over 40 million people could become infected and the death toll could be up to 886,000.
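To give a feel for how contact data of this kind can feed a model, here is a minimal Python sketch that summarises mixing patterns as a contact matrix: the average number of daily contacts that people in one age group report with each other group. It is not the method used by the team behind the app. The function name, the age groups and the records are all hypothetical, and the sketch assumes each participant reports exactly one day of contacts.

```python
from collections import defaultdict


def contact_matrix(participants, contacts):
    """Summarise reported contacts as mean daily contacts between groups.

    participants: dict mapping participant id -> age group (hypothetical labels)
    contacts: list of (participant id, age group of the person they met)
    Returns a dict mapping (reporting group, contacted group) -> mean contacts
    per participant in the reporting group.
    Assumes one day of reports per participant.
    """
    group_sizes = defaultdict(int)
    for group in participants.values():
        group_sizes[group] += 1

    totals = defaultdict(int)
    for participant_id, contacted_group in contacts:
        totals[(participants[participant_id], contacted_group)] += 1

    return {
        pair: count / group_sizes[pair[0]]
        for pair, count in totals.items()
    }


# Made-up example records, for illustration only.
participants = {"p1": "child", "p2": "child", "p3": "adult"}
contacts = [
    ("p1", "child"), ("p1", "child"), ("p1", "adult"),
    ("p2", "child"),
    ("p3", "adult"), ("p3", "child"),
]

if __name__ == "__main__":
    for pair, mean in sorted(contact_matrix(participants, contacts).items()):
        print(pair, mean)
```

A matrix like this shows at a glance which groups mix most, which is one reason contact surveys are useful inputs for transmission models.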

The app’s collection period ran until the end of December 2018 and the vast data set will soon be made available in full to the scientific community. This data set, the first on such a scale, will allow scientists around the globe to model the influence of human movement and contact patterns on the spread of infectious disease.

The results could be useful in preparing for future disease outbreaks. Just ten years ago the H1N1 flu virus, popularly known as swine flu, re-emerged to cause a pandemic, infecting up to 200 million people and causing global concern. The data collected by these researchers will be critical to improving understanding of how an infectious disease such as flu can spread.

Epidemix: online disease modelling

Mathematical models are not just a research tool. They are already used in public health, where they provide answers to vital questions: “How big will the outbreak be?”, “How will it develop over time?” and, perhaps most importantly, “How can we control it?”

However, to many, mathematical models are a black box. This can lead to negative perceptions of models as unrealistic, unhelpful, or confusing, which means they are sometimes disregarded over other, more traditional methods.

To overcome this problem, a team of researchers led by London's Royal Veterinary College recently created epidemix, an interactive, online application to allow non-specialists to visualise disease transmission and, in the process, improve their understanding of disease modelling.

The application can be customised for a number of different parameters, including the size of the population, the length of the infectious period, and the control strategy (vaccination or culling), offering instant results on how the infection would develop over time.
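The kind of calculation behind such a tool can be illustrated with a classic SIR (susceptible, infected, recovered) model. The Python sketch below is not epidemix's own code. It is a minimal example that assumes a well-mixed population, a perfectly effective vaccine given before the outbreak, and illustrative parameter values (a population of 10,000, a five-day infectious period and an R0 of 2.5). All function and parameter names are hypothetical.

```python
def simulate_sir(population, infectious_days, r0, vaccinated_fraction=0.0,
                 initial_infected=1, days=365, dt=0.1):
    """Integrate a deterministic SIR model with forward Euler steps.

    Returns a list of (day, susceptible, infected, recovered) tuples.
    """
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r0 * gamma               # transmission rate per day
    # Assumption: vaccinated people are fully protected, so they start
    # in the removed compartment.
    r = vaccinated_fraction * population
    i = float(initial_infected)
    s = population - r - i
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infected at the same time during a run."""
    return max(i for _, _, i, _ in history)


if __name__ == "__main__":
    for coverage in (0.0, 0.3, 0.6):
        run = simulate_sir(10_000, infectious_days=5, r0=2.5,
                           vaccinated_fraction=coverage)
        print(f"vaccination coverage {coverage:.0%}: "
              f"peak infected about {peak_infected(run):.0f}")
```

Running the script with different coverage values shows the peak shrinking as more people are vaccinated. With 60% coverage and these assumed values, each case infects fewer than one other person on average, so the outbreak never takes off.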

Although other user-friendly software applications exist, they have significant limitations. “They often require several hours to become familiar with their interface, which is an obstacle to the development of short courses or one-off practical sessions,” observes epidemix developer Dr Guillaume Fournié.

Fournié’s interface also offers eight different models of disease transmission, varied by mixing pattern. “For infectious diseases transmitted through contacts between hosts, the distribution of contacts which are likely to transmit an infection within a population defines the potential of this infection to spread, the speed and scale of the resulting outbreak and the type and intensity of control interventions needed to mitigate it,” he says.

Epidemix explains the key concepts and assumptions behind mathematical models of disease dynamics.

The application can be used by anyone but is especially aimed at policymakers, public health workers and researchers. It has already been used by many scientists, primarily for teaching purposes, and there are plans to continue developing the application. “We want to keep developing epidemix to expand the model library, to allow users to upload data/parameter sets and to include different presentations of simulation outputs,” says Fournié.

Real world use case: A model for Ebola

[Infographic: Ebola epidemics]

The Ebola virus is one of the deadliest human viruses. Infection begins with flu-like symptoms, progressing to vomiting, diarrhoea, a rash and in many cases bleeding internally and from the ears, eyes, nose or mouth. On average, half of those infected with the Ebola virus die from their symptoms.

The biggest known outbreak of Ebola virus disease occurred in West Africa in 2014, eventually infecting over 28,000 people. The outbreak inspired a global forecasting challenge, in which several teams submitted forecasts of the outbreak to the US National Institutes of Health, using synthetic datasets under several different scenarios.

The teams were asked to submit forecasts at the local and country level, over the short- and long-term, and to provide estimates at several different timepoints along the development of the simulated epidemic.

One team in this challenge, based at Virginia Tech’s Biocomplexity Institute (and now at the University of Virginia), built an agent-based model during the 2014 outbreak, with the express purpose of helping policymakers.

Lead author Dr Srini Venkatramanan explained how the model works: “In an agent-based model, the evolution of the system state is modelled by encoding the actions and interactions of individual agents. In the context of infectious diseases, the evolution of disease spread can be modelled via the activity patterns and resulting social contacts of individuals, which in turn lead to the spread of infection.”
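A toy example helps make the idea concrete. The Python sketch below is not the Virginia team's model, which draws on detailed activity patterns. It assumes instead that every infected agent meets a fixed number of randomly chosen people each day, passes on the infection with a fixed probability, and recovers after a fixed number of days. All names and parameter values are hypothetical.

```python
import random


def run_agent_model(n_agents=500, contacts_per_day=8, transmission_prob=0.05,
                    infectious_days=5, initial_infected=3, days=120, seed=1):
    """Simulate an outbreak agent by agent.

    Returns (final_states, daily_infected), where final_states holds
    "S", "I" or "R" for each agent and daily_infected is the number of
    infected agents at the end of each day.
    """
    rng = random.Random(seed)          # seeded so runs are repeatable
    state = ["S"] * n_agents
    days_left = [0] * n_agents         # remaining infectious days per agent
    for agent in rng.sample(range(n_agents), initial_infected):
        state[agent] = "I"
        days_left[agent] = infectious_days

    daily_infected = []
    for _day in range(days):
        # Each infected agent meets random others and may infect them.
        newly_infected = set()
        for agent in range(n_agents):
            if state[agent] != "I":
                continue
            for _contact in range(contacts_per_day):
                other = rng.randrange(n_agents)
                if state[other] == "S" and rng.random() < transmission_prob:
                    newly_infected.add(other)

        # Infected agents move one day closer to recovery.
        for agent in range(n_agents):
            if state[agent] == "I":
                days_left[agent] -= 1
                if days_left[agent] == 0:
                    state[agent] = "R"

        # Today's new infections become infectious from tomorrow.
        for agent in newly_infected:
            state[agent] = "I"
            days_left[agent] = infectious_days

        daily_infected.append(state.count("I"))
    return state, daily_infected


if __name__ == "__main__":
    final_states, curve = run_agent_model()
    print("ever infected:", final_states.count("I") + final_states.count("R"))
    print("peak infected on one day:", max(curve))
```

Because each agent is tracked individually, a sketch like this can be extended with ‘what if’ rules, such as reducing the contacts of selected agents, which is harder to express in a purely data-driven forecast.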

Alternative approaches to disease forecasting have relied on building trends from lots of data, from the epidemic time series and even social media. By extracting patterns from the data, these models can forecast short-term or seasonal trends in infectious disease. However, what they cannot do is propose interventions or answer ‘what if’ questions.

Unlike these purely data-driven methods, Venkatramanan's model accounts for social contact and disease dynamics, allowing users to choose the best intervention, and much more.

“A detailed agent-based model like ours can be used to answer questions such as the optimal placement of treatment centres and allocation of scarce resources, such as vaccines,” he says.

The model could be adapted to other infectious disease outbreaks. In fact, the team have already applied it to outbreaks of seasonal flu and Zika. And looking ahead, the model could account for more complex factors, becoming even more realistic.

“We are working on adding data pertaining to environment (such as weather) and capturing disruptions to activity patterns and mobility due to extraneous processes (such as conflicts and natural disasters),” Venkatramanan explains.

An infectious way of teaching

[Infographic: Teaching epidemics]

To prepare future epidemiologists for the world of mathematical modelling, researchers at Imperial College London developed a training package to teach their MSc epidemiology students about disease outbreaks.

The package builds on an earlier training exercise developed through the International Clinics on Infectious Disease Dynamics and Data Program (ICI3D) [1], which pioneered a new form of ‘outbreak exercise’.

Outbreak exercises are often used to teach the principles of infectious disease epidemiology, such as the non-linear processes underlying chains of infection events. In the new, interactive format developed by ICI3D, the students themselves become part of the outbreak. “They feel more involved and engaged, which can improve their understanding,” explains Oliver Watson of the MRC Centre for Global Infectious Disease Analysis (MRC GIDA).

Watson, Íde Cremin and other researchers at MRC GIDA extended this exercise to include new concepts and network analysis, and developed a package in the R programming language that students can use to analyse the data. “R is increasingly being used across biological sciences and can be regarded as one of the most useful programming languages used by epidemiologists,” explains Watson.

In practice, the exercise is very easy to implement. “The paper-based outbreak started on a Monday, when three students from a class of MSc students were given an infection form explaining that they had been 'infected'. They were then instructed to work out how many people they would go on to infect,” describes Watson. Information on who they infected and at what time was stored in a database, growing as the fictional outbreak spread.

The following week, the students assessed the outbreak and analysed the data in R for key parameters, such as the basic reproduction number or R0 – the average number of new infections that one case generates in a fully susceptible population. When one person has measles, for example, they can infect up to 18 others. Flu typically spreads to only two or three others. “This process is similar to how field epidemiologists responding to an outbreak would use case reports to assess its potential for spread,” says Watson.
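As a rough illustration of how a reproduction number can be read off who-infected-whom records, the sketch below averages the number of people infected by each of the earliest cases. It is written in Python rather than R and is not the students' R package. The records and function name are hypothetical. Restricting the average to early cases is a simplifying assumption: later cases meet fewer susceptible people, and cases still infectious when the data are collected have not yet finished infecting others.

```python
from collections import Counter


def mean_secondary_cases(cases, transmissions):
    """Average number of people that each listed case went on to infect.

    cases: the cases to average over (for example, the first generation)
    transmissions: list of (infector, infectee) pairs
    """
    infections_caused = Counter(infector for infector, _ in transmissions)
    return sum(infections_caused[case] for case in cases) / len(cases)


# Made-up outbreak records: three seeded cases (A, B, C) and who they infected.
transmissions = [
    ("A", "D"), ("A", "E"), ("A", "F"),
    ("B", "G"), ("B", "H"),
    ("C", "I"),
]
seeded_cases = ["A", "B", "C"]

if __name__ == "__main__":
    estimate = mean_secondary_cases(seeded_cases, transmissions)
    print(f"mean secondary cases among seeded cases: {estimate}")  # prints 2.0
```

In this made-up outbreak the three seeded cases infected three, two and one other people, giving an average of two secondary cases each.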

Although the R package was designed for MSc students, the researchers say it could be extended to anyone who wants to understand disease outbreaks, including senior academics and policymakers. It was presented at the Epidemics conference in 2017, receiving interest from other institutions that wanted to incorporate it into their own teaching.

Simulations to save lives

Beyond training, the power of epidemic modelling is huge. “Modelling enables us to understand the link between the biological processes that underpin transmission events and the population-level dynamics of the disease. In particular, a well parameterised mathematical model allows us to test the feasibility and effectiveness of multiple intervention approaches, without having to apply them in the real world first,” says Watson. This saves time, money, and lives.

Modelling is already a key component of health policy, but this is just the beginning. “Modelling is likely to become more prominent in epidemiological research in the future,” Dr Fournié adds.

To help propel this, both researchers extolled the virtues of open access publishing.

In the case of epidemix, publishing in Elsevier's open access journal Epidemics meant all potential users could have access to an in-depth description of the app. Likewise, Dr Cremin and Mr Watson said there was a “massive” benefit to publishing their teaching exercise in this open access journal.

"Our developed practical was adapted from a published pedagogical exercise. This was made possible because the authors of the exercise chose to publish in an open access journal and to share their full teaching exercise online. Making the teaching tools and the research paper freely available is key to helping ensure it can reach a wide audience."

Finally, Dr Venkatramanan, developer of the Ebola model highlighted in box 1, speaks to the particular importance of open access publishing for fast-moving disease outbreaks.

"Open access journals allow for immediate and easy dissemination of research findings, which are especially crucial when dealing with an emerging outbreak. Further, such journals provide increased visibility for the work, leading to wider engagement and future collaborations."

[1] The exercise originated at the Clinic on Meaningful Modeling of Epidemiological Data (MMED), a 2-week modelling clinic hosted by ICI3D that emphasises the use of data in understanding infectious disease dynamics.

Read the collection

Read more about the studies mentioned and explore others in the new collection. All of the papers are open access.