BCG (Bacille Calmette-Guérin), a time-honored tuberculosis vaccine, has been used worldwide for close to a century. Recent studies suggest it might also help boost immunity against the novel coronavirus, although this is by no means certain.
To investigate the potential value of BCG against COVID-19, Elsevier and data insight consultancy Estafet organized a hackathon, or AI Challenge, that populated the BCG World Atlas with much-needed up-to-date data.
Now a follow-up challenge is underway aimed at analyzing the data to provide new insights on the possible benefits of BCG in combating COVID-19, and to inform BCG-COVID-19 clinical trials. Data scientists are invited to participate for a chance to help curb COVID-19 and win a prize for doing so. Submissions are due by December 31, 2020.
Join the hackathon
Data scientists worldwide are invited to join the “AI Challenge: BCG – COVID-19 — Find Insights that Could Help the Clinical Trials” to explore links between the BCG vaccine and COVID-19 infection and mortality. The prize pool is $6,000 and the submission deadline is December 31, 2020.
Using machine learning and other technologies, Elsevier and Estafet hope to provide support for the role of BCG vaccinations or an alternative hypothesis. They are seeking answers to the following questions:
- Is BCG vaccination causally related to reduced COVID-19 mortality?
- Are other factors, such as lockdown and average age of the population, responsible for the different mortality rates?
- If BCG vaccination reduces COVID-19 mortality, what are the key factors? For example:
  - How long does the immunity engendered by BCG last after vaccination?
  - Which BCG strain was used?
  - What is the optimal time to vaccinate?
Answers to these questions will not only inform clinical trials but could provide information vital to public health globally.
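To make the second question concrete, here is a minimal sketch of how a participant might check whether an apparent BCG effect survives once countries are grouped by median age. Every row below is invented for illustration; the field layout, the age cutoff of 35 and the function names are assumptions, not part of the challenge data or its official method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (country label, has BCG policy, median age, deaths per million).
# All values are invented for illustration only.
rows = [
    ("A", True, 28, 10.0),
    ("B", True, 30, 14.0),
    ("C", True, 44, 90.0),
    ("D", False, 29, 12.0),
    ("E", False, 43, 100.0),
    ("F", False, 45, 110.0),
]


def age_band(median_age, cutoff=35):
    """Assign a country to a coarse age stratum (the cutoff is an assumption)."""
    return "younger" if median_age < cutoff else "older"


def mean_by_policy(records):
    """Average deaths per million, grouped by BCG policy (True/False)."""
    groups = defaultdict(list)
    for _, has_bcg, _, deaths in records:
        groups[has_bcg].append(deaths)
    return {policy: mean(values) for policy, values in groups.items()}


def stratified_means(records, cutoff=35):
    """Repeat the policy comparison separately inside each age stratum."""
    strata = defaultdict(list)
    for record in records:
        strata[age_band(record[2], cutoff)].append(record)
    return {band: mean_by_policy(members) for band, members in strata.items()}


crude = mean_by_policy(rows)
by_age = stratified_means(rows)
print("crude:", crude)
print("by age band:", by_age)
```

In this toy data the crude gap between the two groups is large, but it shrinks or vanishes within each age band, which is the pattern one would expect if population age, rather than BCG, drove much of the difference. A real analysis would need many more covariates and a proper statistical model.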
Collaborating across disciplines to organize the hackathon
Estafet’s Technology Director Radoslav Kirkov had been following the news on potential correlations between BCG vaccination and COVID-19 and thought it was a project the company could tackle with machine learning. So he contacted Anita de Waard, Elsevier’s VP of Research Data Collaborations, to discuss a partnership. De Waard proposed a hackathon and reached out to several epidemiologists to gauge interest, including Dr. Madhukar Pai, Director of McGill University’s International TB Center and a founder of the BCG Atlas.
Dr. Pai was not sure a hackathon was a good idea at this point. He had warned there was not yet enough evidence that the vaccine could affect SARS-CoV-2 and believed a hackathon might risk overhyping it. Instead, he suggested a hackathon aimed at expanding the BCG Atlas, which had not undergone a significant update since its inception in 2011.
At Dr. Pai’s suggestion, De Waard contacted Assistant Professor Alice Zwerling of the School of Epidemiology and Public Health at the University of Ottawa, who now spearheads the atlas.
After conferring with other researchers, they decided the best solution was to do two hackathons:
- The task 1 AI Challenge populated the BCG Atlas with data.
- The task 2 AI Challenge will involve delving into the data to identify any meaningful correlations between BCG and COVID-19.
Elsevier took the lead in connecting data scientists with clinical researchers to promote participation.
“In these crazy times, everyone had the same idea: ‘Let’s make this happen,’” de Waard said in a webinar for participants to celebrate the task 1 winners. “We had people with different skills, different interests, and also complementary interests, and it all came together in a marvelous way.”
The task 1 hackathon, which closed in early September, yielded about 12 percent more data for the atlas.
Hackathon Task 1 process
- Created a BCG strain spreadsheet (there had been no unified name or ID for each strain)
- Manually gathered some BCG policy information
- Created a web scraper to search and extract that type of information automatically from the internet
- Created a tool to translate all non-English documents to English
- Extracted, cleaned and unified all information related to COVID-19, including relevant information for each country
- Published the data and information on Kaggle
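The first step, giving each strain a unified name or ID, can be sketched as a simple alias lookup. This is only an illustration under stated assumptions: the canonical IDs, the alias spellings and the function names are hypothetical and do not reproduce the spreadsheet the hackathon team actually built.

```python
import re

# Hypothetical alias table: canonical strain ID -> spellings that might appear in sources.
# IDs and spellings are illustrative assumptions, not the hackathon's real spreadsheet.
STRAIN_ALIASES = {
    "danish-1331": ["Danish 1331", "BCG Danish", "Danish strain 1331"],
    "tokyo-172": ["Tokyo 172", "BCG Tokyo", "Tokyo-172"],
}


def normalize(text):
    """Lowercase and collapse punctuation and whitespace so spellings compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


# Reverse index: normalized spelling -> canonical ID.
LOOKUP = {
    normalize(alias): canonical
    for canonical, aliases in STRAIN_ALIASES.items()
    for alias in aliases
}


def strain_id(raw_name):
    """Return the canonical ID for a raw strain name, or None if it is unknown."""
    return LOOKUP.get(normalize(raw_name))


print(strain_id("  tokyo-172 "))
print(strain_id("BCG  DANISH"))
print(strain_id("Unlisted strain"))
```

Returning None for unknown names, rather than guessing, leaves unmatched entries visible so a person can review them before they reach the atlas.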
Winners were Dimitrina Zlatkova, a data scientist at Ocado Technology, Bulgaria, who generated an additional 57 entities for the atlas and was awarded $3,000, and Marouane Benmeida, a freelance programmer in Morocco, who generated 33 entities for the atlas and received $2,000.
“It’s been an extraordinary project in extraordinary times,” said Tim Miller, VP of Life Sciences Platform Solutions at Elsevier. “Watching this come together in Zoom chats and other unusual ways of meeting, seeing all of these people collaborate across disciplines, across skill sets, across geographies, has been tremendously rewarding. And it amazes me that in a relatively short period, over 100 Jupyter notebooks were produced to bring the data together so we can get value out of it.”
The notebooks are freely available and can be run at any time by any researchers.
One specific deliverable suggested by Dr. Zwerling was information on the different BCG strains, drawn from scientific papers indicating which may be stronger or weaker. This information was also needed for task 2, Dr. Kirkov said, so hackers could help identify which strains might be more important with respect to COVID-19.
Informing COVID-19 clinical trials
Currently, about 30 trials related to BCG and COVID-19 are underway, and the jury is still out regarding the vaccine’s potential. That’s what the second hackathon hopes to clarify. For task 2, ongoing until December 31 (see sidebar, “Hackathon underway: Exploring links between the BCG vaccine and COVID-19 infection/mortality”), Elsevier and Estafet will be looking to get more data to help researchers currently working on or planning trials, according to Dr. Kirkov.
“We’re inviting some of the researchers who have authored the papers we gathered in task 1 to work with us now to see if we can come up with insights to inform further studies,” he said.
At the same time, the hackathon team invites anyone who wants to continue extracting data to help populate the BCG Atlas to do so, or to develop a way to automate the update process going forward, perhaps every year or two.
“Task 1 was a great reflection of how data science can be used to support clinical research,” de Waard said. “We know it’s not just about finding random datasets but rather applying that data science to solve specific questions — and how data scientists need to work hand-in-hand with researchers to find some meaningful answers.”
Regarding task 2, Dr. Kirkov said: “The more we can collect high quality data, the more chances we have of answering in a useful way key questions about BCG’s potential against COVID-19, as well as other important hypotheses.
“I had two requests recently from authors who are in the process of publication and want to reference the atlas in respected journals,” he added. “So clearly, this is an active area of research, with more publications anticipated now and in the near future. We want data scientists to join us in this quest.”