Semantic Web Challenge article collection
To celebrate the 15th birthday of the Journal of Web Semantics, we would like to present this article collection showcasing a decade of practical applicability and social impact of knowledge-based AI research on the web, highlighting the journal’s continued support of the prestigious Semantic Web Challenge (SWC).
Many thanks to all the SWC organizers, judges and, of course, the many contestants who have made the SWC possible over the years. We look forward to an exciting 2018 Semantic Web Challenge at the 17th International Semantic Web Conference, held October 8–12 in Monterey, California, USA.
All articles are freely available online until 1st January 2020.
Happy reading from Elsevier Computer Science, and don’t forget to check out the latest research at the Journal of Web Semantics FIRST LOOK, available at SSRN.
The 2017 Semantic Web Challenge (SWC) was organized by Dan Bennett (Thomson Reuters), Prof. Dr. Axel Ngonga (University of Paderborn) and Prof. Dr. Heiko Paulheim (University of Mannheim), and the finals were held at the 16th International Semantic Web Conference in Vienna, Austria, from October 21–25.
In 2017, the Semantic Web Challenge, the longest-running competition in the area, introduced a new format to measure scientific progress in the field. By now, concepts such as “Big Data” and “Knowledge Graphs” need little further explanation, so to measure and evaluate real progress, competing teams were asked to perform two important knowledge engineering tasks on the web:
- Fact extraction (knowledge graph population)
- Fact checking (knowledge graph validation)
To enhance reproducibility, competing teams' contributions were no longer judged by a panel of SWC experts. Instead, each team's performance was measured against that of the other teams and against the current state of the art, using a FAIR benchmarking platform.
The winner of the 2017 Semantic Web Challenge was IBM Socrates by Michael Glass, Nandana Mihindukulasooriya, Oktie Hassanzadeh, and Alfio Gliozzo of IBM Research AI.
"Socrates won both tasks by using an innovative integration of additional Artificial Intelligence techniques such as Natural Language Processing (NLP) and Deep Learning over multiple web sources to find and check facts. Their knowledge graph outperformed the state of the art as set by the baseline," said SWC Chairs Bennett, Dr. Ngonga and Dr. Paulheim.
The 2015 Semantic Web Challenge was organized by Sean Bechhofer of the University of Manchester, UK, and took place at the 14th International Semantic Web Conference, held in Bethlehem, Pennsylvania, from October 11–15. It required applications to provide practical value to web users or domain experts while making use of heterogeneous information sources under diverse ownership or control, with the meaning of data playing a central role.
The winner of the 2015 SWC open track was “3cixty: Building Comprehensive Knowledge Bases For City Exploration” by a large team from 11 institutions, led by Raphaël Troncy of EURECOM.
Their application gives users who are planning a visit to a city comprehensive and contextual access to a very wide range of diverse metropolitan data sources and websites, and was tested at the international Expo 2015 in Milan.
Two teams tied for second place:
“Towards a Sales Assistant using a Product Knowledge Graph” by Haklae Kim describes a system that helps users navigate the complex world of product specifications for electronic products.
- Kim, H., 2017. Towards a sales assistant using a product knowledge graph. Web Semantics: Science, Services and Agents on the World Wide Web, 46, pp.14-19.
The xLiMe system by Lei Zhang, Andreas Thalhammer, Achim Rettinger, Michael Färber, Aditya Mogadala and Ronald Denaux from AIFB provides multilingual search across different media channels.
- Zhang, L., Thalhammer, A., Rettinger, A., Färber, M., Mogadala, A. and Denaux, R., 2017. The xLiMe system: Cross-lingual and cross-modal semantic annotation, search and recommendation over live-TV, news and social media streams. Web Semantics: Science, Services and Agents on the World Wide Web, 46, pp.20-30.
The 2014 Semantic Web Challenge was organized by Sean Bechhofer of the University of Manchester, UK, and Andreas Harth of the Karlsruhe Institute of Technology, Germany, and the finals took place at the 13th International Semantic Web Conference held in Riva del Garda, Italy, from 19–23 October 2014. As in previous years, the challenge required that applications provide practical value to web users or domain experts. Systems should also make use of heterogeneous information sources under diverse ownership or control, and the meaning of data should play a central role.
The winner of the 2014 Semantic Web Challenge Open Track was “Mining the Web of Linked Data with RapidMiner” by Petar Ristoski, Christian Bizer and Heiko Paulheim of the Data and Web Science Group at the University of Mannheim, Germany. The application provides an extension to the RapidMiner system that allows data analysts to exploit Linked Data without needing to be familiar with the details of Semantic Web technologies.
Second prize in the Open Track went to “Enabling Live Exploration on The Graph of Things” by Danh Le Phuoc, Hoan Nguyen Mau Quoc, Hung Ngo Quoc, Tuan Tran Nhat and Manfred Hauswirth from the INSIGHT Centre for Data Analytics, National University of Ireland, Galway. This application pulls together real-time data from the Internet of Things, visualizing the resulting knowledge graph in a dashboard.
Third prize in the Open Track went to “DIVE into the Event-Based Browsing of Linked Historical Media” by Victor de Boer, Johan Oomen, Oana Inel, Lora Aroyo, Elco van Staveren, Werner Helmich and Dennis de Beurs from the Netherlands Institute for Sound and Vision, VU University Amsterdam, the Dutch National Library and Frontwise. This entry unlocks radio and television archives, providing innovative access through a well-designed and attractive user interface.
- De Boer, V., Oomen, J., Inel, O., Aroyo, L., Van Staveren, E., Helmich, W. and De Beurs, D., 2015. DIVE into the event-based browsing of linked historical media. Web Semantics: Science, Services and Agents on the World Wide Web, 35, pp.152-158.
The winner of the Semantic Web Challenge 2014 Big Data Track was another team from the Data and Web Science Group at the University of Mannheim, comprising Oliver Lehmberg, Dominique Ritze, Petar Ristoski, Kai Eckert, Heiko Paulheim and Christian Bizer, for “Extending Tables with Data from over a Million Websites”. Their system allows a user to search for and integrate data from a wide variety of tabular formats on an impressive scale.
The 2013 Semantic Web Challenge was organized by Sean Bechhofer of the University of Manchester, UK, and Andreas Harth of the Karlsruhe Institute of Technology, Germany, and took place at the 12th International Semantic Web Conference held in Sydney, Australia, from 23 to 25 October 2013.
As in previous years, the challenge required that applications provide practical value to web users or domain experts. Systems should also make use of heterogeneous information sources under diverse ownership or control, and the meaning of data should play a central role. To better communicate research on “Big Data”, the Billion Triples Track was revamped as the “Big Data Prize”.
In 2013, there were five systems selected as winners, four in the Open Track, with a special Big Data Prize being awarded to the best systems making use of large-scale data sets.
The winner of the 2013 SWC open track was “The BBC World Service Archive Prototype” by Yves Raimond and Tristan Ferne of the BBC. The system used a combination of audio processing, crowdsourcing, analytics and visualization to allow the BBC to link archive audio material with live broadcasts, and to access it during those broadcasts, in real time.
In second place was “Constitute: The World’s Constitutions to Read, Search and Compare” by Zachary Elkins, Tom Ginsburg, James Melton, Robert Shaffer, Juan F. Sequeda and Daniel Miranker from the University of Texas at Austin, The University of Chicago and University College London.
On average, around five nations draft a brand new constitution each year. Constitute provides a curated solution that lets people explore relevant and specific aspects of all the world’s constitutions, offering a valuable knowledge resource to those tasked with drafting a new constitution for their nation.
- Elkins, Z., Ginsburg, T., Melton, J., Shaffer, R., Sequeda, J.F. and Miranker, D.P., 2014. Constitute: The world’s constitutions to read, search, and compare. Web Semantics: Science, Services and Agents on the World Wide Web, 27, pp.10-18.
Third place was jointly awarded to:
“B-hist: Entity-Centric Search over Personal Web Browsing History” by Michele Catasta, Alberto Tonon, Vincent Pasquier, Gianluca Demartini, Philippe Cudré-Mauroux and Karl Aberer of EPFL and University of Fribourg.
- Catasta, M., Tonon, A., Demartini, G., Ranvier, J.E., Aberer, K. and Cudré-Mauroux, P., 2014. B-hist: Entity-centric search over personal web browsing history. Web Semantics: Science, Services and Agents on the World Wide Web, 27, pp.19-25.
“STAR-CITY: Semantic Traffic Analytics and Reasoning for CITY” from Freddy Lecue, Simone Tallevi-Diotallevi, Jer Hayes, Robert Tucker, Veli Bicer, Marco Luca Sbodio and Pierpaolo Tommasi of IBM Research’s Smart Cities Team.
STAR-CITY used Semantic Web technologies to enhance the analysis and presentation of heterogeneous real-time and historical data to better monitor traffic conditions. A prototype was trialled on traffic in Dublin, Ireland, and was also tested in Bologna, Italy; Miami, FL, USA; and Rio de Janeiro, Brazil.
- Lécué, F., Tallevi-Diotallevi, S., Hayes, J., Tucker, R., Bicer, V., Sbodio, M. and Tommasi, P., 2014. Smart traffic analytics in the semantic web with STAR-CITY: Scenarios, system and lessons learned in Dublin City. Web Semantics: Science, Services and Agents on the World Wide Web, 27, pp.26-33.
The 2013 SWC Big Data Prize went to Muhammad Saleem, Maulik R. Kamdar, Aftab Iqbal, Shanmukha Sampath, Helena F. Deus and Axel-Cyrille Ngonga Ngomo of Universität Leipzig, NUI Galway and Foundation Medicine Inc. for “Fostering Serendipity through Big Linked Data”.
Their system linked publications from PubMed with a significant subset of the Linked Cancer Genome Atlas.
Organized by Andreas Harth and Diana Maynard, the 2012 Semantic Web Challenge took place at the 11th International Semantic Web Conference held 13–15 November in Boston, USA, and consisted of two tracks: the Open Track and the Billion Triples Track.
The winner of the 2012 SWC Open Track was “Event Media” by Houda Khrouf, Vuk Milicic and Raphaël Troncy from EURECOM, Sophia Antipolis, France.
This system demonstrates how to use Semantic Web technology to more efficiently and easily integrate multiple online and social media content sources that evolve over time.
The runner-up of the Open Track was “Semantic Processing of Urban Data” (SPUD) by S. Kotoulas, V. Lopez, R. Lloyd, M. Sbodio, F. Lecue, M. Stephenson, E. Daly, V. Bicer, A. Gkoulalas-Divanis, G. Di Lorenzo, A. Schumann and P. Aonghusa from IBM Research’s Smart Cities Team.
The mayor of Dublin wanted to know why his ambulances were perennially late, so this team integrated and analysed knowledge from hundreds of information sources emanating from the city, ranging from the usual Twitter feeds to garbage collection tags and many more, to help the mayor improve smart city services.
- Kotoulas, S., Lopez, V., Lloyd, R., Sbodio, M.L., Lecue, F., Stephenson, M., Daly, E., Bicer, V., Gkoulalas-Divanis, A., Di Lorenzo, G. and Schumann, A., 2014. SPUD—Semantic processing of urban data. Web Semantics: Science, Services and Agents on the World Wide Web, 24, pp.11-17.
Two systems jointly won third place in the Open Track:
“Wildfire Monitoring” by K. Kyzirakos, M. Karpathiotakis, G. Garbis, C. Nikolaou, K. Bereta, I. Papoutsis, T. Herekakis, D. Michail, M. Koubarakis and C. Kontoes from the National and Kapodistrian University of Athens, National Observatory of Athens and the Harokopeio University of Athens.
The Wildfire Monitoring system combines multimedia satellite images with ontologies and Linked Geospatial Data to improve the wildfire monitoring service used by the Greek civil protection agencies, military, and firefighters.
- Kyzirakos, K., Karpathiotakis, M., Garbis, G., Nikolaou, C., Bereta, K., Papoutsis, I., Herekakis, T., Michail, D., Koubarakis, M. and Kontoes, C., 2014. Wildfire monitoring using satellite images, ontologies and linked geospatial data. Web Semantics: Science, Services and Agents on the World Wide Web, 24, pp.18-26.
“Open Self-Medication” by Olivier Curé of Université Paris-Est, LIGM. This advanced system advises end users on self-medication, using the Linked Open Data cloud to mine contraindications for various over-the-counter medications and to add them where they were missing. A mobile geolocation price comparison tool enables users to find nearby pharmacies that sell the cheapest drugs, enabling French health care insurance companies to reduce their costs by nudging consumers through AI-enabled web applications.
- Curé, O., 2014. On the design of a Self-Medication Web application built on Linked Open Data. Web Semantics: Science, Services and Agents on the World Wide Web, 24, pp.27-32.
The 2012 winner of the SWC Billion Triples Track was “Exploring the Linked Data Cloud” by X. Zhang, D. Song, S. Priya, Z. Daniels, K. Reynolds and J. Heflin of Lehigh University.
This tool allows users to understand how massive data sets are populated and reveals patterns within these data sets.
The 2011 Semantic Web Challenge was organized by Christian Bizer (then at Freie Universität Berlin, Berlin, Germany) and Diana Maynard (then at the University of Sheffield, UK) and consisted of two tracks: the Open Track and the Billion Triples Track. The final round took place at the 10th International Semantic Web Conference held in Bonn, Germany, from 23–27 October 2011.
The winner of the 2011 SWC Open Track was BOTTARI, an augmented reality mobile application to deliver personalized and location-based recommendations by the continuous analysis of social media streams.
The application was developed by the multidisciplinary team of Irene Celino, Daniele Dell’Aglio, Emanuele Della Valle, Marco Balduini, Yi Huang, Tony Lee, Seon-Ho Kim, and Volker Tresp from Politecnico di Milano, Italy, Siemens Corporate Technology in Munich, Germany, and the Korean company Saltlux.
The application combines local views with emotive analysis of live-streamed Twitter and blog posts to give the end user a novel augmented reality view of the local Korean dining scene.
- Balduini, M., Celino, I., Dell’Aglio, D., Della Valle, E., Huang, Y., Lee, T., Kim, S.H. and Tresp, V., 2012. BOTTARI: An augmented reality mobile application to deliver personalized and location-based recommendations by continuous analysis of social media streams. Web Semantics: Science, Services and Agents on the World Wide Web, 16, pp.33-41.
The winner of the 2011 SWC Billion Triples Track was “SchemEX: Efficient Construction of a Data Catalogue by Stream-based Indexing of Linked Data”, developed by the Institute for Web Science and Technologies (WeST) at the University of Koblenz-Landau in Germany.
They were awarded the prize for the design of a real-time index generation algorithm which they successfully applied to build a concise lookup index for the Billion Triples data set.
For their foreshadowing of the upcoming Internet of Things, the judges gave an additional honourable mention to “A Middleware Framework for Scalable Management of Linked Streams” by Danh Le-Phuoc, Hoan Nguyen Mau Quoc, Josiane Xavier Parreira, and Manfred Hauswirth of the Digital Enterprise Research Institute (DERI) at the National University of Ireland, Galway.
The middleware framework maps a very large volume of streaming real-world sensor data into a dynamic RDF model and provides an integrated live view of the data.
The 2010 Semantic Web Challenge was organized by Christian Bizer (then at Freie Universität Berlin, Berlin, Germany) and Diana Maynard (then at the University of Sheffield, UK), and the finals took place at the 9th International Semantic Web Conference held in Shanghai, China, from 7 to 11 November 2010. As in previous years, the challenge consisted of two tracks: the Open Track and the Billion Triples Track.
The winners of the 2010 SWC Open Track were the team from Stanford University, comprising Clement Jonquet, Paea LePendu, Sean M. Falconer, Adrien Coulet, Natalya F. Noy, Mark A. Musen, and Nigam H. Shah, for “NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources”.
Their entry provides very clear benefits to the biomedical community, bringing together knowledge from many different sources on the web with a large corpus of scientific literature through the clever application of Semantic Web technologies and principles.
The second prize in the Open Track was awarded to the team from Rensselaer Polytechnic Institute, comprising Dominic DiFranzo, Li Ding, John S. Erickson, Xian Li, Tim Lebo, James Michaelis, Alvaro Graves, Gregory Todd Williams, Jin Guang Zheng, Johanna Flores, Zhenning Shangguan, Gino Gervasio, Deborah L. McGuinness and Jim Hendler, for the development of “TWC LOGD: A Portal for Linking Open Government Data”, a massive semantic effort in opening up and linking public US government data and providing the ecosystem and education for using this data in different contexts.
- Ding, L., Lebo, T., Erickson, J.S., DiFranzo, D., Williams, G.T., Li, X., Michaelis, J., Graves, A., Zheng, J.G., Shangguan, Z. and Flores, J., 2011. TWC LOGD: A portal for linked open government data ecosystems. Web Semantics: Science, Services and Agents on the World Wide Web, 9(3), pp.325-333.
The third prize in the 2010 Open Track was won by a combined team from the Karlsruhe Institute of Technology, Oxford University and the University of Southern California, comprising Denny Vrandečić, Varun Ratnakar, Markus Krötzsch, and Yolanda Gil, for their entry “Shortipedia”, a Web-based knowledge repository and collaborative curating system that pulls together a growing number of sources in order to provide a comprehensive, multilingual and diversified view on entities of interest: a kind of “Wikipedia on steroids”.
- Vrandečić, D., Ratnakar, V., Krötzsch, M. and Gil, Y., 2011. Shortipedia aggregating and curating Semantic Web data. Web Semantics: Science, Services and Agents on the World Wide Web, 9(3), pp.334-338.
The 2010 SWC Billion Triples Track was won by “Creating voiD Descriptions for Web-scale Data” by Christoph Böhm, Johannes Lorey, Dandy Fenz, Eyk Kny, Matthias Pohl and Felix Naumann from the Hasso Plattner Institute in Potsdam, Germany. This entry uses state-of-the-art parallelization techniques, and some serious cloud computing power, to dissect the enormous Billion Triples dataset into topic-specific views.
The 2009 Semantic Web Challenge was organized by Christian Bizer (then at Freie Universität Berlin, Berlin, Germany) and Peter Mika (then at Yahoo! Research), and the finals took place at the 8th International Semantic Web Conference (ISWC 2009) near Washington, DC.
As in the previous year, the challenge consisted of two tracks: the Open Track and the Billion Triples Track. The Open Track requires that the applications utilize the semantics (meaning) of data and that they are designed to operate in an open Web environment, whilst the Billion Triples Track focuses on dealing with very large amounts of RDF data, which has been crawled from the Web and thus exhibits characteristics like vocabulary heterogeneity and varying data quality. For the Billion Triples Track, the organizers provided participants with an RDF data set consisting of 1.1 billion triples.
The winners of the 2009 SWC Open Track were Chintan Patel, Sharib Khan, and Karthik Gomadam from Applied Informatics, Inc. with their application TrialX.
TrialX enables finding new treatments by intelligently matching patients to clinical trials using advanced medical ontologies to combine several electronic health records with user generated information.
The second prize was awarded to Andreas Harth from the Institute of Applied Informatics and Formal Description Methods, Universität Karlsruhe, Germany for the Semantic Web search engine VisiNav. The application enables end-users to ask complex queries against a large corpus of Web data and offers an innovative user interface for the explorative formulation of queries.
- Harth, A., 2010. VisiNav: A system for visual search and navigation on web data. Web Semantics: Science, Services and Agents on the World Wide Web, 8(4), pp.348-354.
The third prize in the Open Track was awarded to Giovanni Tummarello, Richard Cyganiak, Michele Catasta, Szymon Danielczyk, and Stefan Decker from the Digital Enterprise Research Institute, Ireland, for the development of Sig.ma, a Semantic Web search engine which integrates and merges data about entities from a large, open set of Web data sources. A particularly innovative aspect of the application is the set of methods it provides to its users for dealing with the information quality challenges that arise in the open Web setting.
- Tummarello, G., Cyganiak, R., Catasta, M., Danielczyk, S., Delbru, R. and Decker, S., 2010. Sig.ma: Live views on the web of data. Web Semantics: Science, Services and Agents on the World Wide Web, 8(4), pp.355-364.
The 2009 SWC Billion Triples Track was won by Scalable Reduction by Gregory Todd Williams, Jesse Weaver, Medha Atre, and James Hendler from the Rensselaer Polytechnic Institute, USA.
The entry showed how massive parallelization can be applied to quickly clean and filter large amounts of RDF data.
The 2008 Semantic Web Challenge was organized by Peter Mika (then of Yahoo Research) and Jim Hendler (Rensselaer Polytechnic Institute, USA), and the finals took place at the 7th International Semantic Web Conference, held in Karlsruhe, Germany, from October 26 to 30, 2008.
To stimulate research into the scaling of Semantic Web technologies to deal with really Big Data, a second track was added to the challenge in 2008. This new Billion Triples Track required the participants to make use of a data set – consisting of about a billion triples – provided by the organizers. The goal of this challenge was not to be a benchmarking effort between triple stores, but rather to demonstrate applications that could scale to actual Web size using realistic Web-quality noisy data.
The winner of the 2008 SWC open track was Paggr, a system for creating Web widgets for linked data queries. Essentially, Paggr is an online tool that provides novel ways to manage and repurpose information on the web when that information is available in Semantic Web formats. Paggr allowed access to a number of different datasets and query types, making it a very flexible way to interact with linked data systems.
The second place winner of the 2008 SWC open track was DBpedia Mobile, an application for browsing DBpedia information via a mashup with GeoNames and other location data presented on a map (and available via a mobile client). DBpedia, as most readers know, is a central node in the growing linked data cloud, and this application showed the power of using this Wikipedia-based dataset in conjunction with various geospatial data.
- Becker, C. and Bizer, C., 2009. Exploring the geospatial semantic web with DBpedia Mobile. Web Semantics: Science, Services and Agents on the World Wide Web, 7(4), pp.278-286.
Third place in the 2008 Open track was taken by HealthFinland, a semantic web-based portal for managing health information. Using Semantic technologies, HealthFinland is able to customize health information found on the Web to the needs of a particular user. Thus, a younger person looking for information on, say, diet, would find different guidance than might a more senior citizen. A number of health ontologies are used in producing the semantics used in the system, and interfaces are provided both to the human user, via a Web-based frontend, and to a machine, via a widget-based service interface.
- Suominen, O., Hyvönen, E., Viljanen, K. and Hukka, E., 2009. HealthFinland—A national semantic publishing network and portal for health information. Web Semantics: Science, Services and Agents on the World Wide Web, 7(4), pp.287-297.
The winner of the 2008 SWC Billion Triples Track was SemaPlorer, a system that allowed users to interactively explore and visualize a very large subset of the Billion Triples semantic data set in real time. Its use case was to allow a user to learn about a city, tourist area, or other area of interest. By visualizing the data using a map, media, and different context views, the system goes beyond the simple storage and retrieval of large numbers of triples. SemaPlorer leveraged a number of the different semantic data sources in the Billion Triples data set, including DBpedia, GeoNames, WordNet, and personal FOAF files. It also connected with a large Flickr data set converted to RDF, thus extending the triples provided to make for a map-based browser of a very large dataset.
Third place in the competition went to Marvin, a system which used a novel method to generate a sound, but not necessarily complete, closure of the billion triples data. Marvin distributed the data to many processors, materialized locally, and then “reshuffled” the data among processors. In this way, the closure probabilistically approaches within epsilon of the complete closure, while making very efficient use of multiple processors in a cluster or other distributed machine.
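The distribute, materialize and reshuffle loop described for Marvin can be illustrated with a toy sketch. The Python below is not Marvin's implementation: it assumes a single inference rule (transitivity over plain pairs), simulates the processors as in-memory partitions, and uses hypothetical names (`local_closure`, `reshuffle_closure`) and parameters (`workers`, `rounds`, `seed`) chosen only for illustration.

```python
import random


def local_closure(pairs):
    """Transitive closure of (a, b) pairs, using only the facts in this partition."""
    closed = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                # Transitivity: (a, b) and (b, d) together imply (a, d).
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


def reshuffle_closure(pairs, workers=3, rounds=20, seed=0):
    """Approximate the closure by repeated random partitioning (illustrative only)."""
    rng = random.Random(seed)
    known = set(pairs)
    for _ in range(rounds):
        # Distribute: assign every known fact to a randomly chosen partition.
        parts = [set() for _ in range(workers)]
        for fact in sorted(known):
            parts[rng.randrange(workers)].add(fact)
        # Materialize locally, then pool the results before the next reshuffle.
        for part in parts:
            known |= local_closure(part)
    return known


if __name__ == "__main__":
    chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
    full = local_closure(chain)        # reference: closure computed in one place
    approx = reshuffle_closure(chain)  # closure reached by reshuffling
    print(len(approx), "of", len(full), "facts derived")
```

Because each partition only combines facts it actually holds, every derived pair belongs to the full closure (soundness), while a pair can be missed if its premises never land in the same partition within the chosen number of rounds (incompleteness).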