An overview of COVID-19 and the available open datasets on the European Data Portal and beyond
Coronavirus and COVID-19
COVID-19, the disease caused by a new coronavirus, was first reported in December 2019 in Wuhan City, China. Since the first diagnosed case in China, COVID-19 has spread rapidly across the world. As of 16 March 2020, COVID-19 cases have been detected in more than 140 countries in Asia, Australia, Europe, Africa, North America and South America. As of that date, Europe is the epicentre of COVID-19, with the largest number of confirmed cases in Italy. In most western countries, case numbers have been increasing by about 33% per day.
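To put the 33% daily increase in perspective, the short Python sketch below converts a constant daily growth rate into a doubling time. It assumes, purely for illustration, that growth stays constant from day to day, which real outbreaks do not do; the function names are our own.

```python
import math


def doubling_time_days(daily_growth_rate: float) -> float:
    """Days needed for case numbers to double at a constant daily growth rate.

    Assumption: growth compounds daily at a fixed rate (e.g. 0.33 for 33%).
    """
    if daily_growth_rate <= 0:
        raise ValueError("daily_growth_rate must be positive")
    return math.log(2) / math.log(1 + daily_growth_rate)


def projected_cases(initial_cases: float, daily_growth_rate: float, days: int) -> float:
    """Case count after `days` days of constant compound growth."""
    return initial_cases * (1 + daily_growth_rate) ** days


if __name__ == "__main__":
    # At 33% per day, cases double roughly every 2.4 days.
    print(f"Doubling time: {doubling_time_days(0.33):.1f} days")
    # The same rate multiplies case numbers by about 7.4 within one week.
    print(f"Growth factor after 7 days: {projected_cases(1, 0.33, 7):.1f}")
```

Under this simplified assumption, 33% per day means case numbers double roughly every two and a half days.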
This virus, like other coronaviruses, targets people’s respiratory system. COVID-19 ranges from mild symptoms to severe illness and, potentially, death. Symptoms include fever, cough, and shortness of breath, and may appear 2 to 14 days after exposure. Because its symptoms and severity vary so widely, the virus has been hard to track.
Open data and COVID-19
Since awareness of COVID-19 began growing across the world, more health datasets have been published as open data. (Re-)users can draw on them to create platforms and interactive maps, for example, that support citizens in taking steps to stay healthy, such as avoiding risk areas. Examples can be found on national open data portals and the websites of health ministries across Europe. The picture below shows the interactive map of COVID-19 cases published by the Dutch National Institute for Public Health and the Environment (RIVM). The download icon in the bottom right corner allows the user to download the image or the raw data behind it.
Figure 1: COVID-19 mapped in the Netherlands by RIVM
For new readers: the European Data Portal (EDP) acts as a single point of access to open data published by national open data portals and institutions in EU Member States and additional countries. At the time of writing, there are 64 datasets on the EDP that reference “covid” or “corona”, and very likely many others are relevant to researchers seeking to understand the current situation better, such as datasets describing infections, epidemics or pandemics. The search results are shown in the image below.
Figure 2: Open datasets on COVID-19 in Europe
Examples of these datasets are:
- “Key figures concerning the COVID-19 epidemic in France” published by the French national open data portal (data.gouv.fr).
- “Confirmed cases of COVID-19 infection by region” published by the French national open data portal (data.gouv.fr).
- “COVID-19 Coronavirus data” – a dataset that provides surveillance and disease data for COVID-19 Coronavirus worldwide published by the European Union Open Data Portal (data.europa.eu).
In addition, a very relevant example of emergency-driven open data publishing is the effort of the Italian “Dipartimento per la Protezione Civile” (Civil Protection Department). The department has shared all of its COVID-19 data on GitHub, including national trends, provincial data, regional data, summary cards, and areas. These datasets are already open for others to build on, for example by adding APIs and English translations.
This publication demonstrates, indirectly, how important it is for data to be not only available but also discoverable. Without taking anything away from the merit of such a valuable initiative, it was likely published in a rush to support the emergency. At the time of writing, the dataset unfortunately cannot be found on the Italian national data portal and, as a consequence, is not listed on the EDP either, as we depend on publication on the national portals. Nor can the GitHub repository be found through mainstream web search engines such as Google. Some steps therefore still need to be taken to aid the discoverability of datasets.
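For readers who do obtain a CSV file from a repository like the one described above, a minimal sketch of a first look at the data might be the following. It computes the day-over-day growth in cumulative cases. The column names `data` and `totale_casi` are assumptions about the file layout, and the sample rows are made-up placeholders rather than real figures.

```python
import csv
import io

# Made-up placeholder rows, NOT real figures. The column names are assumed.
SAMPLE_CSV = """data,totale_casi
2020-03-01,100
2020-03-02,130
2020-03-03,169
"""


def daily_growth(csv_text, date_col="data", total_col="totale_casi"):
    """Return (date, growth_rate) pairs comparing each day with the previous one."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    growth = []
    for prev, curr in zip(rows, rows[1:]):
        prev_total = int(prev[total_col])
        curr_total = int(curr[total_col])
        if prev_total == 0:
            # Skip days where a growth rate is undefined.
            continue
        growth.append((curr[date_col], (curr_total - prev_total) / prev_total))
    return growth


if __name__ == "__main__":
    for day, rate in daily_growth(SAMPLE_CSV):
        print(f"{day}: {rate:.0%}")
```

To use it on a downloaded file, read the file's contents into a string and pass it to `daily_growth`, adjusting the column names to match the actual header row.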
Finally, it is important to highlight that not everybody is qualified to interpret such specialist data correctly. In the past, the term “armchair auditor” has been used in a derogatory way to point at the risk of open data being used unwittingly to run analyses or draw conclusions on phenomena that go beyond the skills of the researcher. This risk applies to COVID-19. As much as the work of the EDP and of the other data teams in the civil service will keep empowering you, please do not discard expert advice and the mainstream official news channels for information on the medical emergency.
The EDP team takes the opportunity to invite our followers to be safe and to follow recommended safety guidelines.
For more information or examples on COVID-19, explore the EDP’s news archive and feature highlights section. Aware of more open data examples around the virus? Share them with us via mail and follow us on Twitter, Facebook or LinkedIn to stay up to date!