Explore the insights from our second webinar for data providers on technical openness
On Friday 21 October, the webinar ‘Understanding open data: technical openness’ took place. This webinar was the second in a series of training sessions organised by the data.europa.eu academy to support data providers in the data publishing process. Specifically, this webinar focused on the need for open data to be technically open, i.e., freely accessible and available in non-proprietary and machine-readable formats.
From a technical perspective, openness can also be described in degrees, following Tim Berners-Lee’s 5-star model: the first star means that data is available on the web under an open licence, and the fifth means that data is linked to other data. A good level to start with is three stars, which means making data available in a non-proprietary format. Appropriate formats are CSV, JSON, XML, and RDF, while PDF, DOCX, ODT, PNG, GIF, JPG/JPEG, TIFF, DOC, and XLS should rather be avoided.
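As an illustration, the format lists above can be turned into a simple pre-publication check. The following is a minimal sketch, not a tool presented in the webinar: the function name `classify_format` and the three labels are hypothetical, and the check looks only at the file extension, not at the file contents.

```python
from pathlib import PurePosixPath

# Formats listed above as appropriate (non-proprietary).
RECOMMENDED = {"csv", "json", "xml", "rdf"}
# Formats listed above as ones that should rather be avoided.
AVOID = {"pdf", "docx", "odt", "png", "gif", "jpg", "jpeg", "tiff", "doc", "xls"}


def classify_format(filename: str) -> str:
    """Classify a distribution file by its extension (hypothetical helper)."""
    ext = PurePosixPath(filename).suffix.lstrip(".").lower()
    if ext in RECOMMENDED:
        return "recommended"
    if ext in AVOID:
        return "avoid"
    # Anything not named in the text is left undecided rather than guessed.
    return "unknown"


if __name__ == "__main__":
    for name in ["budget.CSV", "report.pdf", "data.parquet"]:
        print(name, "->", classify_format(name))
```

A check like this could run before upload to flag distributions that are offered only in a format from the second list.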
To overcome low technical openness, which is often due to a lack of time and/or knowledge on the part of data publishers, it is essential to introduce data management in the organisation. From a strategic perspective, this means:
- Creating a clear and functioning data governance framework that touches upon data quality, maintenance, privacy, and compliance;
- Defining concrete tasks, responsibilities, and roles;
- Building competences among employees, starting from the recommendations for delivering high-quality data contained in the Data Quality Guidelines by the Publications Office of the European Union.
At the operational level, data needs to be carefully prepared before publication. This means using an interactive and agile process to explore, combine, clean, and transform raw data into curated, high-quality datasets.
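The clean-and-transform part of such a process might look like the sketch below. It is only an illustration under stated assumptions: the column names `municipality` and `population`, the sample values, and the function name `prepare` are all made up for the example, and real datasets will need their own rules.

```python
import csv
import io

FIELDS = ["municipality", "population"]  # hypothetical column names


def prepare(raw_csv: str) -> str:
    """Clean a small raw CSV export and return curated CSV text."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    seen = set()
    cleaned = []
    for row in reader:
        # Clean: trim stray whitespace around every value.
        name = (row["municipality"] or "").strip()
        population = (row["population"] or "").strip()
        # Clean: drop rows that lack the identifying field.
        if not name:
            continue
        # Transform: "1 200" (space as thousands separator) -> 1200.
        population = int(population.replace(" ", ""))
        # Clean: skip exact duplicates.
        key = (name, population)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"municipality": name, "population": population})

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(cleaned)
    return out.getvalue()


if __name__ == "__main__":
    raw = "municipality,population\n Alpha ,1 200\nBeta,350\nBeta,350\n,100\n"
    print(prepare(raw))
```

With the made-up input shown, the whitespace is trimmed, the duplicate `Beta` row and the row without a municipality are dropped, and the population values become plain integers.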
Complementing these tips, Jakub Klímek (external expert at the Ministry of the Interior of the Czech Republic) shared his lessons learned regarding the technical and cross-cutting layers of open data publication, including:
- Motivation to publish - ‘Good open data publication is rewarded by usage of the data’, which produces further benefits for all;
- Distribution design (i.e., bulk download, API etc.), taking into consideration that ‘there is no single best format for every case, so there may be a need for multiple formats’;
- Web technologies used, such as HTTPS, IPv4 and IPv6, CORS etc.;
- Cataloguing, in order to make the catalogue readable and harvestable by other data catalogues as well;
- Processing the feedback of both ‘power users’ (more expert and engaged data consumers) and novices or potential users;
- Evaluating the data after publication.
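Two of the points above, offering multiple formats and enabling CORS, can be sketched together. This is a minimal illustration rather than the approach shown in the webinar: the function `build_response`, the sample records, and the choice of a wildcard `Access-Control-Allow-Origin` value are assumptions made for the example.

```python
import csv
import io
import json

# Made-up sample records, used only for illustration.
RECORDS = [
    {"municipality": "Alpha", "population": 1200},
    {"municipality": "Beta", "population": 350},
]


def to_csv(records):
    """Serialise a list of dicts to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def build_response(records, fmt):
    """Return (headers, body) for the same data in the requested format."""
    if fmt == "csv":
        content_type, body = "text/csv; charset=utf-8", to_csv(records)
    elif fmt == "json":
        content_type = "application/json"
        body = json.dumps(records, ensure_ascii=False)
    else:
        raise ValueError(f"unsupported format: {fmt}")
    headers = {
        "Content-Type": content_type,
        # Assumption: the data is fully open, so any origin may read it
        # from browser-based applications.
        "Access-Control-Allow-Origin": "*",
    }
    return headers, body


if __name__ == "__main__":
    for fmt in ("csv", "json"):
        headers, body = build_response(RECORDS, fmt)
        print(headers["Content-Type"])
        print(body)
```

Keeping one source of records and deriving each distribution from it avoids the formats drifting apart over time.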
Throughout the webinar, the audience had the chance to ask questions and interact with the speakers. If you are also curious to know which kind of data is ‘for the general public’, whether content negotiation is recommended, and which data formats best support the technical openness of image data, watch the recording and consult the speakers’ slides in the ‘Learning corner for data providers’.