European data
data.europa.eu
The official portal for European data

What is analysed by the Metadata Quality Assurance tool?

The datasets stored in the portal need to be of an appropriate quality in terms of: 

  • DCAT-AP-compliant mapping
  • Available distributions
  • Usage of machine-readable distribution formats
  • Usage of known open-source licences.

The Metadata Quality Assurance (MQA) tool was developed to check datasets against these quality indicators. The MQA runs periodically, in parallel with harvesting. The harvesting process fills CKAN and Virtuoso with metadata. Because CKAN cannot store DCAT-AP-formatted datasets directly, the datasets are mapped into a DCAT-AP-compliant JSON (JavaScript Object Notation) schema. The MQA uses this schema to check each dataset's DCAT-AP mapping compliance. If any compliance issue is detected, for instance a missing mandatory field, the dataset is considered not DCAT-AP compliant.
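The mandatory-field check described above can be sketched in a few lines of Python. This is only an illustration, not the MQA's actual code: the JSON key names (`title`, `description`, `distributions`, `access_url`) and both function names are assumptions, because the article does not show the real schema.

```python
from __future__ import annotations

# Hypothetical key names: the real MQA JSON schema is not shown in this article.
MANDATORY_DATASET_FIELDS = ("title", "description")
MANDATORY_DISTRIBUTION_FIELDS = ("access_url",)


def find_violations(dataset: dict) -> list[str]:
    """Return one message per missing or empty mandatory field."""
    violations = []
    for field in MANDATORY_DATASET_FIELDS:
        if not dataset.get(field):
            violations.append(f"dataset: missing '{field}'")
    # Each distribution attached to the dataset is checked as well.
    for index, distribution in enumerate(dataset.get("distributions", [])):
        for field in MANDATORY_DISTRIBUTION_FIELDS:
            if not distribution.get(field):
                violations.append(f"distribution[{index}]: missing '{field}'")
    return violations


def is_compliant(dataset: dict) -> bool:
    """A dataset counts as compliant only if no violation was found."""
    return not find_violations(dataset)
```

In this sketch a single missing field is enough to mark the dataset as non-compliant, which mirrors the rule stated above. The list of messages is the kind of input a "top violation occurrences" statistic could be built from.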

The MQA presents its results in two views. 

  • The landing page or ‘Global Dashboard’. This view shows aggregated results for the entire service, i.e. the quality details for all catalogues.
  • The second view or ‘Catalogue Dashboard’. This view allows you to select a specific catalogue for which you want to display the quality details.

The current quality indicators include the following. 

  1. Distribution statistics: 

    1. accessible distributions,
    2. error status codes,
    3. download URL existence,
    4. top 20 catalogues with most accessible distributions,
    5. ratio of machine-readable datasets,
    6. most-used distribution formats,
    7. top 20 catalogues mostly using common machine-readable datasets.
  2. Dataset compliance statistics:
    1. top violation occurrences, 
    2. compliant datasets, 
    3. top 20 catalogues with most DCAT-AP-compliant datasets. 
  3. Dataset licence usage: 
    1. ratio of known to unknown licences, 
    2. most used licences, 
    3. top 20 catalogues with most datasets with known licences.
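
To make a few of these indicators concrete, the following Python sketch computes them for a list of dataset records. It is a rough illustration under stated assumptions, not the MQA's implementation: the record layout (`licence`, `distributions`, `format`, `status_code`), the two reference sets and the function name `summarise` are all hypothetical, and treating status codes from 200 to 399 as "accessible" is a simplification chosen for the example.

```python
from collections import Counter

# Illustrative reference sets only; the lists the MQA really uses are not given here.
MACHINE_READABLE_FORMATS = {"CSV", "JSON", "XML", "RDF"}
KNOWN_LICENCES = {"CC-BY-4.0", "CC0-1.0"}


def _ratio(part, total):
    """Avoid division by zero when there is nothing to count."""
    return part / total if total else 0.0


def summarise(datasets):
    """Aggregate a few quality indicators over dataset records (hypothetical layout)."""
    distributions = [d for ds in datasets for d in ds.get("distributions", [])]
    codes = [d.get("status_code") for d in distributions]

    # Assumption: any 2xx or 3xx response counts as an accessible distribution.
    accessible = sum(1 for c in codes if isinstance(c, int) and 200 <= c < 400)
    errors = Counter(c for c in codes if isinstance(c, int) and c >= 400)

    formats = Counter((d.get("format") or "UNKNOWN").upper() for d in distributions)

    # A dataset counts as machine-readable if at least one distribution is.
    machine_readable = sum(
        1
        for ds in datasets
        if any(
            (d.get("format") or "").upper() in MACHINE_READABLE_FORMATS
            for d in ds.get("distributions", [])
        )
    )
    known_licences = sum(1 for ds in datasets if ds.get("licence") in KNOWN_LICENCES)

    return {
        "accessible_ratio": _ratio(accessible, len(distributions)),
        "error_status_codes": errors,
        "machine_readable_ratio": _ratio(machine_readable, len(datasets)),
        "most_used_formats": formats.most_common(5),
        "known_licence_ratio": _ratio(known_licences, len(datasets)),
    }
```

Under these assumptions, passing the datasets of every catalogue to `summarise` would correspond to the Global Dashboard, and passing only one catalogue's datasets to the Catalogue Dashboard.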