European data
data.europa.eu
The official portal for European data

What is analysed by the Metadata Quality Assurance tool?

The datasets stored in the portal need to be of an appropriate quality in terms of: 

  • DCAT-AP-compliant mapping
  • Available distributions
  • Usage of machine-readable distribution formats
  • Usage of known open-source licences.

To check datasets against these quality indicators, the Metadata Quality Assurance (MQA) tool was developed. The MQA runs as a periodic process in parallel with harvesting. The harvesting process fills CKAN and Virtuoso with metadata. Because CKAN cannot store DCAT-AP-formatted datasets directly, the datasets are mapped into a DCAT-AP-compliant JSON (JavaScript Object Notation) schema. The MQA uses this schema to check each dataset's DCAT-AP mapping compliance. If any compliance issue is detected, for instance a missing mandatory field, the dataset is considered not DCAT-AP compliant.
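
The compliance rule described above can be sketched in a few lines of Python. This is only an illustration, not the MQA's actual implementation: the field names `title` and `description` and the helpers `find_violations` and `is_compliant` are hypothetical, since the article does not list the schema's mandatory fields.

```python
# Hypothetical mandatory keys; the real JSON schema's field names
# are not given in this article.
MANDATORY_FIELDS = ("title", "description")


def find_violations(dataset: dict) -> list:
    """Return the mandatory fields that are missing or empty in a dataset record."""
    violations = []
    for field in MANDATORY_FIELDS:
        value = dataset.get(field)
        # A field counts as missing if it is absent, None, or a blank string.
        if value is None or (isinstance(value, str) and not value.strip()):
            violations.append(field)
    return violations


def is_compliant(dataset: dict) -> bool:
    """A dataset is treated as compliant only if no mandatory field is missing."""
    return not find_violations(dataset)


example = {"title": "Air quality measurements", "description": ""}
print(find_violations(example))
print(is_compliant(example))
```

Collecting the violated fields, rather than returning only a yes/no answer, also makes it possible to count which violations occur most often.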

The MQA presents its results in two views. 

  • The landing page or ‘Global Dashboard’. This view shows aggregated results for the entire service, i.e. the quality details for all catalogues.
  • The second view or ‘Catalogue Dashboard’. This view allows you to select a specific catalogue for which you want to display the quality details.
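
The difference between the two views can be illustrated with a small aggregation sketch. The record layout (`catalogue`, `compliant`) and the function names are assumptions made for this example; they are not taken from the MQA itself.

```python
from collections import defaultdict


def compliance_ratio(records):
    """Share of records flagged as compliant; 0.0 for an empty input."""
    records = list(records)
    if not records:
        return 0.0
    return sum(1 for r in records if r["compliant"]) / len(records)


def global_view(records):
    """One aggregated figure across all catalogues."""
    return compliance_ratio(records)


def catalogue_view(records):
    """The same figure, computed separately for each catalogue."""
    by_catalogue = defaultdict(list)
    for record in records:
        by_catalogue[record["catalogue"]].append(record)
    return {name: compliance_ratio(group) for name, group in by_catalogue.items()}


# Made-up sample records, for illustration only.
records = [
    {"catalogue": "catalogue-a", "compliant": True},
    {"catalogue": "catalogue-a", "compliant": False},
    {"catalogue": "catalogue-b", "compliant": True},
]
print(global_view(records))
print(catalogue_view(records))
```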

The current quality indicators include the following. 

  1. Distribution statistics:
    1. accessible distributions,
    2. error status codes,
    3. download URL existence,
    4. top 20 catalogues with most accessible distributions,
    5. ratio of machine-readable datasets,
    6. most-used distribution formats,
    7. top 20 catalogues mostly using common machine-readable datasets.
  2. Dataset compliance statistics:
    1. top violation occurrences, 
    2. compliant datasets, 
    3. top 20 catalogues with most DCAT-AP-compliant datasets. 
  3. Dataset licence usage: 
    1. ratio of known to unknown licences, 
    2. most-used licences,
    3. top 20 catalogues with most datasets with known licences.
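
A few of the indicators listed above could be computed along the following lines. This is a minimal sketch under stated assumptions: the record keys, the set of machine-readable formats, the set of known licences, and the rule that 2xx and 3xx status codes count as accessible are all hypothetical choices for illustration, not the MQA's actual definitions.

```python
from collections import Counter

# Assumed reference sets; the MQA's real lists are not given in this article.
MACHINE_READABLE = {"csv", "json", "xml", "rdf"}
KNOWN_LICENCES = {"cc-by-4.0", "cc0-1.0"}


def is_accessible(status_code):
    """Treat 2xx and 3xx responses as accessible (an assumption)."""
    return status_code is not None and 200 <= status_code < 400


def distribution_stats(distributions):
    """Summarise accessibility, error codes, download URLs and formats."""
    formats = Counter((d.get("format") or "unknown").lower() for d in distributions)
    machine_readable = sum(n for fmt, n in formats.items() if fmt in MACHINE_READABLE)
    total = len(distributions)
    return {
        "accessible": sum(1 for d in distributions if is_accessible(d.get("status_code"))),
        "error_codes": Counter(
            d["status_code"]
            for d in distributions
            if d.get("status_code") is not None and not is_accessible(d["status_code"])
        ),
        "with_download_url": sum(1 for d in distributions if d.get("download_url")),
        "formats": formats,
        "machine_readable_share": machine_readable / total if total else 0.0,
    }


def licence_counts(licences):
    """Count known versus unknown licence identifiers."""
    known = sum(1 for lic in licences if lic in KNOWN_LICENCES)
    return {"known": known, "unknown": len(licences) - known}


# Made-up sample data, for illustration only.
distributions = [
    {"format": "CSV", "status_code": 200, "download_url": "https://example.org/a.csv"},
    {"format": "PDF", "status_code": 404, "download_url": None},
    {"format": "JSON", "status_code": 200, "download_url": "https://example.org/b.json"},
]
stats = distribution_stats(distributions)
print(stats["accessible"], dict(stats["error_codes"]), stats["formats"].most_common(20))
print(licence_counts(["cc-by-4.0", None, "custom"]))
```

`Counter.most_common(20)` returns at most 20 entries, which matches the "top 20" style of ranking used in the indicator list.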