-
DGT-Translation Memory
DGT-TM is a translation memory (sentences and their manually produced translations) in 24 languages. It contains segments from the Acquis Communautaire, the body of European legislation,...
PDF ZIP (45005 views) (4502 Downloads)
-
COVID-19 multilingual terminology in IATE
The dataset is a collection of multilingual entries related to the SARS-CoV-2 virus and the COVID-19 pandemic, available in IATE, the European Union terminology database. It is a...
Excel XLSX (1490 views) (122 Downloads)
-
IATE
IATE (= “Inter-Active Terminology for Europe”) is the EU's inter-institutional terminology database. IATE has been used by the language services of the EU institutions and agencies since...
HTML JavaScript ZIP (6456 views) (6082 Downloads)
-
National Health Fund Dataset (Processed)
The dataset is a 274K-token Polish-English parallel resource in XLIFF format created on the basis of the "Diagnosis-Related Groups in Europe" publication of the Polish National Health Fund....
ZIP (345 views) (231 Downloads)
-
DA-EN Danish Ministry of Higher Education and Science 2
Danish-English parallel texts from the Danish Ministry of Higher Education and Science, size: 115,000 words, topic: research policy. This dataset has been created within the framework of...
ZIP (333 views) (209 Downloads)
-
English-Slovak parallel corpus of texts from The Ministry of Culture of the Slovak Republic
Dataset of various English-Slovak legal texts within the agenda of the Ministry, in plain text format aligned at the sentence level; size: 105,791 words. This dataset has been created within...
ZIP (357 views) (249 Downloads)
-
Romanian – English literature corpus
Bilingual Romanian - English literature corpus built from a small set of freely available literature books (drama, sci-fi, etc.). The texts are positionally aligned, i.e. the sentence on...
ZIP (411 views) (321 Downloads)
-
English-Estonian corpus from Finnish Information Bank (Processed)
http://www.infopankki.fi - Finland in your language - Information about Finland - Moving to Finland - Living in Finland. This dataset has been created within the framework of the European...
ZIP (288 views) (186 Downloads)
-
English-Swedish corpus from Finnish Information Bank (Processed)
http://www.infopankki.fi - Finland in your language - Information about Finland - Moving to Finland - Living in Finland. This dataset has been created within the framework of the European...
ZIP (432 views) (327 Downloads)
-
English-Finnish corpus from Finnish Information Bank (Processed)
http://www.infopankki.fi - Finland in your language - Information about Finland - Moving to Finland - Living in Finland. This dataset has been created within the framework of the European...
ZIP (496 views) (378 Downloads)
-
English-Estonian corpus from Finnish Information Bank
http://www.infopankki.fi - Finland in your language - Information about Finland - Moving to Finland - Living in Finland. This dataset has been created within the framework of the European...
XML PDF ZIP (439 views) (337 Downloads)
-
English-Swedish corpus from Finnish Information Bank
http://www.infopankki.fi - Finland in your language - Information about Finland - Moving to Finland - Living in Finland. This dataset has been created within the framework of the European...
XML PDF ZIP (641 views) (524 Downloads)
-
English-Finnish corpus from Finnish Information Bank
http://www.infopankki.fi - Finland in your language - Information about Finland - Moving to Finland - Living in Finland. This dataset has been created within the framework of the European...
XML PDF ZIP (850 views) (724 Downloads)
-
English-Estonian Parallel corpus compiled from translated annual reports from Estonian Academy of Sciences
English-Estonian translated annual reports as source data for a parallel corpus, collected from the website of the Estonian Academy of Sciences (http://www.akadeemia.ee/). This dataset has...
ZIP (296 views) (204 Downloads)
-
Bilingual documents Bulgarian-English in the field of open data, broadband and information society (Processed)
English-Bulgarian collection in the field of open data, broadband, strategic document of the Information society in the Republic of Bulgaria. This dataset has been created within the...
ZIP (503 views) (389 Downloads)
-
English-Estonian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)
EASTIN-CL Multilingual Ontology of Assistive Technology was created within the EASTIN-CL project aimed at applying language technologies to portal of assistive technologies...
ZIP (534 views) (420 Downloads)
-
Polish-English parallel corpus from the website of the National Digital Archives (Processed)
Polish-English parallel corpus from the website of the National Digital Archives (https://www.nac.gov.pl). This dataset has been created within the framework of the European Language...
ZIP (412 views) (313 Downloads)
-
DA-EN Danish Ministry of Higher Education and Science
Danish-English parallel texts from the Danish Ministry of Higher Education and Science, size: 120,000 words, topic: innovation, science. This dataset has been created within the framework...
ZIP (453 views) (360 Downloads)
-
Parallel Global Voices (Bulgarian - English) (Processed)
Parallel Global Voices BG-EN is a parallel corpus generated from the Global Voices multilingual group of websites (http://globalvoices.org/), where volunteers publish and translate news...
ZIP (622 views) (520 Downloads)
-
Romanian-English corpus with studies, reports and statistical data in the field of culture from the National Institute for Cultural Research and Training website (Processed)
Romanian-English corpus with studies, reports and statistical data in the field of culture from the National Institute for Cultural Research and Training website. This dataset has been...
ZIP (362 views) (254 Downloads)