Resources for Language Technologies
-
German-Portuguese website parallel corpus from the Federal Foreign Office Berlin
German-Portuguese texts extracted from the website of the Federal Foreign Office Berlin. This includes 415 pairs that were translated between September 2013 and the beginning of December...
XML PDF ZIP (535 views) (427 Downloads)
-
Spanish-German website parallel corpus (Processed)
This is a parallel corpus of bilingual texts crawled from multilingual websites, containing 2,840 TUs. Crawling period: 15/11/2016 - 23/01/2017. A strict validation process...
ZIP (194 views) (126 Downloads)
-
Health Multilingual Terminologies
17 multilingual medical terminologies from Termcat in the following domains: - Anatomy (3610 terms; languages: es, en, ca) - Integrated care (75 terms; languages: es, en, ca) -...
XML PDF ZIP (739 views) (628 Downloads)
-
Parallel texts from Swedish Social Security Authority (Processed)
Parallel texts, email templates and forms in PDF format. The originals are in Swedish; all other texts are translations. One original with translations per folder. Language info is...
ZIP (349 views) (238 Downloads)
-
Parallel texts from Swedish Social Security Authority
Parallel texts, email templates and forms in PDF format. The originals are in Swedish; all other texts are translations. One original with translations per folder. Language info is...
ZIP (336 views) (233 Downloads)
-
Glossary City of Vienna (Processed)
English-German, German-English glossary of terms used by public officials working for the City of Vienna, as of 04/2016. This dataset has been created within the framework of the European...
ZIP (34 views) (13 Downloads)
-
Audioguide for the Military History Museum in Vienna
Translation of the audioguide for the Military History Museum in Vienna, created in March 2014. This dataset has been created within the framework of the European Language Resource...
ZIP (395 views) (274 Downloads)
-
Multilingual Public Procurement Terminology
An internal terminology developed by the Polish Public Procurement Office containing 1408 terms in 11 languages (English, Danish, Spanish, German, Greek, French, Italian, Portuguese,...
XML PDF ZIP (1046 views) (930 Downloads)
-
Parallel texts from Swedish Labour market agency. Part 2
Same as Part 1, but with the README file. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated...
ZIP (548 views) (435 Downloads)
-
The Vocabulary of Safety and Health at Work (TSK 35)
The Vocabulary of Safety and Health at Work (TSK 35) contains 465 concepts with Finnish term recommendations, definitions and notes. The equivalents are given in Swedish, English, German...
XML PDF ZIP (583 views) (478 Downloads)
-
BMI Brochures 2011-2015 (Processed)
English translations of German BMI brochures from the last four years, in TMX format. The TMX format has been corrected and the resulting file stripped. This dataset has been created...
ZIP (551 views) (443 Downloads)
-
BMI Brochures and Website 2016
Bilingual TMX file of German to English translations of the Federal Ministry of the Interior's website and brochures. Topics include terrorism, cyber security, asylum, cultural property,...
XML PDF ZIP (705 views) (602 Downloads)
-
German-English website parallel corpus from the Federal Foreign Office Berlin
German-English texts extracted from the website of the Federal Foreign Office Berlin. This includes 53,849 pairs that were translated between October 2013 and the beginning of November...
XML PDF ZIP (1379 views) (1177 Downloads)
-
Glossary City of Vienna
English-German, German-English glossary of terms used by public officials working for the City of Vienna, as of 04/2016. This dataset has been created within the framework of the European...
XML PDF ZIP (410 views) (300 Downloads)
-
SIP Publications
Publications from the Luxembourgish government edited by the Service information et presse (11538 translation units). This dataset has been created within the framework of the European...
XML PDF ZIP (478 views) (363 Downloads)
-
Avibase (Processed)
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action SMART...
ZIP (258 views) (152 Downloads)
-
SIP Dictionary of places and people (Luxembourg)
TMX file containing 1777 terms translated between English, French and German.
XML PDF ZIP (561 views) (471 Downloads)