-
The UCD Bórd na Gaeilge Corpus of bilingual PDFs and Word documents
Parallel data provided by the language office at UCD (University College Dublin). Size: 3 Word documents, 67 PDFs. This dataset has been created within the framework of the European...
ZIP (294 views) (196 downloads)
-
Parallel corpus from Parliament of Estonia
Parallel corpus compiled from the contents of the website of the Parliament of Estonia. This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
ZIP (407 views) (307 downloads)
-
Belgian parallel corpus about Belgium and the justice system
An automatically aligned parallel corpus of well-translated Belgian texts in Dutch and French. The corpus contains texts about Belgium and the Belgian justice system, with over 100.000...
ZIP (596 views) (488 downloads)
-
Corpus of Icelandic texts from the Central Bank of Iceland
Corpus of Icelandic texts from the Central Bank of Iceland. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe...
ZIP (440 views) (329 downloads)
-
Croatian monolingual corpus of the Official journal of the Republic of Croatia
The Croatian monolingual corpus of the Official journal of the Republic of Croatia is formatted as a verticalized corpus with a line structure that resembles the simplified CoNLL...
ZIP (360 views) (273 downloads)
-
Monolingual Polish corpus in the public administration domain (Processed)
Monolingual Polish corpus, containing 22372690 tokens and 1805280 lexical types in the public administration domain. This dataset has been created within the framework of the European...
ZIP (212 views) (115 downloads)
-
OROSSIMO Corpus - Computer Science
A corpus of academic discourse texts belonging to the Computer Science domain (according to the Dewey Decimal classification, DDC00 - Computer science, knowledge & systems), annotated...
ZIP (368 views) (256 downloads)
-
The Coimisineir Teanga Bilingual Web Corpus
Web content from the Language Commissioner's Office. Two TXT files containing 6808 words of parallel data. This dataset has been created within the framework of the European Language...
ZIP (268 views) (180 downloads)
-
Czech Banking Association Terminology
Terms in Czech - English relating to finance. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility -...
XML PDF ZIP (578 views) (488 downloads)
-
Corpus on Finance and Economics from Bank of Latvia (Processed)
Contents of the websites https://makroekonomika.lv/ (Latvian) and https://www.macroeconomics.lv/ (English), aligned as a parallel corpus. This dataset has been created within the...
ZIP (285 views) (184 downloads)
-
2015 Calls for Tenders for Translation
Contains monolingual Netherlands Dutch texts with the 2015 calls for tenders for translation work for the child welfare office, the office for prisoner rehabilitation and for the ministry...
ZIP (562 views) (458 downloads)
-
Polish Ministry of Foreign Affairs Regional Dataset
A collection of Polish-English whitepapers published by the Polish Ministry of Foreign Affairs, including "Eastern Partnership" (10K words in 492 segments) and "Poland's 10 years in the...
XML PDF ZIP (547 views) (437 downloads)
-
The Terminological Vocabulary of Kela – Benefit-related Concepts, 4th edition (TSK 49)
The Terminological Vocabulary of Kela – Benefit-related Concepts, 4th edition (TSK 49) contains information on more than 500 concepts in term records and concept diagrams. The concepts...
XML PDF ZIP (511 views) (421 downloads)
-
Documents concerning Federal Constitutional Law in Austria
Alignment documents concerning Austrian Federal Constitutional Law. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting...
ZIP (458 views) (360 downloads)
-
Guidelines - Judicial maps in Bulgarian
Guidelines on the establishment of judicial mapping, in Bulgarian. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe...
ZIP (283 views) (189 downloads)
-
English-Danish EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)
The EASTIN-CL Multilingual Ontology of Assistive Technology was created within the EASTIN-CL project, aimed at applying language technologies to a portal of assistive technologies...
ZIP (424 views) (316 downloads)
-
Translation of the Luxembourg.lu web site
Translation of the Luxembourg.lu web site, consisting of 90293 translation units in French, German and English. This dataset has been created within the framework of the European Language...
XML PDF ZIP (389 views) (290 downloads)
-
DA-EN Danish Ministry of Higher Education and Science 3
Danish-English parallel texts from the Danish Ministry of Higher Education and Science. Size: 110,000 words. Topic: research policy. This dataset has been created within the framework of...
ZIP (366 views) (264 downloads)
-
Secretariat-General parallel corpus SL-EN and EN-SL (part 2)
English-Slovenian parallel corpus in TMX format from the Secretariat-General of the Government of the Republic of Slovenia, in the legal domain. This dataset has been created within the...
XML PDF ZIP (156 views) (128 downloads)
-
Convention against Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment - United Nations (French-English-Greek)
English text of the Convention against Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment (United Nations) and the ratifying bilingual (French - Greek) Greek law...
ZIP (408 views) (300 downloads)