-
DGT-Translation Memory
DGT-TM is a translation memory (sentences and their manually produced translations) in 24 languages. It contains segments from the Acquis Communautaire, the body of European legislation,...
ZIP (45005 views) (4502 Downloads)
-
EuroVoc
EuroVoc is a multilingual, multidisciplinary thesaurus covering the activities of the EU. It contains terms in 24 EU languages (Bulgarian, Croatian, Czech, Danish, Dutch,...
XML HTML RDF XML ZIP (37050 views) (319 Downloads)
-
[DEPRECATED] Official Journals of the European Union (English)
This dataset has been deprecated and is now replaced by the following datasets: Official Journals of the European Union 2021, Official Journals of the European Union 2020...
PDF HTML Formex 4 ZIP Excel XLS (4713 views) (7 Downloads)
-
[DEPRECATED] Official Journals of the European Union (Romanian)
This dataset has been deprecated and is now replaced by the following datasets: Official Journals of the European Union 2021, Official Journals of the European Union 2020...
PDF HTML Formex 4 ZIP Excel XLS (1616 views) (1465 Downloads)
-
Romanian – English parallel wordlists (Processed)
English and Romanian lemmatized wordlists extracted from various resources (including RO-EN Wordnets, the Romanian – English news corpus, the Romanian – English literature corpus, and...
ZIP (297 views) (198 Downloads)
-
EIR Romanian-English TM (ECHR-33234/12) (Processed)
Converted ECHR translation memory EN-RO (CASE OF AL NASHIRI v. ROMANIA - Application no. 33234/12). This dataset has been created within the framework of the European Language Resource...
ZIP (276 views) (169 Downloads)
-
EIR Romanian-English Newsletter (2009-March 2011) (Processed)
Translation units were extracted from a collection of 392 files (386 Word and 6 Excel files) in the domain of European affairs (EIR’s 4 main key areas: studies, training, translation...
ZIP (280 views) (180 Downloads)
-
Parallel Global Voices (English - Romanian) (Processed)
Parallel Global Voices EN-RO is a parallel corpus generated from the Global Voices multilingual group of websites (http://globalvoices.org/), where volunteers publish and translate news...
ZIP (280 views) (181 Downloads)
-
Monolingual Romanian corpus in the public administration domain (Processed)
Monolingual Romanian corpus, containing 360833 sentences (9064764 words) in the public administration domain. This dataset has been created within the framework of the European Language...
ZIP (295 views) (186 Downloads)
-
Romanian Parliament Transcripts 1996-2018 (Processed)
The data was obtained from the cdep.ro website and contains 500k+ instances of speech from the parliament podium from 1996 to 2018. Sentence splitting and deduplication at sentence level have...
ZIP (210 views) (127 Downloads)
-
EUIPO - Trade mark Guidelines (October 2017) (English-Romanian) (Processed)
The EUIPO Guidelines are the main point of reference for users of the European Union trade mark system and professional advisers who want to make sure they have the latest information on...
ZIP (239 views) (141 Downloads)
-
EIR terminology (banking) (RO-EN) (Processed)
Banking terms (RO, EN). This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation...
ZIP (261 views) (173 Downloads)
-
EIR terminology (legal) (RO-EN) (Processed)
Legal terminology (CJUE: legal glossary and entries extracted from the Treaty of Lisbon; RO, EN). This dataset has been created within the framework of the European Language...
ZIP (191 views) (121 Downloads)
-
EIR Romanian-English SPOS (2011-2017) (Processed)
Translation units were extracted from 18 Word files (9 Romanian and 9 English) in the field of European Affairs - Strategy and Policy Studies (SPOS); 101 849 words (in Romanian). This...
ZIP (251 views) (150 Downloads)
-
Rural Development Programme of Romania (Processed)
Rural Development Programme of Romania, available at http://madr.ro (Ministry of Agriculture and Rural Development). This dataset has been created within the framework of the European...
ZIP (157 views) (95 Downloads)
-
Monolingual Romanian corpus in the culture domain (Processed)
Monolingual Romanian corpus, including content from public websites related to culture. This dataset has been created within the framework of the European Language Resource Coordination...
ZIP (314 views) (209 Downloads)
-
Monolingual Romanian corpus in the law domain (Processed)
Monolingual Romanian (ron) corpus, including content from public websites related to law and justice. This dataset has been created within the framework of the European Language Resource Coordination...
ZIP (158 views) (105 Downloads)
-
Romanian New Civil Procedure Code (Processed)
The New Civil Procedure Code in Romanian (monolingual) comprising 297888 words. This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
ZIP (280 views) (171 Downloads)
-
General Romanian-English bilingual corpus
Romanian – English corpus built from a Wikipedia dump. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility...
ZIP (504 views) (385 Downloads)
-
Monolingual Romanian corpus in the public administration domain
Monolingual Romanian corpus, containing 12037387 tokens and 1176117 lexical types in the public administration domain. This dataset has been created within the framework of the European...
ZIP (611 views) (495 Downloads)