-
EuroVoc
EuroVoc is a multilingual, multidisciplinary thesaurus covering the activities of the EU. It contains terms in 24 EU languages (Bulgarian, Croatian, Czech, Danish, Dutch,...
XML HTML RDF XML ZIP (37050 views) (319 Downloads)
-
DGT-Translation Memory
DGT-TM is a translation memory (sentences and their manually produced translations) in 24 languages. It contains segments from the Acquis Communautaire, the body of European legislation,...
PDF ZIP (45005 views) (4502 Downloads)
-
Agricultural and Vegetable Catalogue
The seed of varieties of agricultural and plant species and varieties of vegetable species that are published in the EU level Common Catalogue is subject to no marketing restrictions with...
HTML ZIP (1733 views) (1602 Downloads)
-
[DEPRECATED] Official Journals of the European Union (Polish)
This Dataset has been deprecated, and it is now replaced by the following datasets: Official Journals of the European Union 2021 Official Journals of the European Union 2020...
PDF HTML Formex 4 ZIP Excel XLS (1622 views) (1477 Downloads)
-
[DEPRECATED] Official Journals of the European Union (English)
This Dataset has been deprecated, and it is now replaced by the following datasets: Official Journals of the European Union 2021 Official Journals of the European Union 2020...
PDF HTML Formex 4 ZIP Excel XLS (4713 views) (7 Downloads)
-
Letter of rights for persons arrested on the basis of a European Arrest Warrant (Processed)
Letter of rights for persons arrested on the basis of a European Arrest Warrant (EAW), 1 page, (Processed) This dataset has been created within the framework of the European Language...
ZIP (666 views) (557 Downloads)
-
National Health Fund Dataset (Processed)
The dataset is a 274K-token Polish-English parallel resource in XLIFF format created on the basis of "Diagnosis-Related Groups in Europe" publication of the Polish National Health Fund....
ZIP (345 views) (231 Downloads)
-
Letter of rights for persons arrested and or detained
Police form, 12 pages. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation...
ZIP (413 views) (296 Downloads)
-
Polish Court Rulings Corpus (Processed)
The Polish Court Rulings Corpus contains 62 726 rulings of Polish courts, over 178 million words of running text. The texts of the rulings together with some metadata were acquired from...
ZIP (285 views) (174 Downloads)
-
Polish Ministry of Foreign Affairs reports in EN and PL (Processed)
The dataset comprises the EN and PL versions of two reports created by the Polish Ministry of Foreign Affairs, “Rules for communicating the POLSKA brand” and “Polish Presidency of the...
ZIP (407 views) (303 Downloads)
-
Monolingual Polish corpus in the public administration domain
Monolingual Polish corpus, containing 22372690 tokens and 1805280 lexical types in the public administration domain. This dataset has been created within the framework of the European...
ZIP (431 views) (317 Downloads)
-
Polish-English parallel corpus from the website of the National Digital Archives (Processed)
Polish-English parallel corpus from the website of the National Digital Archives (https://www.nac.gov.pl) This dataset has been created within the framework of the European Language...
ZIP (412 views) (313 Downloads)
-
Parallel corpus (en-pl) from the Export Promotion Portal of Poland (Processed)
A paralell corpus constructed from data acquired form the *.trade.gov.pl websites This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
ZIP (405 views) (289 Downloads)
-
Parallel texts from Swedish Work environment Authority (Processed)
Parallel texts from the Swedish Work Environment authority, all in pdf format. Original in Swedish, all the other texts are translations. One original with translations per folder....
ZIP (615 views) (487 Downloads)
-
Letter of rights for persons arrested and or detained (Processed)
Collection of transaltion units (1906 in total) in 21 language pairs extracted from 7 Police forms (one form 12 pages long in each of the following languages: BG, EL, EN, FR, LV, PL, RO)....
ZIP (452 views) (338 Downloads)
-
Parallel texts from Swedish Labour market agency. Part 2 (Processed)
Same as part 1, but with the Readme-file. (Processed) This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility...
ZIP (439 views) (332 Downloads)
-
Polish-English Internal Aviation Glossaries (Processed)
A set of bilingual glossaries developed by the Civil Aviation Authority of Republic of Poland, totalling 8548 Polish and English terms with commentaries and reference notes, including...
ZIP (269 views) (175 Downloads)
-
Polish-English parallel corpus from the website of the Ministry of Digitization (Processed)
Polish-English parallel corpus from the website of the Ministry of Digitization, Republic of Poland (http://mac.gov.pl) This dataset has been created within the framework of the European...
ZIP (330 views) (229 Downloads)
-
Polish-English parallel corpus from the website "Polish Aid" (Processed)
Polish-English parallel corpus from the website of the website "Polish Aid" (http://www.polskapomoc.gov.pl) This dataset has been created within the framework of the European Language...
ZIP (339 views) (233 Downloads)
-
Polish-English parallel corpus from the website of the U.S. EMBASSY and CONSULATE IN POLAND (Processed)
Polish-English parallel corpus from the website of the U.S. EMBASSY and CONSULATE IN POLAND (https://pl.usembassy.gov/) This dataset has been created within the framework of the European...
ZIP (385 views) (270 Downloads)