Directorate-General for Communications Networks, Content and Technology
-
Polish Court Rulings Corpus (Processed)
The Polish Court Rulings Corpus contains 62 726 rulings of Polish courts, totalling over 178 million words of running text. The texts of the rulings together with some metadata were acquired from...
ZIP (285 views) (174 downloads)
-
Polish Ministry of Foreign Affairs reports in EN and PL (Processed)
The dataset comprises the EN and PL versions of two reports created by the Polish Ministry of Foreign Affairs, “Rules for communicating the POLSKA brand” and “Polish Presidency of the...
ZIP (407 views) (303 downloads)
-
Monolingual Polish corpus in the public administration domain
Monolingual Polish corpus, containing 22,372,690 tokens and 1,805,280 lexical types in the public administration domain. This dataset has been created within the framework of the European...
ZIP (431 views) (317 downloads)
-
Polish-English Internal Aviation Glossaries (Processed)
A set of bilingual glossaries developed by the Civil Aviation Authority of the Republic of Poland, totalling 8548 Polish and English terms with commentaries and reference notes, including...
ZIP (269 views) (175 downloads)
-
Translations of Hungarian from public websites
A web crawl of 14 different websites covering parallel corpora of Hungarian with Polish, Czech, Swedish, Finnish, French, German, Italian, English and Slovenian. This dataset has been...
ZIP (388 views) (287 downloads)
-
International Statistical Classification of Diseases and Related Health Problems - ICD-10 (EN-PL) (Processed)
International Classification of Diseases is a widely used classification in healthcare and healthcare management, i.a. for and coding of diseases for financial settlement between...
ZIP (208 views) (142 downloads)
-
Monolingual Polish corpus in the culture domain (part1) (Processed)
Monolingual Polish corpus from the Warsaw - Official Tourist Website. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting...
ZIP (291 views) (182 downloads)
-
Monolingual Polish corpus in the law domain (Processed)
Monolingual Polish (pol) corpus, including content of websites that are relevant to law and justice. This dataset has been created within the framework of the European Language Resource...
ZIP (298 views) (192 downloads)
-
Khresmoi (Processed)
Parallel data sets for development and testing of machine translation of sentences from summaries of medical articles between Czech, English, French, German, Hungarian, Polish, Spanish...
ZIP (266 views) (184 downloads)
-
EUIPO - Trade mark Guidelines (October 2017) (English-Polish) (Processed)
The EUIPO Guidelines are the main point of reference for users of the European Union trade mark system and professional advisers who want to make sure they have the latest information on...
ZIP (251 views) (137 downloads)
-
Polish-English parallel corpus from the website of the Ministry of the Interior and Administration (Processed)
Polish-English parallel corpus from the website of the Ministry of the Interior and Administration, Republic of Poland (https://www.mswia.gov.pl/). This dataset has been created within...
ZIP (334 views) (241 downloads)
-
Monolingual Polish corpus in the public administration domain (Processed)
Monolingual Polish corpus, containing 22,372,690 tokens and 1,805,280 lexical types in the public administration domain. This dataset has been created within the framework of the European...
ZIP (212 views) (115 downloads)
-
Monolingual corpus from Minutes of the Polish Senat (Posiedzenia) (2015-2018) (Processed)
The Monolingual Corpus from Minutes of the Polish Senat (Posiedzenia) (2015-2018) is part of "The Polish Parliamentary Corpus", which is available at http://clip.ipipan.waw.pl/PPC ....
ZIP (259 views) (164 downloads)
-
Avibase (processed)
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action SMART...
ZIP (258 views) (152 downloads)