-
Romanian – English literature corpus
Bilingual Romanian-English literary corpus built from a small set of freely available books (drama, sci-fi, etc.). The texts are positionally aligned, i.e. the sentence on...
ZIP (411 views) (321 Downloads)
-
Bilingual hr-en parallel corpus from Croatian Mine Action website (Processed)
Contents of the http://www.hcr.hr website were downloaded, aligned at document and segment level, and converted into a parallel corpus. This dataset has been created within the framework of the...
ZIP (397 views) (297 Downloads)
-
Parallel corpus from Social Insurance Agency - Socialstyrelsen (Sweden)
Large term bank of medical terms in Swedish, with an explanation of each term in Swedish. This dataset has been created within the framework of the European Language Resource Coordination...
ZIP (389 views) (294 Downloads)
-
Bilingual hr-en parallel corpus from the National and University Library in Zagreb website (Processed)
Contents of http://www.nsk.hr were crawled, aligned at document and sentence level, and converted into a parallel corpus. This dataset has been created within the framework of the European...
ZIP (384 views) (290 Downloads)
-
Polish-English parallel corpus from the website of the Citizens Information Board (Processed)
Polish-English parallel corpus from the website of the Citizens Information Board, Ireland (http://www.citizensinformation.ie). This dataset has been created within the framework of the...
ZIP (373 views) (269 Downloads)
-
English-Lithuanian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)
The EASTIN-CL Multilingual Ontology of Assistive Technology was created within the EASTIN-CL project, aimed at applying language technologies to a portal of assistive technologies...
ZIP (368 views) (256 Downloads)
-
Romanian-English corpus with studies, reports and statistical data in the field of culture from the National Institute for Cultural Research and Training website (Processed)
Romanian-English corpus with studies, reports and statistical data in the field of culture from the National Institute for Cultural Research and Training website. This dataset has been...
ZIP (362 views) (254 Downloads)
-
Bilingual English-Danish parallel corpus from The Agency for Culture and Palaces website
Contents of https://slks.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of the European...
ZIP (359 views) (255 Downloads)
-
English-Slovak parallel corpus of texts from The Ministry of Culture of the Slovak Republic
Dataset of various English-Slovak legal texts within the agenda of the Ministry, in plain text format, aligned at the sentence level; size: 105791 words. This dataset has been created within...
ZIP (357 views) (249 Downloads)
-
Parallel texts from Swedish Social Security Authority (Processed)
Parallel texts, email templates and forms in PDF format. The original is in Swedish; all the other texts are translations. One original with translations per folder. Language info is...
ZIP (349 views) (238 Downloads)
-
Polish-English parallel corpus from the website of the National Audiovisual Institute (Processed)
Polish-English parallel corpus from the website of the National Audiovisual Institute (http://www.nina.gov.pl). This dataset has been created within the framework of the European Language...
ZIP (348 views) (237 Downloads)
-
National Health Fund Dataset (Processed)
The dataset is a 274K-token Polish-English parallel resource in XLIFF format, created on the basis of the "Diagnosis-Related Groups in Europe" publication of the Polish National Health Fund....
ZIP (345 views) (231 Downloads)
-
Bilingual English-Danish parallel corpus from The Danish Medicines Agency website
Contents of https://laegemiddelstyrelsen.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework...
ZIP (340 views) (233 Downloads)
-
Polish-English parallel corpus from the website of the National Science Centre (Processed)
Polish-English parallel corpus from the website of the National Science Centre (http://ncn.gov.pl). This dataset has been created within the framework of the European Language Resource...
ZIP (337 views) (224 Downloads)
-
Parallel texts from Swedish Social Security Authority
Parallel texts, email templates and forms in PDF format. The original is in Swedish; all the other texts are translations. One original with translations per folder. Language info is...
ZIP (336 views) (233 Downloads)
-
Polish-English parallel corpus from the website of the Ministry of the Interior and Administration (Processed)
Polish-English parallel corpus from the website of the Ministry of the Interior and Administration, Republic of Poland (https://www.mswia.gov.pl/). This dataset has been created within...
ZIP (334 views) (241 Downloads)
-
English-Latvian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)
The EASTIN-CL Multilingual Ontology of Assistive Technology was created within the EASTIN-CL project, aimed at applying language technologies to a portal of assistive technologies...
ZIP (328 views) (226 Downloads)
-
Compendium The Social Insurance Institution (Processed)
A compendium on the Polish Social Insurance Institution (ZUS), covering the following issues: short presentation of ZUS, its history, tasks, organizational structure, employees, Social...
ZIP (323 views) (203 Downloads)
-
English-Swedish parallel corpus from the web site of the Swedish Migration Board - Migrationsverket (Processed)
All texts have been collected from the website of the Swedish Migration Board. The original text is always in Swedish; the other texts are translations from Swedish. This dataset has...
ZIP (312 views) (221 Downloads)
-
English-Slovak corpus of annual reports on immigration and asylum policies from the EMN National Contact Point for the Slovak Republic website (Processed)
English-Slovak corpus of annual reports on immigration and asylum policies from the EMN National Contact Point for the Slovak Republic website (https://emn.sk/en/). This dataset has been...
ZIP (310 views) (209 Downloads)