Resources for Language Technologies
-
DA-EN Danish Ministry of Higher Education and Science 2
Parallel texts Danish-English from the Danish Ministry of Higher Education and Science, size 115,000 words, topic: research policy. This dataset has been created within the framework of...
ZIP (333 views) (209 Downloads)
-
Orossimo Terminological Resource - Computer Science
A bilingual terminological glossary extracted from academic discourse texts belonging to the Computer Science domain. This dataset has been created within the framework of the European...
XML PDF ZIP (539 views) (427 Downloads)
-
English-Bulgarian Computer Terms
The resource is a bilingual terminological database representing 729 terms in English and their translations to Bulgarian. The terms belong to the Computer domain. The terminological...
XML PDF ZIP (581 views) (478 Downloads)
-
Bilingual resource with Bulgarian strategic documents in the field of telecommunications and broadband (Bulgarian - English)
Bilingual collection of documents in the field of telecommunications and broadband, size on disk 440 kB, Bulgarian-English. This dataset has been created within the framework of the...
ZIP (369 views) (258 Downloads)
-
DA-EN Danish Ministry of Higher Education and Science 3 (Processed)
Parallel texts Danish-English from the Danish Ministry of Higher Education and Science, size 110,000 words, topic: research policy (Processed). This dataset has been created within the...
ZIP (330 views) (216 Downloads)
-
Bilingual hr-en parallel corpus from the Journal of the Croatian Association of Civil Engineers website (Processed)
Contents of http://casopis-gradjevinar.hr were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of...
ZIP (451 views) (357 Downloads)
-
Polish-English parallel corpus from the website of the National Centre for Research and Development (Processed)
Polish-English parallel corpus from the website of the National Centre for Research and Development (https://www.ncbr.gov.pl). This dataset has been created within the framework of the...
ZIP (532 views) (420 Downloads)
-
Polish-English parallel corpus from the website of the National Centre for Nuclear Research (Processed)
Polish-English parallel corpus from the website of the National Centre for Nuclear Research (https://www.ncbj.gov.pl/). This dataset has been created within the framework of the European...
ZIP (462 views) (338 Downloads)
-
Polish-English parallel corpus from the website "geoportal.gov.pl" (Processed)
Polish-English parallel corpus from the website "geoportal.gov.pl" (https://www.geoportal.gov.pl). This dataset has been created within the framework of the European Language Resource...
ZIP (195 views) (146 Downloads)
-
Polish-English parallel corpus from the website of the Ministry of Science and Higher Education (Processed)
Polish-English parallel corpus from the website of the Ministry of Science and Higher Education (http://www.eng.nauka.gov.pl/en/). This dataset has been created within the framework of...
ZIP (358 views) (255 Downloads)
-
Bilingual English-Danish parallel corpus from Danish Ministry of Higher Education and Science website
Contents of https://ufm.dk were crawled, aligned on document and sentence level and converted into a parallel corpus. This dataset has been created within the framework of the European...
ZIP (423 views) (313 Downloads)
-
Polish-English parallel corpus from the website "Science in Poland" (Processed)
Polish-English parallel corpus from the website "Science in Poland" (https://scienceinpoland.pap.pl/en and https://naukawpolsce.pap.pl/). This dataset has been created within the...
ZIP (201 views) (157 Downloads)
-
Czech Association of Medical Physicists - Physics Glossary (Processed)
A dictionary of 3281 terms relating to physics for medicine in Czech - English. This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
ZIP (377 views) (284 Downloads)
-
DA-EN Danish Ministry of Higher Education and Science 2 (Processed)
Parallel texts Danish-English from the Danish Ministry of Higher Education and Science, size 115,000 words, topic: research policy (Processed). This dataset has been created within the...
ZIP (302 views) (203 Downloads)
-
Bilingual resource with Bulgarian strategic documents in the field of telecommunications and broadband (Bulgarian - English) (Processed)
Bilingual collection of documents in the field of telecommunications and broadband, size on disk 440 kB, Bulgarian-English (Processed). This dataset has been created within the framework...
ZIP (387 views) (283 Downloads)
-
OROSSIMO Corpus - Computer Science
A corpus of academic discourse texts belonging to the Computer Science domain (according to the Dewey Decimal classification, DDC00 - Computer science, knowledge & systems), annotated...
ZIP (368 views) (256 Downloads)
-
DA-EN Danish Ministry of Higher Education and Science 3
Parallel texts Danish-English from the Danish Ministry of Higher Education and Science, size 110,000 words, topic: research policy. This dataset has been created within the framework of...
ZIP (366 views) (264 Downloads)
-
Czech Association of Medical Physicists - Physics Glossary
A dictionary of 3746 terms relating to physics for medicine in Czech - English. This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
XML PDF ZIP (481 views) (393 Downloads)
-
ANR translation memory containing major publications, as well as several administrative documents and news
Documents / language resources from ANR – Translation memory (.xliff) fr>en(uk) containing 9611 translation units (17 Mb). Major publications: • Rapport d’activité 2014 (110...
XML PDF ZIP (717 views) (612 Downloads)
-
DA-EN Danish Ministry of Higher Education and Science 4 (Processed)
Parallel texts Danish-English from the Danish Ministry of Higher Education and Science, size 115,000 words, topic: research policy (Processed). This dataset has been created within the...
ZIP (318 views) (216 Downloads)