Resources for Language Technologies
-
Bilingual collection of documents about the Cyprus Problem (Processed)
A parallel corpus (Greek-English) regarding the Cyprus Problem. This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe...
ZIP (391 views) (275 downloads)
-
Polish Ministry of Foreign Affairs reports in EN and PL (Processed)
The dataset comprises the EN and PL versions of two reports created by the Polish Ministry of Foreign Affairs, “Rules for communicating the POLSKA brand” and “Polish Presidency of the...
ZIP (407 views) (303 downloads)
-
Parallel Corpus from the Web Site of the MFA of Latvia (Processed)
The corpus has been built from the news and press releases published on the website of the Ministry of Foreign Affairs of the Republic of Latvia. (Processed) This dataset has been...
ZIP (401 views) (290 downloads)
-
Croatian-English parallel corpus from the website of the Embassy of Finland, Zagreb (Processed)
Croatian-English parallel corpus from the website of the Embassy of Finland, Zagreb (http://www.finland.hr). This dataset has been created within the framework of the European Language...
ZIP (363 views) (272 downloads)
-
Hellenic Ministry of Foreign Affairs Greek-English announcements corpus (Processed)
The Hellenic Ministry of Foreign Affairs Greek-English announcements corpus contains announcements from the Hellenic Ministry of Foreign Affairs. This dataset has been created within the...
ZIP (578 views) (485 downloads)
-
Memorandum for an ESM programme (Processed)
Memorandum of Understanding for a three-year European Stability Mechanism programme. This dataset has been created within the framework of the European Language Resource Coordination...
ZIP (357 views) (254 downloads)
-
English-Swedish parallel corpus from the Annual Overview of Sweden’s Official aid Agency SIDA Activities (Processed)
Source PDF files as parallel documents. The original texts are all in Swedish; the English texts are their translations. This dataset has been created within the framework of the European...
ZIP (360 views) (263 downloads)
-
Polish-English parallel corpus from the website "Polish Aid" (Processed)
Polish-English parallel corpus from the website "Polish Aid" (http://www.polskapomoc.gov.pl). This dataset has been created within the framework of the European Language...
ZIP (339 views) (233 downloads)
-
Polish-English parallel corpus from the website of the U.S. EMBASSY and CONSULATE IN POLAND (Processed)
Polish-English parallel corpus from the website of the U.S. EMBASSY and CONSULATE IN POLAND (https://pl.usembassy.gov/). This dataset has been created within the framework of the European...
ZIP (385 views) (270 downloads)
-
Parallel corpus from Estonian Ministry of Foreign Affairs
Parallel corpus from the content of the Estonian Ministry of Foreign Affairs website. This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
ZIP (346 views) (236 downloads)
-
Polish Ministry of Foreign Affairs Regional Dataset (Processed)
A collection of Polish-English whitepapers published by the Polish Ministry of Foreign Affairs, including "Eastern Partnership" (10K words in 492 segments) and "Poland's 10 years in the...
ZIP (529 views) (417 downloads)
-
Polish-English parallel corpus from the website of the Ministry of National Defence (Processed)
Polish-English parallel corpus from the website of the Ministry of National Defence, Republic of Poland (http://www.mon.gov.pl). This dataset has been created within the framework of the...
ZIP (342 views) (227 downloads)
-
Bilingual English-Finnish parallel corpus from the official Nordic cooperation website
Contents of the Nordic Co-operation website (http://www.norden.org), downloaded and converted into a parallel corpus. This dataset has been created within the framework of the European...
ZIP (530 views) (412 downloads)
-
Bilingual documents Bulgarian-English in the field of transport (Processed)
Bilingual Bulgarian-English collection of documents; 549 KB (Processed). This dataset has been created within the framework of the European Language Resource Coordination (ELRC)...
ZIP (547 views) (428 downloads)
-
Bilingual hr-en parallel corpus from Croatian Mine Action website (Processed)
Contents of the http://www.hcr.hr website, downloaded, aligned at document and segment level, and converted into a parallel corpus. This dataset has been created within the framework of the...
ZIP (397 views) (297 downloads)
-
Croatian-English parallel corpus from the website of the Ministry of Foreign and European Affairs, Republic of Croatia (Processed)
Croatian-English parallel corpus from the website of the Ministry of Foreign and European Affairs, Republic of Croatia (http://www.mvep.hr). This dataset has been created within the...
ZIP (420 views) (310 downloads)
-
Croatian-English parallel corpus from the website of the Government Office for Cooperation with NGOs (Processed)
Croatian-English parallel corpus from the website of the Government Office for Cooperation with NGOs (https://udruge.gov.hr/). This dataset has been created within the framework of the...
ZIP (739 views) (634 downloads)
-
Polish-English parallel corpus from the website of the Ministry of Regional Development (Processed)
Polish-English parallel corpus from the website of the Ministry of Regional Development (https://www.eog.gov.pl). This dataset has been created within the framework of the European...
ZIP (408 views) (311 downloads)
-
Greek-English parallel corpus from the website of the Prime Minister of the Hellenic Republic (Processed)
Greek-English parallel corpus from the website of the Prime Minister of the Hellenic Republic (https://primeminister.gr/). This dataset has been created within the framework of the European...
ZIP (479 views) (370 downloads)
-
Polish-English parallel corpus from the website of the Ministry of Foreign Affairs (Processed)
Polish-English parallel corpus from the website of the Ministry of Foreign Affairs, Republic of Poland (https://mfa.gov.pl/en/). This dataset has been created within the framework of the...
ZIP (389 views) (284 downloads)