Parallel Global Voices (Greek - French)
Publisher
Description
Parallel Global Voices EL-FR is a parallel corpus generated from the Global Voices multilingual group of websites (http://globalvoices.org/), where volunteers publish and translate news stories in more than 40 languages. The original content of the Global Voices websites is made available by its authors and publishers under a Creative Commons Attribution license. The content was crawled in 2015-2016, and the web documents were exported to XML by researchers at the Institute for Language and Speech Processing (http://www.ilsp.gr/). Crawled documents that were translations of each other were paired on the basis of their link information. After document pairing, sentence alignments were generated with the hunalign sentence aligner. This dataset contains one TMX file with alignments from 1945 el-fr document pairs.
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) actions SMART 2014/1074 and SMART 2015/1091. For further information on the project: http://lr-coordination.eu.
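As an illustration of how the TMX file described above could be consumed, the following is a minimal sketch in Python. It assumes the file follows the usual TMX layout (`tu` units containing one `tuv` per language, each with a `seg` element) and that the language codes are `el` and `fr`. The function name `read_tmx_pairs` and the inline sample are hypothetical and are not part of the dataset.

```python
import io
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written sample in TMX layout (hypothetical, not taken from the corpus).
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="el"><seg>Καλημέρα</seg></tuv>
      <tuv xml:lang="fr"><seg>Bonjour</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_tmx_pairs(source, src_lang="el", tgt_lang="fr"):
    """Return a list of (source, target) segment pairs from a TMX file.

    `source` may be a file path or a binary file-like object.
    """
    pairs = []
    # iterparse streams the document, so large files need not fit in memory.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        segments = {}
        for tuv in elem.findall("tuv"):
            # Older TMX versions used a plain "lang" attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary subtag, so "el-GR" is treated as "el".
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
        elem.clear()  # release the processed translation unit
    return pairs


if __name__ == "__main__":
    for el_text, fr_text in read_tmx_pairs(io.BytesIO(SAMPLE_TMX.encode("utf-8"))):
        print(el_text, "|", fr_text)
```

To read the real dataset, pass the path of the downloaded TMX file instead of the in-memory sample. Alignments produced automatically may contain noise, so filtering the pairs before use may be worthwhile.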
EuroVoc domains
- Identifier: ELRC_262
- Landing page: http://data.europa.eu/euodp/en/data/dataset/elrc_262
- Release date: 2017-12-14
- Modified date: 2018-02-14
- Language: French, Greek
- Catalogue: European Union Open Data Portal