EPICG. The European Parliament interpreting corpus Ghent. A step further

Start - End
2016 - 2017 (ongoing)
Department of Translation, Interpreting and Communication



The general objective of the project is to enlarge the interpreting corpus being compiled at Ghent University and to prepare the data so that they can be made available and searchable through a web interface. The project will serve as a pilot to establish procedures that can be applied on a far larger scale within more substantial research grants, in particular through the Hercules Programme.

The project has five specific objectives: (1) the conversion of existing audio files and corpus data to the EXMARaLDA format, developed at the University of Hamburg; (2) the collection of new data, both audio files to be downloaded from the European Parliament's website and transcriptions to be made by trained transcribers; (3) alignment at three levels: source texts with target texts, source speech signals with target speech signals, and, finally, source and target texts with their respective speech signals; (4) the annotation of the corpus for time and part of speech, together with partial syntactic parsing; (5) the storage of the corpus data and the corpus management programme on a dedicated platform for the dissemination of corpus data.