Work packages
WP1
Management
This WP covers both the administrative side of the PRESEMT project (coordinating activities, monitoring work progress, reporting to the community, managing financial matters, etc.) and the technical side (monitoring technical issues and work quality, and taking the technical decisions required).
WP2
System specifications
WP2 defines the guidelines on which the development of PRESEMT will be based, i.e. it sets out the specifications of the system prototype and determines the modules that this prototype will comprise. Furthermore, the consortium will identify the data and test suites required for validating and evaluating the PRESEMT prototype.
WP3
Corpus extraction & processing algorithms
WP3 involves the development of three modules of the PRESEMT prototype: (a) the Corpus creation & annotation module, released in 3 different versions, which will be responsible for collecting resources from the web and annotating them appropriately, (b) the Phrase aligner module, released in 2 different versions, which, by consulting a small parallel corpus, will automatically define phrasing models for a given language pair, and (c) the Corpus modelling module, released in 2 different versions, which will identify semantic relations between words.
WP4
Structure selection
WP4 involves the development of a module, released in 2 different versions, which will handle the first phase of the translation process. Furthermore, WP4 involves the optimisation of this module's parameters.
WP5
Translation equivalent selection
WP5 involves the development of a module, released in 2 different versions, which will handle the second phase of the translation process. Furthermore, WP5 involves the optimisation of this module's parameters.
WP6
Post-processing & User adaptation
WP6 involves the development of two modules, namely (a) the Post-processing module, via which the end user will be able to correct the system output, and (b) the User adaptation module, which aims to make the system ‘learn’ from the user’s modifications.
WP7
Integration
Within WP7 the various modules developed in the previous WPs will be integrated into one prototype, issued in 3 successive versions, while the performance of the prototype will be enhanced through parallelisation. Furthermore, every version of the system prototype will be accompanied by the respective documents, comprising system documentation and user manuals.
WP8
Dissemination
WP8 involves the development of a dissemination and exploitation strategy to be followed during the project lifecycle, together with the activities that implement this strategy.
WP9
Validation & Evaluation
WP9 covers all the experimental activities to be performed with the purpose of (a) validating the system prototypes in terms of technical requirements and (b) evaluating their performance in terms of translation quality. The validation and evaluation experiments, both consortium-internal and consortium-external, are planned to take place twice during the project lifecycle, following the release of each of the two versions of the system prototype.
The language pairs to be studied and used for evaluation purposes are the following:
- {Czech, Greek, German, Norwegian} --> English
- {Czech, Greek, English, Norwegian} --> German
Besides these activities, the consortium also plans to assess the system's extensibility and portability to new languages by applying the 2nd system prototype to language pairs different from those used for system development (cf. the list below). The outcome of this task will also contribute to the release of the PRESEMT final prototype.
- {Czech, Greek, German, English, Norwegian} --> Italian
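The set notation used in the lists above is compact, so it may help to see it expanded into individual source-target pairs. The following is a minimal Python sketch under the assumption that each line denotes one pair per source language; the names DEVELOPMENT_GROUPS, EXTENSION_GROUPS and expand_pairs are illustrative only and are not part of the PRESEMT system.

```python
# Language groups copied from the lists above.
# All identifiers here are hypothetical, chosen for illustration.
DEVELOPMENT_GROUPS = [
    ({"Czech", "Greek", "German", "Norwegian"}, "English"),
    ({"Czech", "Greek", "English", "Norwegian"}, "German"),
]
EXTENSION_GROUPS = [
    ({"Czech", "Greek", "German", "English", "Norwegian"}, "Italian"),
]


def expand_pairs(groups):
    """Expand each ({sources}, target) group into (source, target) pairs."""
    pairs = []
    for sources, target in groups:
        # Sort the sources so that the output order is deterministic.
        pairs.extend((source, target) for source in sorted(sources))
    return pairs


development_pairs = expand_pairs(DEVELOPMENT_GROUPS)
extension_pairs = expand_pairs(EXTENSION_GROUPS)

for source, target in development_pairs + extension_pairs:
    print(f"{source} --> {target}")
```

Read this way, the two development lines yield eight language pairs and the extension line yields five further pairs into Italian.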
Workplan
PRESEMT will have a 36-month duration. The proposed work plan for reaching the project objectives is divided into nine (9) work packages relating to five aspects, namely project management (WP1), dissemination activities (WP8), system specifications (WP2), system development & integration (WP3-WP7) and validation & evaluation (WP9).
Within PRESEMT an iterative development approach will be followed, concerning both the individual system modules and the system as a whole. This approach entails the creation of intermediate system prototypes, which will incorporate the results of repeated validation and evaluation activities. To a great extent, this will make it possible to address effectively any critical issues that may emerge during development and to adopt well-planned solutions.
Within the proposed timeframe, two broad development phases have been planned, each of them resulting in a system prototype (PRESEMT Prototype (ver.1) & PRESEMT Prototype (ver.2)). Both prototypes will be developed in accordance with the design principles and specifications defined in WP2.
The first system prototype, due in month 19, will include the first versions of the modules developed in WP3-WP6, and will subsequently be validated & evaluated in terms of performance and translation quality. The testing results will be fed back into the module development process to support the improvement of the system as it proceeds towards the second prototype.
The second system prototype, due in month 26, will include the final versions of the aforementioned modules, and the parallelisation of processing will also have been completed by then. The second validation & evaluation iteration will then take place to check the effectiveness of the improvements made.
The second testing iteration will be complemented by an assessment / experimentation phase, in which the handling of other language pairs by the system will be investigated, leading to the final system prototype (PRESEMT Final Prototype) at the end of the project lifetime.