Statistical machine translation of Croatian weather forecast: How much data do we need?

This research is a first step towards a system for translating Croatian weather forecast into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus consists of a one-year sample of the weather forecasts for the Adriatic consisting of 7,893 sentence pairs....


Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:317008/Details
Parent publication: Proceedings of the ITI 2010 32nd International Conference on INFORMATION TECHNOLOGY INTERFACES
Zagreb : University Computing Centre, University of Zagreb, 2010
Main authors: Ljubešić, Nikola, computer scientist (-), Boras, Damir (Author), Bago, Petra
Material type: Article
Language: eng
Online access: http://bib.irb.hr/datoteka/507895.ljubesic10-statistical.pdf
LEADER 02198naa a2200253uu 4500
008 131111s2010 xx 1 eng|d
035 |a (CROSBI)507895 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |9 445  |a Ljubešić, Nikola,   |c computer scientist 
245 1 0 |a Statistical machine translation of Croatian weather forecast: How much data do we need? /  |c Ljubešić, Nikola ; Bago, Petra ; Boras, Damir. 
246 3 |i Title in English:  |a Statistical machine translation of Croatian weather forecast: How much data do we need? 
300 |a 91  |f p. 
520 |a This research is a first step towards a system for translating Croatian weather forecast into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus consists of a one-year sample of the weather forecasts for the Adriatic consisting of 7,893 sentence pairs. Evaluation is performed by the best-known automatic evaluation measures BLEU, NIST and METEOR, as well as by manually evaluating a sample of 200 translations. In this research we have shown that with a small-sized training set and the state-of-the-art Moses system, decoding can be done with 96% accuracy concerning adequacy and fluency. Additional improvement is to be expected by increasing the training set size. 
536 |a Projekt MZOS  |f 130-1301679-1380 
546 |a ENG 
690 |a 5.04 
693 |a statistical machine translation, weather forecast, automatic evaluation, human evaluation  |l hrv  |2 crosbi 
693 |a statistical machine translation, weather forecast, automatic evaluation, human evaluation  |l eng  |2 crosbi 
773 0 |a ITI 2010 32nd International Conference on Information Technology Interfaces (21.-24.06.2010. ; Cavtat / Dubrovnik, Croatia)  |t Proceedings of the ITI 2010 32nd International Conference on INFORMATION TECHNOLOGY INTERFACES  |d Zagreb : University Computing Centre, University of Zagreb, 2010  |n Luzar-Stiffler, V.  |x 1330-1012  |z 978-1-4244-5732-8  |g p. 91 
700 1 |9 418  |a Boras, Damir  |4 aut 
700 1 |9 474  |a Bago, Petra  |4 aut 
856 |u http://bib.irb.hr/datoteka/507895.ljubesic10-statistical.pdf 
942 |c RZB  |u 2  |v Peer review  |z Scientific - Lecture - Full paper  |t 1.08 
999 |c 317008  |d 317006