Statistical machine translation of Croatian weather forecasts: how much data do we need?
This research is the first step towards developing a system for translating Croatian weather forecasts into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus consists of a one-year sample of the weather forecasts for the Adriatic, consisting of 7,893...
Permalink: | http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:308956/Details |
---|---|
Parent publication: | CIT. Journal of computing and information technology 18 (2010), 4 ; pp. 303-308 |
Main authors: | Ljubešić, Nikola, computer scientist (-), Boras, Damir (Author), Bago, Petra |
Material type: | Article |
Language: | eng |
Online access: | http://bib.irb.hr/datoteka/507907.ljubesic10a-statistical.pdf |
LEADER | 02084naa a2200289uu 4500 | ||
---|---|---|---|
008 | 131105s2010 xx eng|d | ||
022 | |a 1330-1136 | ||
024 | |2 doi |a 10.2498/cit.1001914 | ||
035 | |a (CROSBI)507907 | ||
040 | |a HR-ZaFF |b hrv |c HR-ZaFF |e ppiak | ||
100 | 1 | |9 445 |a Ljubešić, Nikola, |c informatičar | |
245 | 1 | 0 | |a Statistical machine translation of croatian weather forecasts: how much data do we need? / |c Ljubešić, Nikola ; Bago, Petra ; Boras, Damir. |
246 | 3 | |i Title in English: |a Statistical Machine Translation of Croatian Weather Forecasts: How Much Data Do We Need? | |
300 | |a 303-308 |f str. | ||
363 | |a 18 |b 4 |i 2010 | ||
520 | |a This research is the first step towards developing a system for translating Croatian weather forecasts into multiple languages. This step deals with the Croatian-English language pair. The parallel corpus consists of a one-year sample of the weather forecasts for the Adriatic, consisting of 7,893 sentence pairs. Evaluation is performed by the automatic evaluation measures BLEU, NIST and METEOR, as well as by manually evaluating a sample of 200 translations. We have shown that with a small-sized training set and the state-of-the-art Moses system, decoding can be done with 96% accuracy concerning adequacy and fluency. Additional improvement is expected by increasing the training set size. Finally, the correlation of the recorded evaluation measures is explored. | ||
536 | |a Projekt MZOS |f 130-1301679-1380 | ||
546 | |a ENG | ||
690 | |a 5.04 | ||
693 | |a statistical machine translation, weather forecast, automatic evaluation, human evaluation |l hrv |2 crosbi | ||
693 | |a statistical machine translation, weather forecast, automatic evaluation, human evaluation |l eng |2 crosbi | ||
773 | 0 | |t CIT. Journal of computing and information technology |x 1330-1136 |g 18 (2010), 4 ; str. 303-308 | |
700 | 1 | |9 418 |a Boras, Damir |4 aut | |
700 | 1 | |9 474 |a Bago, Petra |4 aut | |
856 | |u http://bib.irb.hr/datoteka/507907.ljubesic10a-statistical.pdf | ||
942 | |c CLA |t 1.01 |u 2 |z Znanstveni - clanak | ||
999 | |c 308956 |d 308954 |