Pseudo-lemmatization in Croatian-English SMT

One of the first difficulties in conducting a thorough analysis of statistical machine translation involving Croatian as a morphologically rich and resource-poor language is the lack of quality language resources. This paper presents results of two standard fourteen-feature Croatian-English phrase-bas...


Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:335502/Details
Parent publication: Proceedings of the Central European Conference on Information and Intelligent Systems
Varaždin : Faculty of Organization and Informatics, University of Zagreb, 2014.
Main authors: Brkic, Marija; Matetic, Maja (Author); Seljan, Sanja
Material type: Article
Language: eng
LEADER 02068naa a22002777i 4500
005 20150123155718.0
008 150109s2014 ci 1 eng|d
035 |a (CROSBI)719183 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |a Brkic, Marija 
245 1 0 |a Pseudo-lemmatization in Croatian-English SMT /  |c Brkic, Marija ; Matetic, Maja ; Seljan, Sanja. 
246 3 |i Title in English:  |a Pseudo-lemmatization in Croatian-English SMT 
300 |a 242-249  |f pp. 
520 |a One of the first difficulties in conducting a thorough analysis of statistical machine translation involving Croatian as a morphologically rich and resource-poor language is the lack of quality language resources. This paper presents results of two standard fourteen-feature Croatian-English phrase-based statistical machine translation systems. Prior to building the second system, a partial pseudo-lemmatization of the Croatian parts of the training and test sets is performed in an attempt to simplify the translation process. Besides automatic evaluation, a manual evaluation is conducted in order to gain insight into the nature of the translation differences between the two systems. 
536 |a MZOS project  |f 13.13.1.3.03 
536 |a MZOS project  |f 130-1300646-0909 
546 |a ENG 
690 |a 2.09 
690 |a 5.04 
693 |a phrase-based statistical machine translation, pseudolemmatization, Croatian-English  |l hrv  |2 crosbi 
693 |a phrase-based statistical machine translation, pseudolemmatization, Croatian-English  |l eng  |2 crosbi 
700 1 |a Matetic, Maja  |4 aut 
700 1 |a Seljan, Sanja  |4 aut  |9 430 
773 0 |a Central European Conference on Information and Intelligent Systems (17.-19.09.2014. ; Varaždin, Croatia)  |t Proceedings of the Central European Conference on Information and Intelligent Systems  |d Varaždin : Faculty of Organization and Informatics, University of Zagreb, 2014.  |n Hunjak, T. ; Lovrenčić, S. ; Tomičić, I.  |x 1847-2001  |g pp. 242-249 
942 |c RZB  |u 2  |v Peer review  |z Scientific - Lecture - Full paper  |t 1.08 
999 |c 335502  |d 335499