Corpus Aligner (CorAl) Evaluation on English-Croatian Parallel Corpora

An increasing demand for new language resources of recent EU members and acceding countries has in turn initiated the development of different language tools and resources, such as alignment tools and corresponding translation memories for new language pairs. The primary goal of this paper is to p...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:316709/Details
Parent publication: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC2010)
Valletta : European Language Resources Association, 2010
Main authors: Seljan, Sanja (-), Tadić, Marko (Author), Šnajder, Jan, Dalbelo Bašić, Bojana, Osmann, Vjekoslav, Agić, Željko
Material type: Article
Language: eng
Online access: http://bib.irb.hr/datoteka/463848.599_Paper.pdf
http://www.lrec-conf.org/proceedings/lrec2010/pdf/599_Paper.pdf
LEADER 02814naa a2200361uu 4500
008 131111s2010 xx 1 eng|d
035 |a (CROSBI)463848 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |9 430  |a Seljan, Sanja 
245 1 0 |a Corpus Aligner (CorAl) Evaluation on English-Croatian Parallel Corpora /  |c Seljan, Sanja ; Tadić, Marko ; Agić, Željko ; Šnajder, Jan ; Dalbelo Bašić, Bojana ; Osmann, Vjekoslav. 
246 3 |i Title in English:  |a Corpus Aligner (CorAl) Evaluation on English-Croatian Parallel Corpora 
300 |a 3481-3484  |f pp. 
520 |a An increasing demand for new language resources of recent EU members and acceding countries has in turn initiated the development of different language tools and resources, such as alignment tools and corresponding translation memories for new language pairs. The primary goal of this paper is to provide a description of a free sentence alignment tool CorAl (Corpus Aligner), developed at the Faculty of Electrical Engineering and Computing, University of Zagreb. The tool performs paragraph alignment as the first step of the alignment process, which is followed by sentence alignment. The description of the tool is followed by its evaluation. The paper describes an experiment applying the CorAl aligner to an English-Croatian parallel corpus from the legislative domain, using the metrics of precision, recall and F1-measure. Results are discussed, and the concluding sections discuss future directions of CorAl development. 
536 |a MZOS project  |f 036-1300646-1986 
536 |a MZOS project  |f 130-1300646-0645 
536 |a MZOS project  |f 130-1300646-0909 
536 |a MZOS project  |f 130-1300646-1776 
546 |a ENG 
690 |a 2.09 
690 |a 5.04 
690 |a 6.03 
693 |a Corpus Aligner, Coral, English-Croatian Parallel Corpora  |l hrv  |2 crosbi 
693 |a Corpus Aligner, Coral, English-Croatian Parallel Corpora  |l eng  |2 crosbi 
700 1 |a Tadić, Marko  |4 aut 
700 1 |a Šnajder, Jan  |4 aut 
700 1 |a Dalbelo Bašić, Bojana  |4 aut 
700 1 |a Osmann, Vjekoslav  |4 aut 
700 1 |9 495  |a Agić, Željko  |4 aut 
773 0 |a The Seventh International Conference on Language Resources and Evaluation (19.-21.05.2010. ; Valletta, Malta)  |t Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC2010)  |d Valletta : European Language Resources Association, 2010  |n Calzolari, Nicoletta ; Choukri, Khalid ; Maegaard, Bente ; Mariani, Joseph ; Odijk, Jan ; Piperidis, Stelios ; Rosner, Mike ; Tapias, Daniel  |z 2-9517408-6-7  |g pp. 3481-3484 
856 |u http://bib.irb.hr/datoteka/463848.599_Paper.pdf 
856 |u http://www.lrec-conf.org/proceedings/lrec2010/pdf/599_Paper.pdf 
942 |c RZB  |u 2  |v Peer review  |z Scientific - Poster - Full paper  |t 1.08 
999 |c 316709  |d 316707