Building the Croatian-English Parallel Corpus

The contribution gives a survey of procedures and formats used in building the Croatian-English parallel corpus, which is being collected at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly, which has been published fr...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:310893/Details
Parent publication: Second International Conference on Language Resources and Evaluation LREC2000
Vol. I.
Main author: Tadić, Marko
Material type: Article
Language: eng
LEADER 01762naa a2200229uu 4500
008 131111s2000 xx eng|d
020 |a 0000-00000-0 
035 |a (CROSBI)101837 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |a Tadić, Marko 
245 1 0 |a Building the Croatian-English Parallel Corpus /  |c Tadić, Marko. 
246 3 |i Title in English:  |a Building the Croatian-English Parallel Corpus 
300 |a 523-530  |f pp. 
520 |a The contribution gives a survey of procedures and formats used in building the Croatian-English parallel corpus, which is being collected at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly, which has been published since the beginning of 1998 by HIKZ (Croatian Institute for Information and Culture). After a quick survey of existing English-Croatian parallel corpora, the article deals with the procedures involved in text conversion and text encoding, particularly alignment. Several recent suggestions for alignment encoding are discussed in detail. Preliminary statistics on the numbers of S and W elements in each language are given at the end of the article. 
536 |a MZOS project  |f 130718 
546 |a ENG 
690 |a 6.03 
693 |a Alignment, Corpus Linguistics, Croatian, English, Parallel Corpus, XCES, XML  |l hrv  |2 crosbi 
693 |a Alignment, Corpus Linguistics, Croatian, English, Parallel Corpus, XCES, XML  |l eng  |2 crosbi 
773 0 |t Second International Conference on Language Resources and Evaluation LREC2000  |d Paris-Athens : ELRA, 2000  |k Vol. I.  |n Gavrilidou, M., Carayannis, G., Markantonatou, S., Piperidis, S.  |z 0-000-00000-0  |g pp. 523-530 
942 |c POG  |t 1.16.1  |u 1  |z Scientific 
999 |c 310893  |d 310891