Procedures in Building the Croatian-English Parallel Corpus

This contribution gives a survey of procedures and formats used in building the Croatian-English parallel corpus which is being collected at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly which has been publish...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:305459/Details
Parent publication: International journal of corpus linguistics
6 (2001), special issue; pp. 107-123
Main author: Tadić, Marko
Material type: Article
Language: English
LEADER 01728naa a2200241uu 4500
008 131105s2001 xx eng|d
022 |a 1384-6655 
035 |a (CROSBI)101936 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |a Tadić, Marko 
245 1 0 |a Procedures in Building the Croatian-English Parallel Corpus /  |c Tadić, Marko. 
246 3 |i Naslov na engleskom:  |a Procedures in Building the Croatian-English Parallel Corpus 
300 |a 107-123  |f str. 
363 |a 6  |b special issue  |i 2001 
520 |a This contribution gives a survey of procedures and formats used in building the Croatian-English parallel corpus which is being collected at the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly which has been published from the beginning of 1998 by HIKZ (Croatian Institute for Information and Culture). After a quick survey of existing English-Croatian parallel corpora, the article copes with procedures involved in text conversion and text encoding, particularly the alignment. There are several recent suggestions for alignment encoding, and they are listed and elaborated at the end of the article. 
536 |a Projekt MZOS  |f 130718 
546 |a ENG 
690 |a 6.03 
693 |a corpus linguistics, parallel corpora, Croatian language, English language, corpus encoding, alignment, CES, XML  |l hrv  |2 crosbi 
693 |a corpus linguistics, parallel corpora, Croatian language, English language, corpus encoding, alignment, CES, XML  |l eng  |2 crosbi 
773 0 |t International journal of corpus linguistics  |x 1384-6655  |g 6 (2001), special issue ; str. 107-123 
942 |c CLA  |t 1.01  |u 1  |z Znanstveni - clanak 
999 |c 305459  |d 305457