Error analysis in Croatian morphosyntactic tagging
In this paper, we provide detailed insight on properties of errors generated by a stochastic morphosyntactic tagger assigning Multext-East morphosyntactic descriptions to Croatian texts. Tagging the Croatia Weekly newspaper corpus by the CroTag tagger in stochastic mode revealed that approximately 8...
Permalink: | http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:315892/Details |
---|---|
Parent publication: | Proceedings of the 31st International Conference on Information Technology Interfaces Zagreb : SRCE University Computer Centre, University of Zagreb, 2009 |
Main authors: | Agić, Željko (-), Tadić, Marko (Author), Dovedan Han, Zdravko |
Material type: | Article |
Language: | eng |
Online access: | http://bib.irb.hr/datoteka/389315.zamtzd_iti09.pdf |
LEADER | 02369naa a2200277uu 4500 | ||
---|---|---|---|
008 | 131111s2009 xx 1 eng|d | ||
035 | |a (CROSBI)389315 | ||
040 | |a HR-ZaFF |b hrv |c HR-ZaFF |e ppiak | ||
100 | 1 | |9 495 |a Agić, Željko | |
245 | 1 | 0 | |a Error analysis in Croatian morphosyntactic tagging / |c Agić, Željko ; Tadić, Marko ; Dovedan, Zdravko. |
246 | 3 | |i Title in English: |a Error Analysis in Croatian Morphosyntactic Tagging | |
300 | |a 521-526 |f pp. | ||
520 | |a In this paper, we provide detailed insight on properties of errors generated by a stochastic morphosyntactic tagger assigning Multext-East morphosyntactic descriptions to Croatian texts. Tagging the Croatia Weekly newspaper corpus by the CroTag tagger in stochastic mode revealed that approximately 85 percent of all tagging errors occur on nouns, adjectives, pronouns and verbs. Moreover, approximately 50 percent of these are shown to be incorrect assignments of case values. We provide various other distributional properties of errors in assigning morphosyntactic descriptions for these and other parts of speech. On the basis of these properties, we propose rule-based and stochastic strategies which could be integrated in the tagging module, creating a hybrid procedure in order to raise overall tagging accuracy for Croatian. | ||
536 | |a Projekt MZOS |f 130-1300646-0645 | ||
536 | |a Projekt MZOS |f 130-1300646-1776 | ||
546 | |a ENG | ||
690 | |a 5.04 | ||
690 | |a 6.03 | ||
693 | |a morphosyntactic tagging, part-of-speech tagging, error analysis, error distribution, Croatian language, hybrid tagging |l hrv |2 crosbi | ||
693 | |a morphosyntactic tagging, part-of-speech tagging, error analysis, error distribution, Croatian language, hybrid tagging |l eng |2 crosbi | ||
773 | 0 | |a 31st International Conference on Information Technology Interfaces (ITI 2009) (22-25.06.2009. ; Cavtat, Croatia) |t Proceedings of the 31st International Conference on Information Technology Interfaces |d Zagreb : SRCE University Computer Centre, University of Zagreb, 2009 |n Lužar-Stiffler, Vesna ; Jarec, Iva ; Bekić, Zoran |z 978-953-7138-16-5 |g pp. 521-526 | |
700 | 1 | |9 888 |a Tadić, Marko |4 aut | |
700 | 1 | |9 415 |a Dovedan Han, Zdravko |4 aut | |
856 | |u http://bib.irb.hr/datoteka/389315.zamtzd_iti09.pdf | ||
942 | |c RZB |u 2 |v Peer review |z Scientific - Lecture - Full paper |t 1.08 | ||
999 | |c 315892 |d 315890 |