Domain Dependence of Statistical Named Entity Recognition and Classification in Croatian Texts

Influence of text domain selection on statistical named entity recognition and classification in Croatian texts is investigated. Two datasets of Croatian newspaper texts of differing text domains were manually annotated for named entities and used for training and testing the Stanford NER system for...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:318360/Details
Host publication: Proceedings of the 35th International Conference on Information Technology Interfaces (ITI 2013)
Zagreb : SRCE University Computer Centre, University of Zagreb, 2013
Main authors: Agić, Željko (-), Bekavac, Božo (Author)
Material type: Article
Language: eng
LEADER 02184naa a2200265uu 4500
008 131111s2013 xx 1 eng|d
035 |a (CROSBI)634488 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |9 495  |a Agić, Željko 
245 1 0 |a Domain Dependence of Statistical Named Entity Recognition and Classification in Croatian Texts /  |c Agić, Željko ; Bekavac, Božo. 
246 3 |i Title in English:  |a Domain Dependence of Statistical Named Entity Recognition and Classification in Croatian Texts 
300 |a 277-283  |f str. 
520 |a Influence of text domain selection on statistical named entity recognition and classification in Croatian texts is investigated. Two datasets of Croatian newspaper texts of differing text domains were manually annotated for named entities and used for training and testing the Stanford NER system for named entity recognition based on sequence labeling with CRF. State-of-the-art scores were observed in both domains. A strong preference for systems trained on mixed text domains is established by the experiment. The top-performing system was recorded with an overall F1-score of 0.876 on mixed-domain test sets, scoring 0.899 in one of the selected domains and 0.852 in the other. The single best domain F1-scores were recorded at 0.910 and 0.858. 
536 |a Projekt MZOS  |f 130-1300646-0645 
536 |a Projekt MZOS  |f 130-1300646-1776 
546 |a ENG 
690 |a 2.09 
690 |a 5.04 
690 |a 6.03 
693 |a text domain, domain dependence, named entity recognition, Croatian language  |l hrv  |2 crosbi 
693 |a text domain, domain dependence, named entity recognition, Croatian language  |l eng  |2 crosbi 
773 0 |a 35th International Conference on Information Technology Interfaces (ITI 2013) (24-27.06.2013. ; Cavtat, Hrvatska)  |t Proceedings of the 35th International Conference on Information Technology Interfaces (ITI 2013)  |d Zagreb : SRCE University Computer Centre, University of Zagreb, 2013  |n Lužar-Stiffler, Vesna ; Jarec, Iva  |x 1330-1012  |z 978-953-7138-30-1  |g str. 277-283 
700 1 |9 835  |a Bekavac, Božo  |4 aut 
942 |c RZB  |u 2  |v Recenzija  |z Znanstveni - Predavanje - CijeliRad  |t 1.08 
999 |c 318360  |d 318358