Implementation of Croatian NERC system
In this paper a system for Named Entity Recognition and Classification in Croatian language is described. The system is composed of the module for sentence segmentation, inflectional lexicon of common words, inflectional lexicon of names and regular local grammars for automatic recognition of num...
Permalink: | http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:315184/Details |
---|---|
Parent publication: | Proceedings of the Workshop on Balto-Slavonic Natural Language Processing 2007, Special Theme: Information Extraction and Enabling Technologies Prague : Association for Computational Linguistics (ACL), 2007 |
Main authors: | Bekavac, Božo (-), Tadić, Marko (Author) |
Material type: | Article |
Language: | eng |
Online access: | http://bib.irb.hr/datoteka/300859.BBMT4ACL2007_v4.pdf |
LEADER | 02686naa a2200289uu 4500 | ||
---|---|---|---|
005 | 20131205151522.0 | ||
008 | 131111s2007 xx 1 eng|d | ||
035 | |a (CROSBI)300859 | ||
040 | |a HR-ZaFF |b hrv |c HR-ZaFF |e ppiak | ||
100 | 1 | |9 835 |a Bekavac, Božo | |
245 | 1 | 0 | |a Implementation of Croatian NERC system / |c Bekavac, Božo ; Tadić, Marko. |
246 | 3 | |i Title in English: |a Implementation of Croatian NERC system | |
300 | |a 11-18 |f pp. | ||
520 | |a In this paper a system for Named Entity Recognition and Classification in Croatian language is described. The system is composed of the module for sentence segmentation, inflectional lexicon of common words, inflectional lexicon of names and regular local grammars for automatic recognition of numerical and temporal expressions. After the first step (sentence segmentation), the system attaches to each token its full morphosyntactic description and appropriate lemma and additional tags for potential categories for names without disambiguation. The third step (the core of the system) is the application of a set of rules for recognition and classification of named entities in already annotated texts. Rules based on described strategies (like internal and external evidence) are applied in cascade of transducers in defined order. Although there are other classification systems for NEs, the results of our system are annotated NEs which are following MUC-7 specification. System is applied on informative and noninformative texts and results are compared. F-measure of the system applied on informative texts yields over 90%. | ||
536 | |a MZOS project |f 036-1300646-1986 | ||
536 | |a MZOS project |f 130-1300646-0645 | ||
536 | |a MZOS project |f 130-1300646-1002 | ||
546 | |a ENG | ||
690 | |a 5.04 | ||
690 | |a 6.03 | ||
693 | |a named entity recognition and classification, Croatian, computational linguistics, information extraction |l hrv |2 crosbi | ||
693 | |a named entity recognition and classification, Croatian, computational linguistics, information extraction |l eng |2 crosbi | ||
773 | 0 | |a 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007) (23-30.06.2007 ; Prague) |t Proceedings of the Workshop on Balto-Slavonic Natural Language Processing 2007, Special Theme: Information Extraction and Enabling Technologies |d Prague : Association for Computational Linguistics (ACL), 2007 |n Piskorski, Jakub ; Tanev, Hristo ; Pouliquen, Bruno ; Steinberger, Ralf |z 978-1-932432-88-6 |g pp. 11-18 | |
700 | 1 | |9 888 |a Tadić, Marko |4 aut | |
856 | |u http://bib.irb.hr/datoteka/300859.BBMT4ACL2007_v4.pdf | ||
942 | |c RZB |u 1 |v Peer review |z Scientific - Lecture - Full paper | ||
999 | |c 315184 |d 315182 |