MARC: Building named entity recognition models for Croatian and Slovene

Building named entity recognition models for Croatian and Slovene

The paper presents efforts in developing freely available models for named entity recognition and classification for Croatian and Slovene. Our experiments focus on the most informative set of linguistic features taking into account the availability of language tools for the languages in question....

Permalink:	http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:318228/Details
Host publication:	Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference Ljubljana : 2012
Main authors:	Ljubešić, Nikola, computer scientist (-), Stupar, Marija (Author), Jurić, Tereza
Material type:	Article
Language:	eng


LEADER	02099naa a2200253uu 4500
008	131111s2012 xx 1 eng\|d
035			\|a (CROSBI)616808
040			\|a HR-ZaFF \|b hrv \|c HR-ZaFF \|e ppiak
100	1		\|9 445 \|a Ljubešić, Nikola, \|c informatičar
245	1	0	\|a Building named entity recognition models for Croatian and Slovene / \|c Ljubešić, Nikola ; Stupar, Marija ; Jurić, Tereza.
246	3		\|i Naslov na engleskom: \|a Building Named Entity Recognition Models For Croatian And Slovene
300			\|a 129-134 \|f str.
520			\|a The paper presents efforts in developing freely available models for named entity recognition and classification for Croatian and Slovene. Our experiments focus on the most informative set of linguistic features taking into account the availability of language tools for the languages in question. Beside the classic linguistic features, distributional similarity features calculated from large unannotated monolingual corpora are exploited as well. Using distributional information improves the results for 7-8 points in F1 while adding morphological information improves the results for additional 3-4 points in both languages. The best performing models, along with test sets for comparison with future and existing systems and a HunPos part-of-speech model for Croatian are available for download for academic usage.
536			\|a Projekt MZOS \|f 130-1301679-1380
536			\|a Projekt MZOS \|f FP7-271022
546			\|a ENG
690			\|a 5.04
693			\|a named entity recognition, distributional similarity, Croatian language, Slovene language \|l hrv \|2 crosbi
693			\|a named entity recognition, distributional similarity, Croatian language, Slovene language \|l eng \|2 crosbi
700	1		\|a Stupar, Marija \|4 aut
700	1		\|a Jurić, Tereza \|4 aut
773	0		\|a Eighth LANGUAGE TECHNOLOGIES Conference (8.-9.10.2012. ; Ljubljana, Slovenija) \|t Proceedings of the Eighth LANGUAGE TECHNOLOGIES Conference \|d Ljubljana : 2012 \|n Erjavec, Tomaž ; Žganec Gros, Jerneja \|g str. 129-134
942			\|c RZB \|u 2 \|v Recenzija \|z Znanstveni - Predavanje - CijeliRad \|t 1.08
999			\|c 318228 \|d 318226
