Lemmatization and morphosyntactic tagging of Croatian and Serbian

We investigate state-of-the-art statistical models for lemmatization and morphosyntactic tagging of Croatian and Serbian. The models stem from a new manually annotated SETIMES.HR corpus of Croatian, based on the SETimes parallel corpus. We train models on Croatian text and evaluate them on samples o...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:318385/Description
Host publication: Proceedings of the 4th Biennial International Workshop on Balto-Slavic Natural Language Processing
Sofia, Bulgaria : Association for Computational Linguistics, 2013
Main authors: Agić, Željko; Merkler, Danijela (Author); Ljubešić, Nikola (computer scientist)
Material type: Article
Language: English
Online access: http://aclweb.org/anthology-new/W/W13/W13-2408.pdf