Treebank Translation for Cross-Lingual Parser Induction

Cross-lingual learning has become a popular approach to facilitate the development of resources and tools for low density languages. Its underlying idea is to make use of existing tools and annotations in resource-rich languages to create similar tools and resources for resource-poor languages. Typi...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:335577/Details
Parent publication: Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL 2014)
Baltimore, Maryland, USA : Association for Computational Linguistics, 2014
Main authors: Tiedemann, Jörg (-), Nivre, Joakim (Author), Agić, Željko
Material type: Article
Language: eng
Online access: http://bib.irb.hr/datoteka/701057.tiedemann2014-treebank.pdf
http://aclweb.org/anthology/W/W14/W14-1614.pdf
LEADER 02568naa a22002777i 4500
005 20170115202523.0
008 150109s2014 mdu 1 eng|d
035 |a (CROSBI)701057 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |a Tiedemann, Jörg 
245 1 0 |a Treebank Translation for Cross-Lingual Parser Induction /  |c Tiedemann, Jörg ; Agić, Željko ; Nivre, Joakim. 
246 3 |i Title in English:  |a Treebank Translation for Cross-Lingual Parser Induction 
300 |a 130-140  |f pp. 
520 |a Cross-lingual learning has become a popular approach to facilitate the development of resources and tools for low density languages. Its underlying idea is to make use of existing tools and annotations in resource-rich languages to create similar tools and resources for resource-poor languages. Typically, this is achieved by either projecting annotations across parallel corpora, or by transferring models from one or more source languages to a target language. In this paper, we explore a third strategy by using machine translation to create synthetic training data from the original source-side annotations. Specifically, we apply this technique to dependency parsing, using a cross-lingually unified treebank for adequate evaluation. Our approach draws on annotation projection but avoids the use of noisy source-side annotation of an unrelated parallel corpus and instead relies on manual treebank annotation in combination with statistical machine translation, which makes it possible to train fully lexicalized parsers. We show that this approach significantly outperforms delexicalized transfer parsing. 
536 |a MZOS project  |f 130-1300646-1776 
546 |a ENG 
690 |a 5.04 
693 |a treebank translation, cross-lingual parsing, parser induction  |l hrv  |2 crosbi 
693 |a treebank translation, cross-lingual parsing, parser induction  |l eng  |2 crosbi 
700 1 |a Nivre, Joakim  |4 aut 
700 1 |9 495  |a Agić, Željko  |4 aut 
773 0 |a Eighteenth Conference on Computational Natural Language Learning (CoNLL 2014) (26-27.06.2014 ; Baltimore, USA)  |t Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL 2014)  |d Baltimore, Maryland, USA : Association for Computational Linguistics, 2014  |z 978-1-941643-02-0  |g pp. 130-140 
856 |u http://bib.irb.hr/datoteka/701057.tiedemann2014-treebank.pdf 
856 |u http://aclweb.org/anthology/W/W14/W14-1614.pdf 
942 |c RZB  |u 2  |v Peer review  |z Scientific - Lecture - Full paper  |t 1.08 
999 |c 335577  |d 335574