Bootstrapping bilingual lexicons from comparable corpora for closely related languages
In this paper we present an approach to bootstrap a Croatian-Slovene bilingual lexicon from comparable news corpora from scratch, without relying on any external bilingual knowledge resource. Instead of using a dictionary to translate context vectors, we build a seed lexicon from identical words in...
Permalink: | http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:312925 |
---|---|
Parent publication: | Text, Speech and Dialogue : 14th International Conference, TSD 2011, Pilsen, Czech Republic, September 1-5, 2011 : Proceedings. Lecture Notes in Computer Science |
Main authors: | Ljubešić, Nikola, computer scientist (-); Fišer, Darja (Author) |
Material type: | Article |
Language: | eng |
Online access: | http://www.springerlink.com/content/n5m86t5h212h2753/ |