Comparative Analysis of Automatic Term and Collocation Extraction

Monolingual and multilingual terminology and collocation bases that cover a specific domain, whether used independently or integrated with other resources, have become valuable electronic resources. Building such resources could be assisted by automatic term extraction tools, combining statistical and li...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:316504/Details
Parent publication: 2nd international conference The future of information sciences (INFuture 2009) : Digital resources and knowledge sharing
Zagreb : Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, 2009
Main authors: Seljan, Sanja (-), Dalbelo Bašić, Bojana (Author), Šnajder, Jan, Delač, Davor, Šamec-Gjurin, Matija, Crnec, Dina
Material type: Article
Language: eng
LEADER 02497naa a2200301uu 4500
008 131111s2009 xx 1 eng|d
035 |a (CROSBI)440555 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |9 430  |a Seljan, Sanja 
245 1 0 |a Comparative Analysis of Automatic Term and Collocation Extraction /  |c Seljan, Sanja ; Dalbelo Bašić, Bojana ; Šnajder, Jan ; Delač, Davor ; Šamec-Gjurin, Matija ; Crnec, Dina. 
246 3 |i Title in English:  |a Comparative Analysis of Automatic Term and Collocation Extraction 
300 |a 219-228  |f pp. 
520 |a Monolingual and multilingual terminology and collocation bases that cover a specific domain, whether used independently or integrated with other resources, have become valuable electronic resources. Building such resources could be assisted by automatic term extraction tools that combine statistical and linguistic approaches. This paper presents research on term extraction from a monolingual corpus. The corpus consists of publicly accessible English legislative documents. The results of two hybrid approaches are compared: extraction using the TermeX tool, and an automatic statistical extraction procedure followed by linguistic filtering with an open-source linguistic engineering tool. The results are evaluated using precision, recall, and F-measure. 
536 |a MZOS project  |f 036-1300646-1986 
536 |a MZOS project  |f 130-1300646-0909 
546 |a ENG 
690 |a 5.04 
690 |a 6.03 
693 |a automatic extraction, term and collocation base, English language, evaluation metrics  |l hrv  |2 crosbi 
693 |a automatic extraction, term and collocation base, English language, evaluation metrics  |l eng  |2 crosbi 
700 1 |a Dalbelo Bašić, Bojana  |4 aut 
700 1 |a Šnajder, Jan  |4 aut 
700 1 |a Delač, Davor  |4 aut 
700 1 |a Šamec-Gjurin, Matija  |4 aut 
700 1 |a Crnec, Dina  |4 aut 
773 0 |a International Conference The Future of Information Sciences (2 ; 2009) (04.-06.11.2009. ; Zagreb, Croatia)  |t 2nd international conference The future of information sciences (INFuture 2009) : Digital resources and knowledge sharing  |d Zagreb : Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, 2009  |n Stančić, H. ; Seljan, S. ; Bawden, D. ; Lasić-Lazić, J. ; Slavić, A.  |z 978-953-175-355-5  |g pp. 219-228 
942 |c RZB  |u 2  |v Peer review  |z Scientific - Poster - Full paper  |t 1.08 
999 |c 316504  |d 316502