Comparative Analysis of Automatic Term and Collocation Extraction
Monolingual and multilingual terminology and collocation bases, covering a specific domain, used independently or integrated with other resources, have become a valuable electronic resource. Building of such resources could be assisted by automatic term extraction tools, combining statistical and li...
Permalink: | http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:316504/Details |
---|---|
Parent publication: | 2nd international conference The future of information sciences (INFuture 2009) : Digital resources and knowledge sharing Zagreb : Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, 2009 |
Main authors: | Seljan, Sanja (-), Dalbelo Bašić, Bojana (Author), Šnajder, Jan, Delač, Davor, Šamec-Gjurin, Matija, Crnec, Dina |
Material type: | Article |
Language: | eng |
LEADER | 02497naa a2200301uu 4500 | ||
---|---|---|---|
008 | 131111s2009 xx 1 eng|d | ||
035 | |a (CROSBI)440555 | ||
040 | |a HR-ZaFF |b hrv |c HR-ZaFF |e ppiak | ||
100 | 1 | |9 430 |a Seljan, Sanja | |
245 | 1 | 0 | |a Comparative Analysis of Automatic Term and Collocation Extraction / |c Seljan, Sanja ; Dalbelo Bašić, Bojana ; Šnajder, Jan ; Delač, Davor ; Šamec-Gjurin, Matija ; Crnec, Dina. |
246 | 3 | |i Title in English: |a Comparative Analysis of Automatic Term and Collocation Extraction | |
300 | |a 219-228 |f pp. | ||
520 | |a Monolingual and multilingual terminology and collocation bases, covering a specific domain, used independently or integrated with other resources, have become a valuable electronic resource. Building of such resources could be assisted by automatic term extraction tools, combining statistical and linguistic approaches. In this paper, the research on term extraction from monolingual corpus is presented. The corpus consists of publicly accessible English legislative documents. In the paper, results of two hybrid approaches are compared: extraction using the TermeX tool and an automatic statistical extraction procedure followed by linguistic filtering through the open source linguistic engineering tool. The results have been elaborated through statistical measures of precision, recall, and F-measure. | ||
536 | |a Projekt MZOS |f 036-1300646-1986 | ||
536 | |a Projekt MZOS |f 130-1300646-0909 | ||
546 | |a ENG | ||
690 | |a 5.04 | ||
690 | |a 6.03 | ||
693 | |a automatic extraction, term and collocation base, English language, evaluation metrics |l hrv |2 crosbi | ||
693 | |a automatic extraction, term and collocation base, English language, evaluation metrics |l eng |2 crosbi | ||
700 | 1 | |a Dalbelo Bašić, Bojana |4 aut | |
700 | 1 | |a Šnajder, Jan |4 aut | |
700 | 1 | |a Delač, Davor |4 aut | |
700 | 1 | |a Šamec-Gjurin, Matija |4 aut | |
700 | 1 | |a Crnec, Dina |4 aut | |
773 | 0 | |a International Conference The Future of Information Sciences (2 ; 2009) (04.-06.11.2009. ; Zagreb, Croatia) |t 2nd international conference The future of information sciences (INFuture 2009) : Digital resources and knowledge sharing |d Zagreb : Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, 2009 |n Stančić, H. ; Seljan, S. ; Bawden, D. ; Lasić-Lazić, J. ; Slavić, A. |z 978-953-175-355-5 |g pp. 219-228 | |
942 | |c RZB |u 2 |v Peer review |z Scientific - Poster - Full paper |t 1.08 | ||
999 | |c 316504 |d 316502 |