Comparing measures of semantic similarity

The aim of this paper is to compare different methods for automatic extraction of semantic similarity measures from corpora. The semantic similarity measure is proven to be very useful for many tasks in natural language processing like information retrieval, information extraction, machine translati...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:317007/Details
Host publication: Proceedings of the 30th International Conference on Information Technology Interfaces
Institute of Electrical and Electronics Engineers (IEEE), 2008
Main authors: Ljubešić, Nikola, computer scientist, Bakarić, Nikola (Author), Njavro, Jasmina, Boras, Damir
Material type: Article
Language: English
Online access: http://bib.irb.hr/datoteka/507889.ljubesic08-comparing.pdf
LEADER 02328naa a2200265uu 4500
008 131111s2008 xx 1 eng|d
035 |a (CROSBI)507889 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |9 445  |a Ljubešić, Nikola,   |c informatičar 
245 1 0 |a Comparing measures of semantic similarity /  |c Ljubešić, Nikola ; Boras, Damir ; Bakarić, Nikola ; Njavro, Jasmina. 
246 3 |i Naslov na engleskom:  |a Comparing measures of semantic similarity 
300 |a 675-682  |f str. 
520 |a The aim of this paper is to compare different methods for automatic extraction of semantic similarity measures from corpora. The semantic similarity measure is proven to be very useful for many tasks in natural language processing like information retrieval, information extraction, machine translation etc. Additionally, one of the main problems in natural language processing is data sparseness since no language sample is large enough to seize all possible language combinations. In our research we experiment with four different measures of association with context and eight different measures of vector similarity. The results show that the Jensen-Shannon divergence and L1 and L2 norm outperform other measures of vector similarity regardless of the measure of association with context used. Maximum likelihood estimate and t-test show better results than other measures of association with context. 
536 |a Projekt MZOS  |f 130-1301679-1380 
546 |a ENG 
690 |a 5.04 
693 |a calculating semantic similarity, context, association measures, similarity measures  |l hrv  |2 crosbi 
693 |a calculating semantic similarity, context, association measures, similarity measures  |l eng  |2 crosbi 
700 1 |a Bakarić, Nikola  |4 aut 
700 1 |a Njavro, Jasmina  |4 aut 
700 1 |9 418  |a Boras, Damir  |4 aut 
773 0 |a 30th International Conference on Information Technology Interfaces (23-26.06.2008. ; Dubrovnik, Hrvatska)  |t Proceedings of the 30th International Conference on Information Technology Interfaces  |d Institute of Electrical and Electronics Engineers (IEEE), 2008  |n Hljuz Dobrić, Vesna  |x 1330-1012  |z 978-953-7138-12-7  |g str. 675-682 
856 |u http://bib.irb.hr/datoteka/507889.ljubesic08-comparing.pdf 
942 |c RZB  |u 2  |v Recenzija  |z Znanstveni - Predavanje - CijeliRad  |t 1.08 
999 |c 317007  |d 317005
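
Note on the measures named in the abstract (field 520): the abstract reports that the Jensen-Shannon divergence and the L1 and L2 norms performed best among the vector similarity measures. The following is a minimal Python sketch of the textbook definitions of these three measures, not the authors' implementation. The function names, the use of base-2 logarithms, and the toy co-occurrence counts are assumptions made for illustration only; the record does not say how the context vectors were built or normalized.

```python
import math


def normalize(counts):
    """Turn raw co-occurrence counts into a probability distribution.

    Assumes a non-empty vector with a positive sum.
    """
    total = sum(counts)
    return [c / total for c in counts]


def l1_distance(p, q):
    """L1 norm (Manhattan distance) of the difference of two vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def l2_distance(p, q):
    """L2 norm (Euclidean distance) of the difference of two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence to the midpoint distribution."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical context counts for two words over the same four context features.
word_a = normalize([4, 3, 2, 1])
word_b = normalize([3, 3, 2, 2])

print(l1_distance(word_a, word_b))
print(l2_distance(word_a, word_b))
print(js_divergence(word_a, word_b))
```

All three are distances, so lower values mean more similar words. With base-2 logarithms the Jensen-Shannon divergence is bounded between 0 (identical distributions) and 1 (distributions with no shared context features).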