Comparing measures of semantic similarity
Permalink: | http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:317007/Details |
---|---|
Published in: | Proceedings of the 30th International Conference on Information Technology Interfaces, Institute of Electrical and Electronics Engineers (IEEE), 2008 |
Main authors: | Ljubešić, Nikola, computer scientist; Bakarić, Nikola (Author); Njavro, Jasmina; Boras, Damir |
Material type: | Article |
Language: | eng |
Online access: | http://bib.irb.hr/datoteka/507889.ljubesic08-comparing.pdf |
LEADER | 02328naa a2200265uu 4500 | ||
---|---|---|---|
008 | 131111s2008 xx 1 eng|d | ||
035 | |a (CROSBI)507889 | ||
040 | |a HR-ZaFF |b hrv |c HR-ZaFF |e ppiak | ||
100 | 1 | |9 445 |a Ljubešić, Nikola, |c computer scientist | |
245 | 1 | 0 | |a Comparing measures of semantic similarity / |c Ljubešić, Nikola ; Boras, Damir ; Bakarić, Nikola ; Njavro, Jasmina. |
246 | 3 | |i Title in English: |a Comparing measures of semantic similarity | |
300 | |a 675-682 |f pp. | ||
520 | |a The aim of this paper is to compare different methods for automatic extraction of semantic similarity measures from corpora. Semantic similarity measures have proven very useful for many natural language processing tasks, such as information retrieval, information extraction, and machine translation. Additionally, one of the main problems in natural language processing is data sparseness, since no language sample is large enough to capture all possible language combinations. In our research we experiment with four different measures of association with context and eight different measures of vector similarity. The results show that the Jensen-Shannon divergence and the L1 and L2 norms outperform other measures of vector similarity regardless of the measure of association with context used. Maximum likelihood estimate and t-test show better results than other measures of association with context. | ||
536 | |a MZOS project |f 130-1301679-1380 | ||
546 | |a ENG | ||
690 | |a 5.04 | ||
693 | |a calculating semantic similarity, context, association measures, similarity measures |l hrv |2 crosbi | ||
693 | |a calculating semantic similarity, context, association measures, similarity measures |l eng |2 crosbi | ||
700 | 1 | |a Bakarić, Nikola |4 aut | |
700 | 1 | |a Njavro, Jasmina |4 aut | |
700 | 1 | |9 418 |a Boras, Damir |4 aut | |
773 | 0 | |a 30th International Conference on Information Technology Interfaces (23-26.06.2008. ; Dubrovnik, Croatia) |t Proceedings of the 30th International Conference on Information Technology Interfaces |d Institute of Electrical and Electronics Engineers (IEEE), 2008 |n Hljuz Dobrić, Vesna |x 1330-1012 |z 978-953-7138-12-7 |g pp. 675-682 | |
856 | |u http://bib.irb.hr/datoteka/507889.ljubesic08-comparing.pdf | ||
942 | |c RZB |u 2 |v Peer review |z Scientific - Lecture - Full paper |t 1.08 | ||
999 | |c 317007 |d 317005 |
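
The abstract in field 520 names three vector similarity measures (Jensen-Shannon divergence, L1 norm, L2 norm) and the maximum likelihood estimate as a measure of association with context. The following is a minimal sketch of how such measures might be computed over sparse context vectors. It is not the authors' implementation: the function names, the toy counts, and the reading of "maximum likelihood estimate" as the relative frequency of a context given a word are all assumptions made for illustration.

```python
import math


def mle_association(counts):
    """Assumed reading of the MLE association measure:
    relative frequency of each context given the word."""
    total = sum(counts.values())
    return {ctx: n / total for ctx, n in counts.items()}


def l1_distance(p, q):
    """L1 norm of the difference between two sparse vectors."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def l2_distance(p, q):
    """L2 (Euclidean) norm of the difference between two sparse vectors."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; both inputs must sum to 1."""
    total = 0.0
    for k in set(p) | set(q):
        pk, qk = p.get(k, 0.0), q.get(k, 0.0)
        mk = 0.5 * (pk + qk)  # mixture distribution
        if pk > 0:
            total += 0.5 * pk * math.log2(pk / mk)
        if qk > 0:
            total += 0.5 * qk * math.log2(qk / mk)
    return total


# Hypothetical co-occurrence counts, not data from the paper.
cat = mle_association({"purr": 4, "eat": 3, "sleep": 3})
dog = mle_association({"bark": 4, "eat": 3, "sleep": 3})
car = mle_association({"drive": 6, "park": 4})

for name, measure in [("L1", l1_distance),
                      ("L2", l2_distance),
                      ("JS", js_divergence)]:
    print(name, "cat-dog:", measure(cat, dog), "cat-car:", measure(cat, car))
```

All three functions are distances, so lower values mean more similar words. In this toy data "cat" and "dog" share two contexts and come out closer than "cat" and "car", which share none.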