From Digitisation Process to Terminological Digital Resources

Monolingual and multilingual terminology and collocation bases represent valuable additional electronic resources, which can be used in further research, in written communication and in everyday communication. Building such resources can be supported by terminology extraction tools relying on sta...

Permalink: http://skupni.nsk.hr/Record/ffzg.KOHA-OAI-FFZG:318364/Details
Parent publication: Proceedings of the 36th International Convention MIPRO 2013
Rijeka : Croatian Society for Information and Communication Technology, Electronics and Microelectronics - MIPRO, 2013
Main authors: Seljan, Sanja, Dunđer, Ivan (Author), Gašpar, Angelina
Material type: Article
Language: eng
LEADER 02366naa a2200241uu 4500
008 131111s2013 xx 1 eng|d
999 |c 318364  |d 318362 
035 |a (CROSBI)634807 
040 |a HR-ZaFF  |b hrv  |c HR-ZaFF  |e ppiak 
100 1 |9 430  |a Seljan, Sanja 
245 1 0 |a From Digitisation Process to Terminological Digital Resources /  |c Seljan, Sanja ; Dunđer, Ivan ; Gašpar, Angelina. 
246 3 |i Title in English:  |a From Digitisation Process to Terminological Digital Resources 
300 |a xx-xx  |f pp. 
520 |a Monolingual and multilingual terminology and collocation bases represent valuable additional electronic resources, which can be used in further research, in written communication and in everyday communication. Building such resources can be supported by terminology extraction tools relying on statistical or language approaches, or on a hybrid model, but requires considerable human expertise in evaluation and final compilation. The paper describes the whole process: from digitisation of printed material, OCR techniques, sentence alignment and creation of translation memories, up to terminology extraction and evaluation. The performance of the tools and the applied methodology is assessed through the standard statistical measures of precision, recall and F-measure. Experimental results are presented, deficiencies of the semi-automatic statistical and linguistic system are highlighted, and recommendations for further research are suggested. 
536 |a MZOS project  |f 130-1300646-0909 
546 |a ENG 
690 |a 5.04 
693 |a digitization, term and collocation extraction, Multi-Word Unit (MWU), statistical and language approaches, evaluation, English, Croatian  |l hrv  |2 crosbi 
693 |a digitization, term and collocation extraction, Multi-Word Unit (MWU), statistical and language approaches, evaluation, English, Croatian  |l eng  |2 crosbi 
700 1 |a Dunđer, Ivan  |4 aut  |9 1065 
700 1 |a Gašpar, Angelina  |4 aut 
773 0 |a International Convention on Information and Communication Technology, Electronics and Microelectronics (20-24.05.2013. ; Opatija, Croatia)  |t Proceedings of the 36th International Convention MIPRO 2013  |d Rijeka : Croatian Society for Information and Communication Technology, Electronics and Microelectronics - MIPRO, 2013  |n Biljanović, P.  |g pp. xx-xx 
942 |c RZB  |u 2  |v Peer review  |z Scientific - Lecture - Full paper  |t 1.08