Automatic Generation of Association Thesaurus Based on Domain-Specific Text Collection
Proceeding: 10th International Academic Conference (IAC)
Publication Date: 2014-06-03
Authors : Nugumanova Aliya; Issabaeva Dinara; Baiburin Yerzhan;
Page : 529-538
Keywords : LSA; thesaurus; chi-square test; graph;
Abstract
This work examines a distributional approach to the automatic generation of associative thesauri for a specific domain. The distributional approach rests on the assumption that an associative link between domain terms is indicated by the statistics of their co-occurrence in thematically related discourse. Its advantage is that it works from raw material (for example, a collection of domain documents) and requires no additional knowledge about the domain. Because it relies only on statistical methods and takes neither lexical nor semantic information into account, it can cover a broad lexical space of terms. This, however, leads to its main shortcoming: it produces an excessive number of “unnecessary” links between words that are of little practical value. To address this problem, the work proposes an approach that combines methods of distributional statistics, latent semantic analysis, and graph theory.
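As a rough illustration of the co-occurrence idea described in the abstract, the sketch below scores term pairs with a Pearson chi-square test on a 2x2 contingency table and keeps only the significant, positively associated pairs as edges of a graph. It is not the authors' implementation: the abstract does not specify the procedure, so the document-level co-occurrence window, the whitespace tokenization, the 3.841 threshold (the chi-square critical value for one degree of freedom at p = 0.05), and the function names `chi_square` and `association_graph` are all assumptions made for the example. The latent semantic analysis step is omitted.

```python
from itertools import combinations


def chi_square(n11, n10, n01, n00):
    """Pearson chi-square statistic for a 2x2 contingency table.

    n11: documents containing both terms, n10: only the first term,
    n01: only the second term, n00: neither term.
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:
        # A term that occurs in every document (or in none) carries no signal.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def association_graph(docs, threshold=3.841):
    """Build an undirected term graph from document-level co-occurrence.

    Assumption: an edge is kept when the pair is positively associated
    and its chi-square score reaches the threshold.
    """
    doc_sets = [set(d.lower().split()) for d in docs]
    vocab = sorted(set().union(*doc_sets))
    total = len(doc_sets)
    graph = {term: set() for term in vocab}
    for a, b in combinations(vocab, 2):
        n11 = sum(1 for s in doc_sets if a in s and b in s)
        n10 = sum(1 for s in doc_sets if a in s and b not in s)
        n01 = sum(1 for s in doc_sets if b in s and a not in s)
        n00 = total - n11 - n10 - n01
        positive = n11 * n00 > n10 * n01  # co-occur more often than chance
        if positive and chi_square(n11, n10, n01, n00) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Toy corpus (invented for illustration only).
docs = [
    "the matrix vector",
    "the matrix vector",
    "the matrix vector",
    "the graph edge",
    "the graph edge",
    "the graph edge",
]
graph = association_graph(docs)
```

In this toy corpus, "matrix" and "vector" always appear together and end up linked, while "the", which occurs in every document, receives no links. This mirrors the abstract's concern about filtering out uninformative associations, though on real collections a larger corpus and a carefully chosen threshold would be needed.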
Last modified: 2015-03-07 19:44:21