Automatic construction of the dialog tree based on unmarked text corpora in Russian
Journal: Scientific and Technical Journal of Information Technologies, Mechanics and Optics (Vol. 21, No. 5)
Publication Date: 2021-10-21
Authors : Feldina E.A.; Makhnytkina O.V.
Page : 709-719
Keywords : dialog tree; dialog system; machine learning; cluster analysis; natural language processing;
Abstract
In this paper, we propose a method for automatically determining the tree structure and the key topics of nodes when building a dialog tree from unlabeled text corpora. Building a dialog tree is one of the time-consuming tasks in creating an automatic dialog system; in most cases it relies on manual annotation, which requires considerable time and resources. The proposed method of hierarchical clustering of dialogs takes into account the semantic proximity of messages, allows a different number of nodes to be allocated at each level of the hierarchy, and limits the width and depth of the dialog tree. The algorithm for annotating the nodes of the dialog tree takes the hierarchy of topics into account by building thematic chains. The method combines natural language processing techniques (tokenization, lemmatization, part-of-speech tagging, word embeddings, etc.), principal component analysis for dimensionality reduction, and cluster analysis. Experiments on constructing the structure of the dialog tree and annotating its nodes have shown the potential of the proposed method for automatic dialog tree construction. On a reference dialog tree containing 13 nodes at the first level, 381 nodes at the second level and 299 nodes at the third level, the recognition accuracy was 0.8, 0.7 and 0.5, respectively. Automatic construction of dialog trees can be useful in developing automatic dialog systems and in improving the quality of generated answers to user questions.
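The idea of hierarchical clustering with width and depth limits can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, so the greedy threshold-based ("leader") clustering below is a swapped-in stand-in, not the authors' method. The function names (`leader_clusters`, `build_tree`), the per-level similarity thresholds and the hand-made three-dimensional vectors (standing in for word embeddings after dimensionality reduction) are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def leader_clusters(items, threshold, max_width):
    """Greedy clustering (an illustrative choice, not the paper's algorithm).

    Each item joins the most similar existing cluster if the similarity
    reaches the threshold or the width limit is already hit; otherwise it
    starts a new cluster. The number of clusters therefore varies by node.
    """
    clusters = []  # each cluster is a list of (text, vector) pairs
    for item in items:
        sims = [cosine(item[1], mean([v for _, v in c])) for c in clusters]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and (sims[best] >= threshold or len(clusters) >= max_width):
            clusters[best].append(item)
        else:
            clusters.append([item])
    return clusters

def build_tree(items, thresholds, max_width, level=0):
    """Recursively split messages; depth is capped by len(thresholds)."""
    node = {"messages": [text for text, _ in items], "children": []}
    if level >= len(thresholds) or len(items) < 2:
        return node
    groups = leader_clusters(items, thresholds[level], max_width)
    if len(groups) > 1:  # a single group means no further split is possible
        node["children"] = [
            build_tree(g, thresholds, max_width, level + 1) for g in groups
        ]
    return node

# Hypothetical toy data: (message, hand-made embedding vector).
items = [
    ("card blocked", [1.0, 0.1, 0.0]),
    ("card lost", [0.9, 0.3, 0.0]),
    ("loan rate", [0.0, 1.0, 0.1]),
    ("loan term", [0.1, 0.9, 0.3]),
]

# Two levels: a loose threshold for broad topics, a strict one for subtopics.
tree = build_tree(items, thresholds=[0.8, 0.99], max_width=3)
for child in tree["children"]:
    print(child["messages"], "->", [c["messages"] for c in child["children"]])
```

In this sketch the length of `thresholds` bounds the depth of the tree and `max_width` bounds the number of children per node, mirroring the width and depth limits mentioned in the abstract. A real pipeline would replace the toy vectors with embeddings of lemmatized messages.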