Machine learning of the Bayesian belief network as a tool for evaluating the process frequency on social network data
Journal: Scientific and Technical Journal of Information Technologies, Mechanics and Optics (Vol. 21, No. 5)
Publication Date: 2021-10-21
Authors : Toropova A.V.; Abramov M.V.; Tulupyeva T.V.;
Page : 727-737
Keywords : process frequency; frequency estimation; Bayesian belief networks; process episodes; stochastic process;
Abstract
The paper considers the problem of estimating the frequency of processes whose mathematical model is a stochastic process consisting of a series of sequential episodes with a known class of distributions of the length of the time interval between them. In the previously proposed approach, the input data included information about the value of the interval between the last episode and the end of the study period, which could lead to inaccurate results. This interval differs from the intervals between successive episodes, and hence its representation and processing require approaches that take this feature into account. The accuracy of the process frequency estimates was improved by developing a new model based on a Bayesian belief network that includes nodes corresponding to the intervals between the last episodes of the process and to the minimum and maximum intervals between episodes, and by correctly accounting for the value of the interval between the last episode and the end of the study period at the model training stage. The authors propose a Bayesian belief network that includes a random element characterizing the interval between the end of the study period and the last episode of the process during the study period; data on this interval can be available at the training stage. They used the R programming language and the bnlearn package to model the Bayesian belief network. A new approach to the estimation of process frequency based on a Bayesian belief network generated by machine learning methods is proposed. It increases the accuracy of the results by correctly considering the value of the interval between the last episode and the end of the period under study, using a special scheme in the machine learning of the Bayesian belief network which includes a “hypothetical” episode after the end of the study period.
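The interval quantities named in the abstract can be illustrated with a small sketch. It is not the authors' implementation (the paper used R and bnlearn); it is a Python illustration under stated assumptions. The function name `interval_features`, the dictionary keys, and the choice of days as the unit are all hypothetical, and the abstract's "intervals between the last episodes" are reduced here to the single most recent gap. The sketch separates the ordinary gaps between successive episodes from the gap between the last episode and the end of the study period, and shows how an episode observed after the period could close that final gap when training data are prepared.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400.0


def interval_features(episodes, period_end, next_episode=None):
    """Summarise the gaps around a list of episode timestamps, in days.

    All names and keys are hypothetical, chosen for illustration only.
    """
    times = sorted(episodes)
    if len(times) < 2:
        raise ValueError("need at least two episodes to form an interval")

    # Ordinary gaps between successive episodes inside the study period.
    gaps = [
        (later - earlier).total_seconds() / SECONDS_PER_DAY
        for earlier, later in zip(times, times[1:])
    ]
    features = {
        "last_gap": gaps[-1],
        "min_gap": min(gaps),
        "max_gap": max(gaps),
        # Not a full gap: observation stopped before the next episode occurred.
        "tail_gap": (period_end - times[-1]).total_seconds() / SECONDS_PER_DAY,
    }
    if next_episode is not None:
        # Assumed to be known only when preparing training data: the full gap
        # closed by the first episode after the end of the study period.
        features["closing_gap"] = (
            next_episode - times[-1]
        ).total_seconds() / SECONDS_PER_DAY
    return features


posts = [datetime(2020, 1, 1), datetime(2020, 1, 11), datetime(2020, 1, 15)]
feats = interval_features(
    posts,
    period_end=datetime(2020, 1, 31),
    next_episode=datetime(2020, 2, 4),
)
print(feats)
```

In this toy input the tail gap (16 days) is shorter than the gap that the later episode actually closes (20 days), which is the kind of discrepancy the abstract attributes to treating the final interval like an ordinary one.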
To test the proposed approach, data were collected on 5608 Instagram users, including the posting times for 2020 and the time of the first post published in 2021. 70 % of the sample was used to train the model, and the remaining 30 % was used to compare the posting frequency values predicted by the model with known values. The results can be used in various fields of science where a process frequency must be estimated under an information deficit, when the whole process is observed for only a limited time. Obtaining such estimates is often an important issue in medicine, epidemiology, sociology, etc. The approach shows good results when the theoretical model is compared with the results of learning from the social network data, which makes it possible to automate the estimation of process frequency.
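The 70/30 evaluation scheme can be sketched as follows. This is a minimal Python illustration, not the authors' procedure: the abstract does not say how users were assigned to the two parts or which error measure was used, so the seeded shuffle, the rounding-down of the training size, and the mean absolute error are all assumptions, and the names `split_users` and `mean_absolute_error` are hypothetical.

```python
import random


def split_users(user_ids, train_share=0.7, seed=0):
    """Shuffle user ids reproducibly and split them into train and test parts."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_share)  # assumption: training size rounds down
    return ids[:cut], ids[cut:]


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and known frequencies."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# 5608 is the sample size reported in the abstract; the ids are placeholders.
train_ids, test_ids = split_users(range(5608))
print(len(train_ids), len(test_ids))

# Toy numbers only, to show the call; these are not results from the paper.
print(mean_absolute_error([2.0, 5.0], [1.0, 7.0]))
```

Fixing the seed keeps the split reproducible, so that predicted and known frequencies are always compared on the same held-out users.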
Last modified: 2021-10-21 20:02:29