
WEB USER PROFILE IMPROVISATION BY SAMPLING SITE STYLE TREE WITH DOM STRUCTURE AND NEURAL NETWORK

Journal: International Journal of Advanced Research in Engineering and Technology (IJARET) (Vol.11, No. 10)

Publication Date:

Authors : ;

Page : 161-170

Keywords : Noisy data; Site Style Tree; DOM; Neural Network; Back Propagation;


Abstract

The web is today the fastest-growing and most open medium of knowledge. It holds content from numerous areas, including multimedia, structured, semi-structured and unstructured data, through which users can reach relevant knowledge. Within a given application, however, only part of this information is relevant; the remainder is regarded as noise. Web pages carry formatting code, navigation links, advertisements and so forth. This unwanted noise surrounding the specific content of a web page makes it more challenging to retrieve and process the details automatically, so usable noise-free data must be extracted. In this research work, we introduce an observation-based technique for noise removal. Noisy blocks typically share similar content and design styles across a given website, while the key content blocks of the website sometimes vary in their content or design styles. On this basis, a tree structure called the Style Tree (ST) is proposed to capture the existing design styles and page content of a specific website. An ST for the whole site, which we call the Site Style Tree (SST), can be generated by sampling the pages of the website. This step is followed by the application of a Neural Network (NN) to obtain content knowledge, in combination with three frameworks classified with the Document Object Model (DOM). The neural network used in our method is trained with the Back-Propagation (BP) method. Data were obtained from various web servers for training and testing. The classification results of the BP-NN were used to remove different noise variations from web pages. Experiments show that our approach extracts informative content from the web pages of these websites efficiently. The proposed work is compared with the existing work Noise Web Data Learning (NWDL) on the parameters of noise classification and accuracy, and it produces a good level of accuracy.
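
The observation behind the Site Style Tree can be illustrated with a minimal sketch. It is not the authors' implementation: it flattens each sampled page into (DOM path, text) pairs instead of building a full tree, and it treats a block as noisy when the same pair recurs on most sampled pages. The names BlockCollector, site_style_counts and noisy_blocks, the 0.8 threshold, and the sample pages are all assumptions made for illustration; the BP-NN classification step is not covered.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the path.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class BlockCollector(HTMLParser):
    """Collects (DOM path, text) pairs for every text-bearing element of one page."""

    def __init__(self):
        super().__init__()
        self.stack = []      # current path from the root, e.g. ["html", "body", "div.nav"]
        self.blocks = set()  # unique (path, text) pairs seen on this page

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("class") or ""
        # The tag plus its class attribute stands in for the "style" of a node.
        self.stack.append(f"{tag}.{style}" if style else tag)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag (tolerates unclosed children).
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and self.stack:
            self.blocks.add(("/".join(self.stack), text))


def site_style_counts(pages):
    """Counts, for each (path, text) block, how many sampled pages contain it."""
    counts = Counter()
    for html in pages:
        collector = BlockCollector()
        collector.feed(html)
        collector.close()
        counts.update(collector.blocks)
    return counts


def noisy_blocks(pages, threshold=0.8):
    """Returns blocks that recur on at least `threshold` of the sampled pages."""
    counts = site_style_counts(pages)
    return {block for block, n in counts.items() if n / len(pages) >= threshold}


# Hypothetical sample: three pages sharing a navigation bar and a footer.
TEMPLATE = (
    '<html><body><div class="nav">Home | About</div>'
    '<div class="main"><p>{body}</p></div>'
    '<div class="footer">Copyright Example</div></body></html>'
)
pages = [TEMPLATE.format(body=f"Article {i} text") for i in (1, 2, 3)]
noise = noisy_blocks(pages)
```

With these sample pages, the navigation and footer blocks recur on every page and are flagged, while the article paragraphs differ from page to page and are kept. A fuller treatment would keep the tree structure and feed per-node features to a classifier, as the abstract describes.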

Last modified: 2021-02-20 20:49:14