Machine Learning For Real Estate Contracts Automatic Categorization of Text
Journal: International Journal of Computer Techniques (Vol. 3, No. 2)
Publication Date: 2016-03-01
Authors : C. Mani; J. Jayasudha
Page : 34-39
Keywords : Artificial Intelligence; Machine Learning; Mining; Automatic text classification; feature extraction; pre-processing; text mining; Natural Language Processing;
Abstract
Automatic text classification is a machine learning task that assigns a given document to one or more pre-defined categories based on its textual content and mined features. It has important applications in content management, contextual search, estimation mining, product review analysis, spam filtering and text sentiment mining. This paper explains the generic strategy for automatic text classification and analyses existing solutions to major issues such as dealing with unstructured text, handling a large number of features and selecting a machine learning technique appropriate to the text-classification application. Three kinds of model are considered: statistical, rule-based and hybrid. A statistical model is built from training text configured for each category. A rule-based model is built from four lists of terms per category: positive terms (mandatory terms), negative terms (excluding terms), relevant terms and irrelevant terms. A hybrid model combines the statistical and rule-based models and gives the most accurate results. A model is first created as a statistical model; terms are then added during fine-tuning to refine its results, so the final model is a hybrid. We discuss in detail the issues pertaining to three different problems, namely document representation, classifier construction and classifier evaluation.
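The hybrid model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a statistical technique, so a small multinomial Naive Bayes classifier is assumed here, and the way relevant and irrelevant terms adjust the score (a fixed hypothetical `weight`) is also an assumption. The names `StatisticalModel`, `HybridModel`, the toy training sentences and the rule lists are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class StatisticalModel:
    """Multinomial Naive Bayes with add-one smoothing (an assumed choice)."""

    def fit(self, labelled_texts):
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        for text, category in labelled_texts:
            self.doc_counts[category] += 1
            self.word_counts[category].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def scores(self, text):
        """Return a log-probability score for every known category."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            result[category] = score
        return result


class HybridModel:
    """Statistical scores filtered and adjusted by per-category term rules."""

    def __init__(self, statistical, rules, weight=2.0):
        self.statistical = statistical
        self.rules = rules      # category -> dict of term lists
        self.weight = weight    # hypothetical score adjustment per matched term

    def classify(self, text):
        tokens = set(tokenize(text))
        scores = self.statistical.scores(text)
        for category in list(scores):
            rule = self.rules.get(category, {})
            # Positive terms are mandatory: all of them must be present.
            if not set(rule.get("positive", [])) <= tokens:
                del scores[category]
                continue
            # Negative terms exclude the category when any of them is present.
            if set(rule.get("negative", [])) & tokens:
                del scores[category]
                continue
            # Assumed semantics: relevant terms raise the score, irrelevant lower it.
            scores[category] += self.weight * len(set(rule.get("relevant", [])) & tokens)
            scores[category] -= self.weight * len(set(rule.get("irrelevant", [])) & tokens)
        return max(scores, key=scores.get) if scores else None


# Toy training text configured per category (illustrative only).
training = [
    ("tenant shall pay monthly rent to the landlord", "lease"),
    ("the lease term begins on the first day of the month", "lease"),
    ("buyer agrees to pay the purchase price at closing", "sale"),
    ("seller shall transfer title to the buyer", "sale"),
]
# Terms added during fine-tuning turn the statistical model into a hybrid one.
rules = {
    "lease": {"negative": ["purchase"], "relevant": ["tenant", "landlord"]},
    "sale": {"positive": ["buyer"], "relevant": ["seller", "title"]},
}
model = HybridModel(StatisticalModel().fit(training), rules)
print(model.classify("the tenant pays rent to the landlord"))
```

In this sketch a category whose mandatory terms are missing, or whose excluding terms appear, is dropped before the statistical scores are compared, so `classify` returns `None` when every category is ruled out.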