Information Extraction from Pre-Preprinted Documents
Journal: INTERNATIONAL JOURNAL OF COMPUTERS & DISTRIBUTED SYSTEMS (Vol. 2, No. 1)
Publication Date: 2012-11-15
Authors : Muheet Butt;
Page : 88-93
Keywords : Document Analysis; Knowledge Base; Form Recognition; Data Extraction; Region Analysis; Segmentation; Layout Analysis;
Abstract
Document processing is an important step in office automation. It involves recognizing the static content of a form to decide its type, extracting the variable data entered on the form, and then recognizing and processing that data as required. Information processing from forms is a relatively structured problem, but the recognition part is complex because the entries are usually handwritten. This paper proposes a method in which the form type is recognized from the layout composition of its heading and title, so that the recognition process matches human understanding as closely as possible. For this purpose, a knowledge base of the different forms used in an organization is built by suitable a priori learning. Extracting the information requires recognizing the entries made in the spaces provided on the form. Performance evaluation of the method has shown promising results.
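The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: a knowledge base of known forms, each described by the position of its heading and title, is matched against the heading and title regions detected on an incoming page. Everything here is an assumption made for illustration: the names `Box`, `FormTemplate`, `layout_score`, and `recognize_form`, the use of intersection-over-union as the similarity measure, the 0.5 threshold, and the sample templates. It is not the author's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region in page coordinates normalized to [0, 1]."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        # Inverted boxes (no overlap) yield zero area.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (assumed similarity measure)."""
    inter = Box(max(a.x0, b.x0), max(a.y0, b.y0),
                min(a.x1, b.x1), min(a.y1, b.y1)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


@dataclass(frozen=True)
class FormTemplate:
    """One knowledge-base entry: heading/title layout plus fill-in regions."""
    name: str
    heading: Box
    title: Box
    fields: dict  # hypothetical: field name -> Box of the space to fill in


def layout_score(template: FormTemplate, heading: Box, title: Box) -> float:
    """Average overlap of the detected heading and title with a template."""
    return 0.5 * (iou(template.heading, heading) + iou(template.title, title))


def recognize_form(knowledge_base, heading: Box, title: Box, threshold=0.5):
    """Return the best-matching template, or None if nothing is close enough."""
    best = max(knowledge_base,
               key=lambda t: layout_score(t, heading, title),
               default=None)
    if best is None or layout_score(best, heading, title) < threshold:
        return None
    return best


# Hypothetical knowledge base, standing in for the a priori learning step.
KNOWLEDGE_BASE = [
    FormTemplate("leave_application",
                 Box(0.30, 0.02, 0.70, 0.08), Box(0.35, 0.10, 0.65, 0.14),
                 {"name": Box(0.25, 0.20, 0.90, 0.25),
                  "date": Box(0.25, 0.28, 0.50, 0.33)}),
    FormTemplate("expense_claim",
                 Box(0.05, 0.02, 0.45, 0.08), Box(0.05, 0.10, 0.60, 0.14),
                 {"amount": Box(0.60, 0.40, 0.90, 0.45)}),
]

# Heading and title boxes as they might come from a layout-analysis step.
match = recognize_form(KNOWLEDGE_BASE,
                       Box(0.31, 0.02, 0.69, 0.08),
                       Box(0.35, 0.10, 0.66, 0.14))
```

Once a template is matched, its `fields` entries indicate which regions of the page to crop and pass to a handwriting recognizer, which is the harder step the abstract refers to and is not covered by this sketch.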