Information Extraction from Pre-Preprinted Documents

Journal: INTERNATIONAL JOURNAL OF COMPUTERS & DISTRIBUTED SYSTEMS (Vol.2, No. 1)

Publication Date: 2012-11-15

Authors : Muheet Butt;

Page : 88-93

Keywords : Document Analysis; Knowledge Base; Form Recognition; Data Extraction; Region Analysis; Segmentation; Layout Analysis;

Abstract

Document processing is an important step in office automation. It involves recognizing the static content of a form to determine its type, extracting the variable data entered on the form, and then processing that data as required. Information processing from forms is a relatively structured problem, but the recognition part is complex because the entries are usually handwritten. This paper proposes a method in which the form type is recognized from the layout composition of its headings and title, so that the recognition process matches human understanding as closely as possible. For this purpose, a knowledge base of the different forms used in an organization is built through suitable a priori learning. Extracting the information requires recognizing the entries made in the spaces provided on the form. Evaluation of the method has shown promising results.
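
As a rough illustration of the form-type recognition step described in the abstract, the Python sketch below matches an observed title and heading layout against a small knowledge base of known forms. It is not the paper's implementation: the names FormTemplate and classify_form, the overlap-based layout score, the equal weighting of text and layout, and the 0.5 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# A box is (x0, y0, x1, y1), normalized to the page size (assumed convention).
Box = tuple[float, float, float, float]


@dataclass
class FormTemplate:
    """One knowledge-base entry: a known form type (hypothetical structure)."""
    name: str
    title: str
    heading_boxes: list[Box]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def layout_similarity(observed: list[Box], template: list[Box]) -> float:
    """Mean best overlap of each template heading with any observed heading."""
    if not observed or not template:
        return 0.0
    best = [max(iou(t, o) for o in observed) for t in template]
    return sum(best) / len(best)


def classify_form(title, boxes, knowledge_base, threshold=0.5):
    """Return the best-matching form name, or None if nothing is close enough."""
    best_name, best_score = None, 0.0
    for tpl in knowledge_base:
        # Fuzzy title match tolerates small recognition errors in the title.
        text_score = SequenceMatcher(None, title.lower(), tpl.title.lower()).ratio()
        # Assumed equal weighting of title text and heading layout.
        score = 0.5 * text_score + 0.5 * layout_similarity(boxes, tpl.heading_boxes)
        if score > best_score:
            best_name, best_score = tpl.name, score
    return best_name if best_score >= threshold else None


# Hypothetical knowledge base with two form types.
KNOWLEDGE_BASE = [
    FormTemplate("leave", "Leave Application Form",
                 [(0.2, 0.05, 0.8, 0.10), (0.1, 0.15, 0.4, 0.18)]),
    FormTemplate("travel", "Travel Claim Form",
                 [(0.3, 0.02, 0.7, 0.06), (0.6, 0.20, 0.9, 0.24)]),
]
```

Calling classify_form with a slightly misread title and the heading boxes found on a scanned page returns the name of the closest template, or None when no template scores above the threshold. Once the form type is known, the stored template would indicate where the handwritten entries are expected to be.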