Text Detection in Document Images: Highlight on using FAST algorithm
Journal: International Journal of Advanced Engineering Research and Science (Vol. 4, No. 3)
Publication Date: 2017-03-08
Authors: Geetika Mathur; Suneetha Rikhari
Pages: 275-284
Keywords: Corner point; FAST (Features from Accelerated Segment Test); OCR; multilingual documents; handwritten documents
Abstract
In recent years, text extraction from document images has become one of the most widely studied topics in Image Analysis and Optical Character Recognition. The extracted text can be used for document analysis, content analysis, document retrieval and more. Many complex techniques, such as Maximum Likelihood (ML), edge point detection and corner point detection, are used to extract text from document images. In this article, we use the corner point approach, applying a very simple method based on the FAST algorithm. First, we divide the image into blocks and check the corner point density of each block. The denser blocks are labeled as text blocks, and the less dense blocks are treated as image regions or noise. We then check the connectivity of the blocks and group them so that the text part can be isolated from the image. This method is fast and versatile: it can be used to detect text in various languages, handwriting, and even images with a lot of noise and blur. Although the program is very simple, the precision of the method is close to or above 90%. In conclusion, this method supports more accurate and less complex detection of text in document images.
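The block-density and connectivity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no block size, density threshold, or connectivity rule, so `block=16`, `min_corners=4`, and 8-connectivity are assumptions chosen for illustration, and the function names are hypothetical. The corner points are assumed to come from a FAST detector run beforehand (for example OpenCV's `cv2.FastFeatureDetector_create()`), which is not included here.

```python
from collections import deque

def dense_blocks(corners, width, height, block=16, min_corners=4):
    """Return the set of (row, col) blocks holding at least min_corners corners.

    corners: iterable of (x, y) pixel coordinates, assumed to be FAST keypoints.
    block and min_corners are illustrative assumptions, not values from the paper.
    """
    counts = {}
    for x, y in corners:
        if 0 <= x < width and 0 <= y < height:
            key = (int(y) // block, int(x) // block)
            counts[key] = counts.get(key, 0) + 1
    return {key for key, n in counts.items() if n >= min_corners}

def group_blocks(blocks):
    """Group 8-connected dense blocks into candidate text regions.

    Returns a sorted list of bounding boxes in block units:
    (min_row, min_col, max_row, max_col).
    """
    remaining = set(blocks)
    regions = []
    while remaining:
        start = remaining.pop()
        queue = deque([start])
        members = [start]
        while queue:
            r, c = queue.popleft()
            # Visit the eight neighbouring blocks (assumed connectivity rule).
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in remaining:
                        remaining.remove(nb)
                        queue.append(nb)
                        members.append(nb)
        rows = [r for r, _ in members]
        cols = [c for _, c in members]
        regions.append((min(rows), min(cols), max(rows), max(cols)))
    return sorted(regions)

# Synthetic corner points: two adjacent dense blocks, one isolated dense
# block, and a single stray corner standing in for noise.
corners = [(1, 1), (2, 3), (5, 5), (7, 2),
           (17, 1), (18, 3), (20, 5), (25, 2),
           (81, 81), (82, 85), (90, 90), (95, 83),
           (100, 100)]
text_blocks = dense_blocks(corners, width=128, height=128)
regions = group_blocks(text_blocks)
```

Multiplying each bounding box by the block size gives pixel coordinates of the candidate text regions; in practice the threshold would need tuning per document type.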
Last modified: 2017-04-02 18:57:24