Urdu Optical Character Recognition Technique for Jameel Noori Nastaleeq Script
Journal: Journal of Independent Studies and Research - Computing (Vol. 13, No. 1)
Publication Date: 2015-06-01
Authors : Engr. Reema Qaiser Khan; Engr. Wafa Qaiser Khan
Page : 81-86
Abstract
Urdu OCR systems have attracted considerable developer interest in recent years. Research on Urdu OCR is active, but the complexity of Urdu fonts has kept the technology from maturing enough for widespread use. The main objective of this work was to create a technique that could be applied to any existing Urdu font or script. In this paper, the authors present a technique that extracts text set in the Urdu font “Jameel Noori Nastaleeq” from images and converts it into editable Unicode text. The approach comprises pre-processing, connected-component labelling, feature extraction, and image comparison. The identified objects are saved as templates, which are then compared against a white pixel position length database created by the authors; each matched template is then converted into Unicode.
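The pipeline named in the abstract (pre-processing, connected-component labelling, feature extraction, comparison against a database) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that binarization turns ink into white (1) pixels, that a template's feature is the set of its white pixel positions relative to its bounding box together with their count, and that matching is an exact lookup. The function names, the threshold, and the toy database entries are all hypothetical.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Pre-processing (assumed): dark ink becomes white (1), background 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def label_components(binary):
    """Group white pixels into 8-connected components (lists of (row, col))."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def feature(pixels):
    """Assumed feature: white pixel positions relative to the bounding box,
    plus how many there are."""
    top = min(y for y, _ in pixels)
    left = min(x for _, x in pixels)
    positions = tuple(sorted((y - top, x - left) for y, x in pixels))
    return positions, len(positions)


def recognise(gray, database):
    """Return one database match per component, in raster-scan order.
    Unknown components map to '?'. Right-to-left reading order is not handled."""
    binary = binarize(gray)
    return [database.get(feature(p), "?") for p in label_components(binary)]


# Toy 3x5 grayscale image: a vertical stroke and an isolated dot.
gray = [
    [255, 0, 255, 255, 255],
    [255, 0, 255, 255, 255],
    [255, 0, 255, 0, 255],
]

# Hypothetical database: feature -> Unicode character.
database = {
    (((0, 0), (1, 0), (2, 0)), 3): "\u0627",  # ARABIC LETTER ALEF
    (((0, 0),), 1): "\u06D4",                 # ARABIC FULL STOP
}

matches = recognise(gray, database)
```

Exact lookup only works when a glyph renders pixel-for-pixel identically each time, which is plausible for a single digital font at a fixed size. Scanned or resized input would need a tolerance-based comparison instead.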