OCR Technology for Detecting Homoglyph Spam
Journal: International Journal of Computer Science and Mobile Computing - IJCSMC (Vol. 13, No. 6)
Publication Date: 2024-06-30
Authors : Kostiantyn Shcherbyna;
Page : 22-26
Keywords : Homoglyphs; Text recognition; OCR; Spam; Recognition accuracy;
Abstract
This paper examines how effectively Optical Character Recognition (OCR) technology detects spam composed of homoglyphs, that is, visually similar characters from different scripts. The research uses a console application that generates all possible homoglyphic representations of specified words; these are then rendered as images and analyzed with the OCR tool Tesseract. The study assesses the recognition accuracy of homoglyphs in four text cases: uppercase only, lowercase only, random case, and first letter uppercase only. Accuracy varies across these formats, with uppercase text generally recognized at higher rates. This difference highlights both the challenges and the potential of refining OCR applications to better detect and filter homoglyph-based spam, improving security across digital communication platforms.
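The generation step described in the abstract can be illustrated with a minimal sketch. It is not the author's console application: the `HOMOGLYPHS` table, the function names `homoglyph_variants` and `case_formats`, and the choice of Latin-to-Cyrillic substitutions are all assumptions made for illustration, and the rendering and Tesseract recognition steps are omitted.

```python
import itertools
import random

# Assumed, deliberately tiny homoglyph table (hypothetical, not from the paper):
# each Latin letter maps to Cyrillic letters that look alike.
HOMOGLYPHS = {
    "a": ["\u0430"],  # Cyrillic small a
    "c": ["\u0441"],  # Cyrillic small es
    "e": ["\u0435"],  # Cyrillic small ie
    "o": ["\u043e"],  # Cyrillic small o
    "p": ["\u0440"],  # Cyrillic small er
}


def homoglyph_variants(word):
    """Return every spelling of `word` formed by swapping letters for look-alikes.

    The original spelling is always the first element of the result.
    """
    # For each position, the candidates are the letter itself plus its homoglyphs.
    choices = [[ch] + HOMOGLYPHS.get(ch, []) for ch in word]
    # The Cartesian product over all positions enumerates every combination.
    return ["".join(combo) for combo in itertools.product(*choices)]


def case_formats(word, seed=0):
    """Return the four text cases named in the abstract for one spelling."""
    rng = random.Random(seed)  # seeded so the random-case variant is repeatable
    return {
        "upper": word.upper(),
        "lower": word.lower(),
        "random": "".join(rng.choice((ch.upper(), ch.lower())) for ch in word),
        "capitalized": word.capitalize(),
    }
```

Under this assumed table, `homoglyph_variants("spam")` yields four spellings, because only `p` and `a` have a substitute. Each spelling could then be passed through `case_formats` before being rendered and submitted to an OCR engine. The number of variants grows exponentially with the number of substitutable letters, so a fuller table would likely call for a generator rather than a list.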