Development and evaluation of artificial neural network (ANN) algorithms for recognition of Arabic script

number: 226
English
Degree: 
Author: Basil Isam Sarsam
year: 1997
Abstract:

An experimental Artificial Neural Network (ANN) system for Optical Character Recognition (OCR) has been designed, implemented and tested with handwritten Arabic script. The system is named EXANNAS: Experimental ANN system for Arabic Script recognition. ANN algorithms have been developed and implemented on EXANNAS for recognition of Arabic script. Characters are classified according to their structural features, which are identified by extracting the Hough Transform (HT). Classification is performed by a Multi-Layer Perceptron (MLP) trained with the backpropagation learning rule. Properties have been assigned to each character, which divides the Arabic character set into eight groups. In EXANNAS, a new dot-isolation algorithm has been implemented for detecting and clearing dots in the text image. Segmentation is based on an improved method of detecting silent regions in the text. The extraction of the HT for each individual character has been implemented on a single-layer perceptron (SLP) with local neighbourhood connections. The MLP classifier decides the character class within its category. The final class of the character is then obtained from the category class and the property code attached to the character. The character preparation stages have been tested on different examples. Training results have been obtained for four different phases. EXANNAS achieves a recognition rate of 92.7% on handwritten Arabic text. When first tested with characters from the training set, it showed a recognition rate of 99%.
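
The abstract describes the Hough Transform as the source of the structural features. As a rough illustration only, the sketch below shows the textbook line-detecting Hough Transform on a small binary image: every foreground pixel votes for the lines x·cos(θ) + y·sin(θ) = ρ passing through it. It is not the SLP-based implementation used in EXANNAS; the function names `hough_accumulator` and `strongest_line`, the parameter `n_theta`, and the rounding of ρ to whole pixels are all assumptions made for this example.

```python
import math

def hough_accumulator(image, n_theta=4):
    """Vote in (theta, rho) space for every foreground pixel of a binary image.

    `image` is a list of rows holding 0/1 values (hypothetical input format).
    """
    height = len(image)
    width = len(image[0])
    # Largest possible |rho| is the image diagonal.
    max_rho = int(math.ceil(math.hypot(height - 1, width - 1)))
    # Evenly spaced angles in [0, pi).
    thetas = [k * math.pi / n_theta for k in range(n_theta)]
    # One row per angle; rho is shifted by max_rho so negative values fit.
    acc = [[0] * (2 * max_rho + 1) for _ in range(n_theta)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            for t, theta in enumerate(thetas):
                rho = x * math.cos(theta) + y * math.sin(theta)
                acc[t][int(round(rho)) + max_rho] += 1
    return acc, thetas, max_rho

def strongest_line(image, n_theta=4):
    """Return (votes, angle in degrees, rho) of the most-voted line."""
    acc, thetas, max_rho = hough_accumulator(image, n_theta)
    votes, t, r = max(
        (acc[t][r], t, r)
        for t in range(len(thetas))
        for r in range(len(acc[t]))
    )
    return votes, math.degrees(thetas[t]), r - max_rho

# A 5x5 image containing a single vertical stroke in column 2.
vertical = [[1 if x == 2 else 0 for x in range(5)] for _ in range(5)]
print(strongest_line(vertical))
```

In a character recogniser of the kind described, the accumulator (or a summary of its peaks) could serve as the feature vector passed to a classifier; how EXANNAS encodes these features is not stated in the abstract.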