Recognition of Similar Shaped Handwritten Marathi Characters Using Artificial Neural Network


1. Introduction

In recent years, handwritten Marathi character recognition has attracted a lot of attention because Marathi, being the primary official language of Maharashtra, has wide application in areas such as passports, railways and postal address reading. Handwritten character recognition [4] consists of six main steps.

i. Handwritten drawing of the character.
ii. Training on handwritten characters.
iii. Testing on handwritten characters:
   - Pixel rating an image.
   - Image smoothing.
   - Feature extraction.
   - Image segmentation.
   - Pattern matching.
   - Result display as per the percentage of pattern matched.

Earlier, traditional classifiers such as Nearest Neighbor (NN) and Hidden Markov Models (HMM) were adopted for character recognition; however, they exhibit certain limitations. Machine learning (ML) algorithms [6] provide a promising alternative for character recognition based on the feature set given to them. Each character image sample can be expressed in terms of some quantifiable attributes called features. A variety of features can be extracted, such as primitives and profiles. The ML algorithm is then trained with this list of measured features so that it maps these input features onto one of a set of predefined classes [1,2]. The classifier can then be used to determine the class of unknown samples used for testing.

Author α: M.E (Scholar), Prof. Ram Meghe Institute of Research & Technology, Badnera. E-mail: [email protected]
Author σ: Associate Prof., Prof. Ram Meghe Institute of Research & Technology, Badnera. E-mail: [email protected]


3. Literature survey

Character recognition has been attempted through many different approaches, such as template matching and statistical techniques like NN, HMM and the Quadratic Discriminant Function (QDF). Template matching works effectively for standard fonts but performs poorly on handwritten characters and as the dataset grows. It is not an effective technique when fonts vary [4]. HMMs have achieved great success in speech recognition in past decades; however, developing a 2-D HMM for character recognition has been found difficult and complex [5]. NN has been found computationally very expensive for recognition [6]. N. Araki et al. [7] applied Bayesian filters, based on Bayes' theorem, to handwritten character recognition.

Later, discriminative classifiers such as the Artificial Neural Network (ANN) and the Support Vector Machine (SVM) attracted a lot of attention. G. Vamvakas et al. [3] compared the performance of three classifiers, Naive Bayes, K-NN and SVM, and obtained the best performance with SVM. However, SVM is limited by the need to select a kernel. ANNs can adapt to changes in the data and learn the characteristics of the input signal [8]. ANNs also consume less storage and computation than SVMs [9]. The most commonly used ANN-based classifiers are the Multilayer Perceptron (MLP) and the Radial Basis Function Network (RBFN). B.K. Verma [10] presented a system for handwritten character recognition (HCR) of Hindi using MLP and RBFN networks; the error back-propagation algorithm was used to train the MLP networks. J. Sutha et al. [11] showed the effectiveness of MLP for Tamil HCR using Fourier descriptor features. R. Gharoie et al. [12] proposed handwritten Farsi character recognition using an MLP trained with the error back-propagation algorithm.

Similar shaped characters are difficult to differentiate because of very minor variations in their structures. T. Wakabayashi et al. [13] proposed an F-ratio (Fisher ratio) based feature extraction method to improve results on similar shaped characters. They considered pairs of similar shaped characters from different scripts, such as English, Arabic/Persian and Devnagri, and used QDF for recognition. QDF is limited by the minimum dataset size it requires. F. Yang et al. [14] proposed a method that combines structural and statistical features of characters for similar handwritten Chinese character recognition. Since researchers have used various feature extraction methods and classifiers as suited to their own work, we propose a novel feature set that is expected to perform well for this application. In this paper, features are extracted on the basis of character geometry and are then fed to each of the selected ML algorithms for recognition of similar shaped handwritten Marathi characters (SSHMC).


4. Proposed method

In this paper, we propose a novel method based on a combination of pixel rating an image, feature extraction and image pattern matching. The proposed method is expected to give better results than earlier character recognition algorithms such as HMM, NN and ML. The proposed method consists of the following phases.

From the literature survey and the proposed method, we conclude that the proposed method gives the expected, considerably higher accuracy than earlier character recognition techniques such as HMM, ML and NBP. Experimental results show that the proposed method achieves an accuracy close to 98%, provided that the number of training samples per standard Marathi character image is as large as possible.

When recognizing handwritten characters, even the human brain may fail, so expecting 100% accuracy is not realistic. Future work is needed to analyze the segment patterns and the fuzzy rules of equation 3.5.1 more accurately, in order to achieve better accuracy that is independent of the number of training images.

Global Journal of Computer Science and Technology, Volume XII, Issue XI, Version I

a) Training

In this work, training on images consists of listing all handwritten images against their standard Marathi character images. The training set data is shown in the corresponding figure. In the training phase, we have 51 Marathi characters, and we trained 20 handwritten characters for each corresponding standard Marathi character image. The number of images used in training affects the turnaround time of the entire process; here T_i is the turnaround time and N is the number of training images. The accuracy of the result depends on the number of trained handwritten images per standard Marathi character image; here A_c is the accuracy of character recognition, H_im is the number of handwritten patterns, and S_im is the standard Marathi image, with S_im = 1.

b) Testing

Testing is the phase in which a handwritten Marathi character image is tested against the training set of handwritten images. Testing consists of the following phases.

c) RGB-to-binary image conversion

As the training images are maintained as binary images, the testing image must also be converted into a binary image. A binary image avoids unnecessary image segmentation and feature extraction.

d) Pixel rate an image

Pixel rating an image is used to identify whether each image pixel value is on or off. We set the pixel height and width to 10, which gave the result shown in Table 3.2.2.2. The height and width of the pixel rate image should be proportionate to the size of the binary image.
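The conversion and pixel rating phases described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `rgb_to_binary` and `pixel_rate`, the luma threshold of 128, and the rule that a block is "on" when any pixel inside it is on are all hypothetical choices, since the paper does not specify them. Only the block size of 10 is taken from the text.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
BinaryImage = List[List[int]]


def rgb_to_binary(image: RGBImage, threshold: int = 128) -> BinaryImage:
    """Convert an RGB image to a binary one (1 = on/ink, 0 = off/background).

    Assumption: dark strokes on a light background, so a pixel is "on"
    when its luma falls below the threshold.
    """
    binary = []
    for row in image:
        out_row = []
        for r, g, b in row:
            luma = 0.299 * r + 0.587 * g + 0.114 * b  # BT.601 luma weights
            out_row.append(1 if luma < threshold else 0)
        binary.append(out_row)
    return binary


def pixel_rate(binary: BinaryImage, block: int = 10) -> BinaryImage:
    """Rate the image in block x block cells; each cell becomes one on/off value.

    Assumption: a cell is "on" if at least one pixel inside it is on.
    Cells at the right and bottom edges may be smaller than the block size.
    """
    rows = len(binary)
    cols = len(binary[0]) if binary else 0
    rated = []
    for top in range(0, rows, block):
        rated_row = []
        for left in range(0, cols, block):
            cell_on = any(
                binary[y][x]
                for y in range(top, min(top + block, rows))
                for x in range(left, min(left + block, cols))
            )
            rated_row.append(1 if cell_on else 0)
        rated.append(rated_row)
    return rated


# Usage: a 20 x 20 white image with a single black pixel at row 3, column 14.
white, black = (255, 255, 255), (0, 0, 0)
sample = [[white] * 20 for _ in range(20)]
sample[3][14] = black
sample_binary = rgb_to_binary(sample)
sample_rated = pixel_rate(sample_binary, block=10)
```

With a block size of 10, the 20 x 20 sample is reduced to a 2 x 2 grid in which only the top-right cell is on. Many individual pixels collapse into a single value, which is the kind of pixel loss the text associates with the 10 x 10 setting.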
From Table 3.2.2.2, it is observed that a pixel size of 10 x 10 leads to a loss of pixels, expressed in terms of the following quantities: P_l is the number of pixels lost, P_nhw is the new pixel height and width, and n is the number of off pixels in the binary image.

e) Image edge smoothing

From the literature review, it is observed that patterns in handwritten characters show a large deviation factor with respect to the standard image pattern. The deviation factor is the absolute difference between the number of pixel patterns in the training and testing images:

D_f = |T_r - T_s|

where D_f is the deviation factor, T_r is the number of segments of the training image, and T_s is the number of segments of the testing image. In addition, S is the size of an image, hw is the new height and width of a segment, and D_n is the new segment size. The average searching time for character recognition of a test image uses a best-fit search over the training set images, where O(t) is the average searching time.

f) Image segmentation

To recognize a character, segmentation is done based on patterns of size 2 x 2, as per equation 3.3.2.

g) Pattern matching

We form a fuzzy pattern match on the pixel values (equivalent decimal values) of the segmented image, on the basis of which we match object array patterns using fuzzy rules, where P_tm is the fuzzy pattern match and S is the size of the training image, which equals the size of the testing image.
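The segmentation and pattern matching steps above can be sketched as follows. This is only an illustration under stated assumptions: the paper's fuzzy rules are not reproduced in the text, so exact equality of segment codes is swapped in for them. The helper names `segment_codes`, `match_percent` and `best_match`, and the row-major bit order used for the equivalent decimal value, are hypothetical. The 2 x 2 segment size, the percentage-of-match result and the best-fit search over training images come from the text.

```python
from typing import Dict, List, Optional, Tuple

BinaryImage = List[List[int]]


def segment_codes(binary: BinaryImage) -> List[int]:
    """Split a binary image into 2 x 2 segments and return one code per segment.

    Each code is the decimal value (0..15) of the four bits read in
    row-major order (an assumed convention). A trailing odd row or
    column is ignored.
    """
    rows = len(binary)
    cols = len(binary[0]) if binary else 0
    codes = []
    for top in range(0, rows - 1, 2):
        for left in range(0, cols - 1, 2):
            code = (
                binary[top][left] * 8
                + binary[top][left + 1] * 4
                + binary[top + 1][left] * 2
                + binary[top + 1][left + 1]
            )
            codes.append(code)
    return codes


def match_percent(train: BinaryImage, test: BinaryImage) -> float:
    """Percentage of 2 x 2 segments whose codes agree (exact match, not fuzzy)."""
    train_codes = segment_codes(train)
    test_codes = segment_codes(test)
    if len(train_codes) != len(test_codes):
        raise ValueError("training and testing images must have the same size")
    if not train_codes:
        return 0.0
    same = sum(1 for a, b in zip(train_codes, test_codes) if a == b)
    return 100.0 * same / len(train_codes)


def best_match(
    training: Dict[str, List[BinaryImage]], test: BinaryImage
) -> Tuple[Optional[str], float]:
    """Best-fit search: return the label and score of the closest training image."""
    best_label, best_score = None, -1.0
    for label, samples in training.items():
        for sample in samples:
            score = match_percent(sample, test)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score


# Usage with two toy 4 x 4 "characters" and a noisy copy of the first one.
char_a = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
char_b = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
noisy_a = [[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
result = best_match({"A": [char_a], "B": [char_b]}, noisy_a)
```

In this toy case one flipped pixel changes one of the four segments, so the noisy image matches `char_a` at 75% and `char_b` at 0%. The search visits every training image, so its cost grows with the number of training images, in line with the text's remark that the training set size affects turnaround time.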
Table 3.2.2.1. Input handwritten image and the corresponding input binary image.

Table 3.2.2.2. Input binary image and the corresponding pixel rate image (pixel height and width set to 10).

Appendix A

  1. Handwritten Hindi Character Recognition Using Multilayer Perceptron and Radial Basis Function Neural Networks. B K Verma. Proceedings of IEEE International Conference on Neural Networks, 1995. 4.
  2. Performance evaluation of pattern classifiers for handwritten character recognition. C L Liu, H Sako, H Fujisawa. International Journal on Document Analysis and Recognition (IJDAR) 2002. 4.
  3. Comparison of SVM and ANN performance for handwritten character classification. F Kahraman, A Capar, A Ayvaci, H Demirel, M Gokmen. Proceedings of the IEEE 12th Signal Processing and Communications Applications Conference, 2004.
  4. An Improved Method for Similar Handwritten Chinese Character Recognition. F Yang, X D Tian, X Zhang, X B Jia. IEEE Third International Symposium on Intelligent Information Technology and Security Informatics (IITSI), 2010.
  5. An Efficient Feature Extraction and Dimensionality Reduction Scheme for Isolated Greek Handwritten Character Recognition. G Vamvakas, B Gatos, S Petridis, N Stamatopoulos. IEEE Ninth International Conference on Document Analysis and Recognition (ICDAR), 2007. 2.
  6. An HMMRF-Based Statistical Approach for Off-line Handwritten Character Recognition. H S Park, S W Lee. IEEE Proceedings of the 13th International Conference on Pattern Recognition, 1996. 2.
  7. Offline handwritten character recognition of Gujrati script using pattern matching. J R Prasad, U V Kulkarni, R S Prasad. IEEE 3rd International Conference on Anti-counterfeiting, Security, and Identification in Communication, 2009.
  8. Neural Network Based Offline Tamil Handwritten Character Recognition System. J Sutha, N Ramaraj. IEEE International Conference on Computational Intelligence and Multimedia Applications, 2007. 2.
  9. Recognition of Online Isolated Handwritten Characters by Back propagation Neural Nets Using Sub-Character Primitive Features. M Zafar, D Mohamad, M M Anwar. IEEE Multitopic Conference, 2006.
  10. A Statistical Approach for Handwritten Character Recognition Using Bayesian Filter. N Araki, M Okuzaki, Y Konishi, H Ishigaki. IEEE 3rd International Conference on Innovative Computing Information and Control, 2008.
  11. An Overview Of Character Recognition Focused On Off-line Handwriting. N Arica, F T Yarman-Vural. IEEE Transactions on Systems, Man, and Cybernetics 2001. 31.
  12. Bayesian Network Classifiers. N Friedman, D Geiger, M Goldszmidt. Machine Learning 1997.
  13. WEKA 3: Data Mining With Open Source Machine Learning Software in JAVA. http://www.cs.waikato.ac.nz/ml/weka/. N J Nilsson. Robotics Lab, Deptt of Computer Science, Stanford University 1998. (An early draft of a proposed textbook)
  14. Handwritten Farsi Character Recognition using Artificial Neural Network. R Gharoie, M Farajpoor. International Journal of Computer Science and Information Security 2009. 4.
  15. Machine Learning. T M Mitchell. December 1997.
  16. F-ratio Based Weighted Feature Extraction for Similar Shape Character Recognition. T Wakabayashi, U Pal, F Kimura, Y Miyake. IEEE 10th International Conference on Document Analysis and Recognition, 2009.
© 2012 Global Journals Inc. (US)
Date: 2012-01-15