Identify Handwriting Individually Using Feed Forward Neural Networks

DOI : 10.17577/IJERTCONV1IS02021


ISBN: 978-93-83758-09-8

Ankita Taneja M.Tech., K.U.K.

Ankiii.frnds@gmail.com

  1. INTRODUCTION

    In this paper we address the problem of automatic writer identification from scanned images of handwriting [1] [2] using feed forward neural networks. Identifying the author of a handwritten sample using automatic image-based methods is an interesting pattern recognition problem with direct applicability to legal and historical documents. The current study describes several techniques for this task and presents a method based on neural networks. Biometric modalities [3][4] are classified into two broad categories: physiological biometrics, which identify a person by measuring a physical property of the human body (e.g. fingerprint, face, iris, retina, hand geometry), and behavioural biometrics, which use individual traits of a person's behaviour for identification (e.g. voice, gait, signature, handwriting). Writer identification therefore belongs to the category of behavioural biometrics. Biometric templates are extracted from the physical property or the behavioural trait and used in the identification process. Identification is performed by comparing the template measured when an unknown person must be identified with templates previously enrolled in a database and linked with certainty to known persons. Physiological biometrics, such as fingerprint, are strong modalities for person identification because their templates have low variability and high complexity. However, these modalities are usually more invasive and require cooperating subjects. Behavioural biometrics, by contrast, are less invasive, but the achievable performance is lower because of the large variability of behaviour-derived templates. Although handwriting is among the harder forms of biometrics, identifying a person on the basis of handwriting samples remains a useful biometric modality, mainly because of its applicability in the forensic field.

  2. AN INTRODUCTION TO FEED FORWARD NEURAL NETWORKS

    Feed forward neural networks are built from interconnected artificial neurons (processing units) grouped in three or more layers. The neurons used are usually of the McCulloch-Pitts type [5]. Every connection has an associated real value, called the weight or strength of the connection, as shown in Figure 1.

    Figure 1. McCulloch-Pitts neuron model
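    As a minimal sketch of the neuron model described above, the following Python function computes the weighted sum of the inputs and applies a hard threshold. The function name, the example weights and the threshold value are illustrative assumptions and do not come from the paper.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Return 1 when the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Hypothetical example: two inputs with unit weights and threshold 2
# behave like a logical AND gate.
and_outputs = [mcculloch_pitts([a, b], [1.0, 1.0], 2.0)
               for a in (0, 1) for b in (0, 1)]
```

    Training with back propagation needs a differentiable activation, so in practice the hard threshold is typically replaced by a smooth function such as the sigmoid.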

    The neurons of the input layer are the only ones that receive external signals. The processing units of each layer are fully connected to the neurons of the next layer, and there are no connections between neurons of the same layer. Figure 2 illustrates the interconnection of the processing units in a 4-5-1 feed forward neural network (i.e. a network with three layers: 4 input neurons, 5 hidden neurons and 1 output neuron).


    Figure 2. Neural network 4-5-1
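    The fully connected 4-5-1 structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sigmoid activation, the omission of bias terms, the random seed and the example input are not taken from the paper.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_in, n_out, rng):
    # One row of weights per neuron of the receiving layer.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def forward(layers, inputs):
    # Propagate the signal layer by layer; there are no intra-layer connections.
    activations = inputs
    for weights in layers:
        activations = [sigmoid(sum(w * a for w, a in zip(row, activations)))
                       for row in weights]
    return activations


rng = random.Random(0)          # fixed seed, chosen only for repeatability
sizes = [4, 5, 1]               # the 4-5-1 network of Figure 2
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
n_connections = sum(len(row) for weights in layers for row in weights)
output = forward(layers, [0.0, 1.0, 1.0, 0.0])
```

    With these sizes the network has 4 x 5 + 5 x 1 = 25 weighted connections.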

    Generally speaking, to choose the architecture of the neural network for a given application we have to look at the input/output requirements of the problem. Pattern recognition applications can be solved with a neural network with a single hidden layer. The number of input and output neurons is determined by the specific problem, while the number of hidden-layer neurons is determined experimentally. The most important stage in the design of a neural network is its training. The most widely used training algorithm is the well-known back propagation algorithm, whose development some 20 years ago led to a massive growth of research in neural networks. In the learning stage, the neural network is trained to learn a number of patterns (a pattern is a pair of inputs and desired outputs) to within a specified error. The number of patterns needed depends on the type and complexity of the problem. The inputs of the current pattern determine certain values for the outputs (these output values are computed in the forward step of training). The computed output values are compared with the desired output values (given by the current training pattern). The difference between the two (the computed value and the desired one) is back propagated.

    In the back propagation stage of training, all the weights are adjusted according to the delta rule. When the network has learned the whole set of training patterns to within the desired error, we say that the network has converged. The speed of convergence is an important feature of a neural network, and it depends on several factors, such as:

    • the network architecture (the number of hidden layers and the number of neurons in each layer)

    • the number of training patterns to be learned

    • the order of presentation of the training patterns.

    A single presentation of the entire set of patterns to the neural network, in which all the weights are adjusted according to the delta rule, is called a training epoch. The speed of convergence of a neural network during training is measured by the number of epochs needed for the network to converge.
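    The delta rule mentioned above is not written out in the text. One common form, for a network with a single hidden layer and a squared-error measure, is given below. The notation is an assumption made for illustration: x_i are the inputs, h_j the hidden-layer outputs, y_k the computed outputs, d_k the desired outputs, net the weighted input sum of a neuron, f the activation function and \eta the learning rate.

```latex
\begin{align}
E &= \tfrac{1}{2} \sum_{k} (d_k - y_k)^2 \\
\delta_k &= (d_k - y_k)\, f'(net_k), & \Delta w_{jk} &= \eta\, \delta_k\, h_j \\
\delta_j &= f'(net_j) \sum_{k} \delta_k\, w_{jk}, & \Delta w_{ij} &= \eta\, \delta_j\, x_i
\end{align}
```

    For the sigmoid activation, f'(net) = f(net)(1 - f(net)), so the derivatives can be computed directly from the neuron outputs.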

    Figure 3. Example of transforming handwriting into a text file: a. the original image, b. the cropped image, c. the first scaled image, d. the second scaled image, e. the binary image

    The most general form of the back propagation algorithm is the following [6]:

    • initialize all the weights with small random values

    REPEAT

    • for each training pattern:

      • forward propagation (i.e. compute the outputs of the neural network for the inputs of the current pattern)

      • compute the learning error of the current training pattern

      • back propagation (i.e. adjust all the weights of the neural network according to the delta rule)

    • compute the total error of learning the entire set of training patterns

    UNTIL total error <= specified error limit.
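    The loop above can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: the sigmoid activation, the bias handled as a constant extra input, the learning rate, the epoch cap and the toy OR patterns are all assumptions made so that the example is self-contained.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_in, n_out, rng):
    # n_in + 1 columns: the last weight multiplies a constant bias input of 1.0.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_output(weights, inputs):
    extended = inputs + [1.0]
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in weights]


def train(patterns, n_hidden, rate, limit, max_epochs, seed=0):
    """Train a one-hidden-layer network; return (epochs used, error per epoch)."""
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    w_hidden = init_weights(n_in, n_hidden, rng)    # small random initial values
    w_output = init_weights(n_hidden, n_out, rng)
    history = []
    for epoch in range(1, max_epochs + 1):          # REPEAT
        total_error = 0.0
        for inputs, desired in patterns:
            # Forward propagation.
            hidden = layer_output(w_hidden, inputs)
            outputs = layer_output(w_output, hidden)
            # Learning error of the current pattern (accumulated during the epoch).
            total_error += 0.5 * sum((d - y) ** 2 for d, y in zip(desired, outputs))
            # Back propagation: deltas are computed before any weight changes.
            delta_out = [(d - y) * y * (1.0 - y) for d, y in zip(desired, outputs)]
            delta_hid = [h * (1.0 - h) * sum(delta_out[k] * w_output[k][j]
                                             for k in range(n_out))
                         for j, h in enumerate(hidden)]
            for k, row in enumerate(w_output):
                for j, x in enumerate(hidden + [1.0]):
                    row[j] += rate * delta_out[k] * x
            for j, row in enumerate(w_hidden):
                for i, x in enumerate(inputs + [1.0]):
                    row[i] += rate * delta_hid[j] * x
        history.append(total_error)
        if total_error <= limit:                    # UNTIL total error <= limit
            break
    return epoch, history


# Toy patterns (logical OR), used only to exercise the loop.
or_patterns = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
               ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
epochs, history = train(or_patterns, n_hidden=3, rate=0.5, limit=0.01, max_epochs=5000)
```

    The returned number of epochs is the convergence measure discussed earlier; the epoch cap only guards against a run that never reaches the error limit.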

  3. IMAGE PRE-PROCESSING

    The handwritten text was scanned at a resolution of 600 dpi. The resulting image was processed for optimal extraction of the letters h. For this operation the image was converted to gray-scale. Every letter h in the text was extracted and cropped. We resized each cropped image in two steps: first it was scaled to 200 by 200 pixels, and then it was resized to 20 by 20 pixels. The resizing was done in two steps to minimise the loss of information from the source image. We then converted the final image into a 20 by 20 binary matrix, using 1 for the letter and 0 for the background. The resulting binary matrices were saved in text files. The entire image preprocessing was done using original scripts in Matlab 2010b. Preprocessing was necessary to reduce the complexity of the neural network, but it must not discard a large amount of image information; the proposed method meets these criteria by applying the two-step resizing. We chose four pages of known Romanian text for each writer and selected one relevant letter, the letter h. We found about 50 letters h in the text. Ninety percent of them were used to train the neural network and the rest to test it. The number of letters used was large enough for the feed forward network to converge. To ensure fast convergence of the neural network, we first initialized the network weights with small random values between -0.5 and 0.5.
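    The two-step resizing and the binarization can be sketched as follows. The original scripts were written in Matlab and are not reproduced in the paper, so this Python version is only an illustration: the nearest-neighbour and block-mean resampling, the threshold of 128 and the synthetic test image are assumptions.

```python
def resize_nearest(image, size):
    """Scale a grayscale image (list of rows) to size x size by nearest-neighbour sampling."""
    rows, cols = len(image), len(image[0])
    return [[image[r * rows // size][c * cols // size] for c in range(size)]
            for r in range(size)]


def shrink_by_block_mean(image, factor):
    """Shrink a square image by averaging non-overlapping factor x factor blocks."""
    size = len(image) // factor
    return [[sum(image[r * factor + dr][c * factor + dc]
                 for dr in range(factor) for dc in range(factor)) / factor ** 2
             for c in range(size)]
            for r in range(size)]


def binarize(image, threshold=128):
    """Use 1 for the letter (dark pixels) and 0 for the background (light pixels)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def to_text(matrix):
    """Serialize the binary matrix as one text line per row."""
    return "\n".join(" ".join(str(v) for v in row) for row in matrix)


# Synthetic stand-in for a cropped letter: 50 rows x 40 columns, white background,
# with a dark vertical stroke in columns 10..19.
cropped = [[0 if 10 <= c < 20 else 255 for c in range(40)] for r in range(50)]
step1 = resize_nearest(cropped, 200)      # first step: 200 by 200 pixels
step2 = shrink_by_block_mean(step1, 10)   # second step: 20 by 20 pixels
binary = binarize(step2)
text = to_text(binary)
```

    Averaging over blocks in the second step lets every pixel of the 200 by 200 image contribute to the 20 by 20 result, which is one way to limit the information lost by the reduction.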

    Figure 4. Pre-processing algorithm for letter occurrences in a) known text and b) unknown text

    We present examples of letters copied from the original texts, which were scanned for image processing. To simplify the process, we had already separated the letters h from the text.

    Figure 5. Selected letters from known text written by a) Mr. Miu, b) Mr. Cigoeanu and c) Mr. Grigore

    Each occurrence of a letter in the known text and in the unknown text yields a matrix text file. We used the letters h for the following experiments. We propose this method for the analysis and identification of historical documents, in order to check their authenticity. It can also be used in forensic analysis. Using a database of handwritten documents, we can train the neural network to obtain the most probable author.

  4. WRITER IDENTIFICATION USING FEED FORWARD NEURAL NETWORKS

    There are two fundamental rules in handwriting analysis: no two writers have the same handwriting, and no writer writes the same letter exactly the same way twice. For this reason, neural networks are used for pattern recognition, and in particular for handwriting. In this paper we use a feed forward neural network algorithm to provide an appropriate analysis. Methods from the literature that analyze and identify, for example, calligraphic or textural features are difficult to apply and at the same time require complex preprocessing.

    We propose the use of neural networks, for which less complex image preprocessing is needed. The results obtained are very similar to those of the other methods. We trained the feed forward neural network with the text files containing the binary images. The architecture of the network is: 400 neurons in the input layer (one per element of the 20 by 20 binary matrix), 200 neurons in a single hidden layer, and 3 neurons in the output layer (one for each of the three writers in this case). The imposed learning error is 0.001.
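    To make the 400-200-3 architecture concrete, the sketch below shows how a 20 by 20 binary matrix maps to the 400 inputs and how a writer maps to the 3 outputs. The one-hot target encoding, the writer order and the helper names are assumptions for illustration; the paper does not state how the desired outputs were encoded.

```python
def flatten(matrix):
    """Turn a 20 x 20 binary matrix into the 400-element input vector."""
    return [float(v) for row in matrix for v in row]


def one_hot(index, n_classes):
    """Desired output vector: 1.0 for the true writer, 0.0 for the others."""
    return [1.0 if k == index else 0.0 for k in range(n_classes)]


WRITERS = ["Mr. Miu", "Mr. Cigoeanu", "Mr. Grigore"]   # assumed order
LAYER_SIZES = [400, 200, 3]                            # input, hidden, output

# Number of weighted connections between consecutive, fully connected layers.
n_weights = sum(a * b for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:]))

sample = [[0] * 20 for _ in range(20)]                 # placeholder binary matrix
input_vector = flatten(sample)
target = one_hot(WRITERS.index("Mr. Cigoeanu"), len(WRITERS))
```

    The network therefore has 400 x 200 + 200 x 3 = 80,600 weights, not counting any bias terms, to be learned from roughly 45 training letters per writer.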

    Figure 6. The main steps of a) the first part of the application and b) the second part of the application

    First we created a database with the matrix text files obtained from the pre-processing of the images containing the wanted letters. The database used in this experiment contains about 50 samples per text. We used 90% of the samples to train the feed forward neural network and the remaining 10% to test it. The page of the unknown writer was scanned at the same resolution as the others and followed the same pre-processing route. The result produced by the neural network for each letter found is stored in a vector, from which a mean is computed. The probability of identifying a writer is given by this mean.
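    The averaging step can be sketched as follows, assuming that the network produces one output value per known writer for every letter found on the unknown page. The output values, the writer order and the function names are hypothetical.

```python
def mean_vector(outputs):
    """Average the network outputs over all letters, one mean per writer."""
    n = len(outputs)
    return [sum(column) / n for column in zip(*outputs)]


def most_probable_writer(outputs, writers):
    """Return the writer with the highest mean output and that mean."""
    means = mean_vector(outputs)
    best = max(range(len(means)), key=means.__getitem__)
    return writers[best], means[best]


WRITERS = ["Mr. Miu", "Mr. Cigoeanu", "Mr. Grigore"]   # assumed order
# Hypothetical outputs for four letters found on the unknown page.
outputs = [[0.75, 0.25, 0.0],
           [1.0, 0.0, 0.0],
           [0.5, 0.25, 0.25],
           [0.75, 0.0, 0.25]]
means = mean_vector(outputs)
writer, probability = most_probable_writer(outputs, WRITERS)
```

    Averaging over several letters smooths out the variation between individual occurrences of the same letter by one writer.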

  5. EXPERIMENTAL RESULTS

    For recognition of the writer we use the following letters h, written by Mr. Cigoeanu, Mr. Grigore and Mr. Miu.

    After training the feed forward neural network with samples from the three known writers, we verified whether the network was well trained by having it analyze the three unknown letters. We consider 75% to be a good threshold, and compared with this threshold the results are satisfactory.

    Analyzing the results for the three writers, Mr. Grigore, Mr. Cigoeanu and Mr. Miu, we observed that the network identified the unknown writers as Mr. Miu with a probability of 95.39%, Mr. Grigore with 89.86%, and Mr. Cigoeanu with 97.65%. The results are consistent with the fundamental principles underpinning handwriting identification: no two people write exactly alike, and no one person writes exactly the same way twice.
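    As a small check of the reported figures against the 75% threshold, the comparison can be written out as below. The dictionary layout and variable names are illustrative assumptions; the percentages and the threshold are the ones stated in this section.

```python
THRESHOLD = 75.0  # percent, the acceptance threshold stated in the text

# Identification probabilities reported in this section, in percent.
reported = {"Mr. Miu": 95.39, "Mr. Grigore": 89.86, "Mr. Cigoeanu": 97.65}

accepted = {writer: p >= THRESHOLD for writer, p in reported.items()}
lowest_margin = min(p - THRESHOLD for p in reported.values())
```

    All three identifications exceed the threshold; the smallest margin, for Mr. Grigore, is about 14.86 percentage points.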

  6. CONCLUSIONS AND FUTURE WORK

    Identifying handwriting remains a difficult open problem. In this paper we have presented a case study that is relevant for research projects, and described some practical aspects of automatic handwriting identification from scanned images using feed forward neural networks.

    Our goal in this paper was to automate the process of writer identification using scanned images of handwriting, and thereby to provide an analysis of handwriting using feed forward neural networks. The experimental results are good, and we propose using the method in the field of handwriting recognition (handwriting analysis). Considered in the general context of biometrics, automatic writer identification and verification is presently a thriving and engaging research topic. In the future we will implement other methods for analyzing handwriting.

  7. REFERENCES

  1. Bensefia, A., Paquet, T. and Heutte, L., Handwritten document analysis for automatic writer recognition, Electronic Letters on Computer Vision and Image Analysis 5(2), pp. 72-86, 2005.

  2. Bensefia, A., Paquet, T. and Heutte, L., A writer identification and verification system, Pattern Recognition Letters 26(10), pp. 2080-2092, 2005.

  3. Bulacu, M. and Schomaker, L., Analysis of texture and connected-component contours for the automatic identification of writers, Proc. of 16th Belgium-Netherlands Conference on Artificial Intelligence (BNAIC 2004), Groningen, The Netherlands, pp. 371-372, 2004.

  4. Bulacu, M., Statistical Pattern Recognition for Automatic Writer Identification and Verification, PhD thesis, 2007.

  5. Hassoun, M.H., Fundamentals of Artificial Neural Networks, MIT Press, Cambridge, London, England, 1995.

  6. Zeidenberg, M., Neural Networks in Artificial Intelligence, Ellis Horwood Limited, England, 1991.
