Hand written text evaluation using Gabor Filter and K Nearest Neighbour

DOI : 10.17577/IJERTCONV8IS05039


Albino Alex Braganza1

School of Computer Science, Dr. Vishwanath Karad MIT World Peace University, Kothrud, Pune, Maharashtra, India

Amit Gour2

School of Computer Science, Dr. Vishwanath Karad MIT World Peace University, Kothrud, Pune, Maharashtra, India

Ravi Kumar3

School of Computer Science, Dr. Vishwanath Karad MIT World Peace University, Kothrud, Pune, Maharashtra, India

Ms. Smita Patil4

Assistant Professor, School of Computer Science, Dr. Vishwanath Karad MIT World Peace University, Kothrud, Pune, Maharashtra, India

Abstract- Image processing has simplified intensive tasks such as object recognition, face detection and tracking, number plate challan systems and detecting whether motorists are wearing helmets. Evaluating how candidates performed in a test, exam or interview round is tedious, because each candidate's answer sheets must be evaluated separately. One option is optical mark recognition (OMR), but it constrains candidates in expressing their thoughts, since the system only supports MCQ-type answers. Allowing candidates to answer broadly about a topic helps us understand how much they have understood during the learning process, but evaluation time then becomes the biggest factor to tackle. The objective of the researchers is to develop a system that takes a scanned photo of an answer sheet and evaluates it by detecting and removing scribbled words, converting the entire handwritten image into character format, detecting a single paragraph, forming separate sentences, checking whether the formed sentences are grammatically correct using Natural Language Processing, and finding how closely each sentence matches the content of the topic on which the question is asked. Such a system would save evaluation time and manpower in the traditional examination system. It could also be used to build a digital archive of answer sheets.

This research paper mainly focuses on creating a dataset of character images and using the Gabor filter to extract features that form a training dataset for the K nearest neighbour algorithm. The authors aim first to achieve good accuracy in detecting characters and, once that is achieved, to detect all the words in the image.

Keywords- Image Processing, OMR, Scribbled Characters, Natural Language Processing, Evaluation System

  1. INTRODUCTION

    Character recognition has improved drastically over the past years. Offline handwritten text recognition uses different techniques to achieve good accuracy. When we compare the offline handwriting of different people, each one has a different style of writing, which widens the range of forms a character can take and makes certain characters harder to recognise. Factors like character size, tilt and spacing between words have a huge impact on the recognition process. Many algorithms have been developed that use different filters to extract the features of a particular character. Features could be the angle, intensity and thickness of strokes.

    In this research we use the Gabor filter to extract the features of each character and use these features to train the K Nearest Neighbour classification algorithm.

  2. PROPOSED SYSTEM

The system takes in scanned images consisting of handwritten text and converts the entire handwritten image into a computer-readable format. If any scribbled words exist, the system removes them. The next step is to use Natural Language Processing to evaluate the content and rate it on a similarity index by comparing it with content gathered from other sources.
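The similarity measure is not specified here. As one possible illustration only, the sketch below scores an answer against a reference text with bag-of-words cosine similarity; the function names `tokenize` and `cosine_similarity`, and the choice of cosine similarity itself, are assumptions and not part of the proposed system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(answer, reference):
    """Cosine similarity between the word-count vectors of two texts (0.0 to 1.0)."""
    a = Counter(tokenize(answer))
    b = Counter(tokenize(reference))
    # Dot product over the words the two texts share.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical answer and reference sentences, for illustration only.
answer = "The Gabor filter extracts stroke features from each character"
reference = "A Gabor filter is used to extract features of a character"
score = cosine_similarity(answer, reference)
```

A score near 1 means the two texts use almost the same words, and 0 means they share none. Word counts ignore word order and meaning, so a real evaluation system would likely need a richer comparison.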

A. Architecture of the proposed system-(Flow Diagram)

Working and Algorithms

All pages must be scanned before they can be given to the system. The system takes the scanned copies of the handwritten pages and converts them into binary (black and white) images, where 0s indicate black and 1s represent white. The purpose of this step is to collect the intensity features of each character in the image. These features are compared with the trained dataset. If a match is found, the character is recognised; if not, the found character is either scribbled or not a valid character.

Once the system has iterated through the entire image, the final result is a text document in computer-readable format.
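The binarisation step can be sketched as follows, assuming the scan is already available as rows of 8-bit grayscale values. The fixed threshold of 128 and the name `binarize` are illustrative assumptions; the text does not say how the threshold is chosen.

```python
def binarize(gray, threshold=128):
    """Map each 0-255 pixel to 0 (black, ink) or 1 (white, paper)."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in gray]


# Tiny made-up grayscale patch: dark pixels form a short vertical stroke.
page = [
    [250, 245, 30, 240],
    [248, 20, 25, 251],
    [252, 249, 15, 247],
]

binary = binarize(page)
for row in binary:
    print(row)
# [1, 1, 0, 1]
# [1, 0, 0, 1]
# [1, 1, 0, 1]
```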

  1. Gabor filter

    The Gabor filter is useful in computer vision wherever texture analysis, edge detection or feature extraction needs to be carried out. The Gabor function is based on the following parameters: the (X, Y) kernel or window size, standard deviation, angle, wavelength, aspect ratio and phase offset (Phi).
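For reference, these parameters correspond to the standard real-valued Gabor function, which can be written as below. The symbols are assumed here, since the text names the parameters only in words: \sigma is the standard deviation, \theta the angle, \lambda the wavelength, \gamma the aspect ratio and \phi the phase offset, while x and y range over the kernel window.

```latex
\begin{aligned}
g(x, y) &= \exp\!\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)
           \cos\!\left(\frac{2\pi x'}{\lambda} + \phi\right), \\
x' &= x\cos\theta + y\sin\theta, \\
y' &= -x\sin\theta + y\cos\theta .
\end{aligned}
```

The exponential term is a Gaussian envelope that limits the filter to a local window, and the cosine term is the wave whose direction is set by the angle and whose period is set by the wavelength.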

    The Gabor filter is a band-pass filter in which we change the above values to extract the desired features from the image. Each Gabor filter is known as a kernel. We can create multiple kernels, together known as a filter bank, to extract smaller features from the image.
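A minimal sketch of a Gabor kernel and a small filter bank, written with the Python standard library only. The function names and all parameter values (kernel size 9, standard deviation 3.0, wavelength 6.0, aspect ratio 0.5, four orientations) are illustrative assumptions, not values taken from this work. Libraries such as OpenCV provide a comparable `cv2.getGaborKernel`.

```python
import math


def gabor_kernel(ksize, sigma, theta, lambd, gamma, phi=0.0):
    """Real-valued Gabor kernel as a ksize x ksize list of rows (ksize odd)."""
    half = ksize // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter angle.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine wave.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / lambd + phi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter_bank(n_orientations=4, ksize=9, sigma=3.0, lambd=6.0, gamma=0.5):
    """One kernel per orientation, with angles spread evenly over 180 degrees."""
    return [
        gabor_kernel(ksize, sigma, k * math.pi / n_orientations, lambd, gamma)
        for k in range(n_orientations)
    ]


bank = filter_bank()  # kernels at 0, 45, 90 and 135 degrees
```

Changing only the angle keeps the scale of the detected strokes fixed while changing their direction. A bank could also vary the wavelength to cover thin and thick strokes.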

    Figure 1- Input Image

    Figure 2-After the Gabor Filter is applied

    Figure 3-Gabor Filter Kernel (45 degree)

    Gabor Filter Change Theta(angle)

    After the Gabor filter is applied, only those areas whose features match the filter are displayed; the rest are discarded.
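The effect can be illustrated with a small sketch that slides two kernels over a synthetic image containing one vertical stroke. All names, sizes and parameter values are assumptions chosen for the illustration. The image is also inverted relative to the binary convention used earlier (here 1 marks the stroke and 0 the background) so that the stroke contributes to the sums.

```python
import math


def gabor_kernel(ksize, sigma, theta, lambd, gamma, phi=0.0):
    """Real-valued Gabor kernel as a ksize x ksize list of rows (ksize odd)."""
    half = ksize // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / lambd + phi))
        kernel.append(row)
    return kernel


def apply_filter(image, kernel):
    """Slide the kernel over every position where it fits inside the image."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(cols - k + 1)
        ]
        for r in range(rows - k + 1)
    ]


def energy(response):
    """Total absolute response, used as a single number for the whole image."""
    return sum(abs(value) for row in response for value in row)


# 11 x 11 image with one vertical stroke in the middle column.
stroke = [[1 if c == 5 else 0 for c in range(11)] for _ in range(11)]

# theta = 0 makes the wave vary along x, so the kernel has vertical stripes.
matching = gabor_kernel(7, 2.0, 0.0, 4.0, 0.5)
# theta = 90 degrees gives horizontal stripes, which do not match the stroke.
mismatched = gabor_kernel(7, 2.0, math.pi / 2, 4.0, 0.5)

e_match = energy(apply_filter(stroke, matching))
e_mismatch = energy(apply_filter(stroke, mismatched))
print(e_match > e_mismatch)  # True
```

The matching orientation produces a much larger response than the mismatched one, which is why a filtered image shows mainly the strokes aligned with the kernel.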

    Figure 4-Knn classes

    The following steps describe how the KNN classifier works.

    Step 1 Any algorithm needs a dataset. So in the first step of KNN, we load the training data, which consists of the features extracted from each character by the Gabor filter bank.

    Step 2 Next, we choose the value of K, i.e. the number of nearest data points to consider. K can be any positive integer; an odd value is commonly chosen to reduce the chance of a tied vote.

    Step 3 For each point in the test data, do the following:

    • 3.1 Calculate the distance between the test point and each row of the training data using a distance measure such as Euclidean, Manhattan or Hamming distance. Euclidean distance is the most commonly used.

    • 3.2 Sort the training rows in ascending order of distance.

    • 3.3 Choose the top K rows from the sorted array.

    • 3.4 Assign a class to the test point based on the most frequent class among these rows.

    Step 4 End
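The steps above can be sketched in a few lines of standard-library Python. The two-dimensional feature vectors and the labels are made up for illustration, and the names `knn_predict`, `manhattan` and `hamming` are hypothetical; real feature vectors would come from the Gabor filter bank.

```python
import math
from collections import Counter


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Number of positions at which the two vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_rows, train_labels, query, k=3, distance=math.dist):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Step 3.1: distance from the query to every training row (Euclidean by default).
    scored = [(distance(row, query), label)
              for row, label in zip(train_rows, train_labels)]
    # Step 3.2: sort in ascending order of distance.
    scored.sort(key=lambda pair: pair[0])
    # Step 3.3: keep the top k rows.
    nearest = scored[:k]
    # Step 3.4: the most frequent class among them wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Step 1: load the training data (made-up features for characters "a" and "b").
train_rows = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_labels = ["a", "a", "a", "b", "b", "b"]

# Step 2: choose K, then classify a test point.
prediction = knn_predict(train_rows, train_labels, [5.1, 5.0], k=3)
print(prediction)  # b
```

Passing `distance=manhattan` or `distance=hamming` switches the distance measure without changing the rest of the procedure.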

    Figure 5-Character Dataset with input image character b.

  2. K Nearest Neighbour

    The K nearest neighbours (KNN) algorithm uses feature similarity to predict how closely a given feature matches the existing dataset on which the KNN model is trained.

    Once the features are extracted from the input image using the Gabor filter, KNN is used to predict which class these features match and so find the correct character.

    Figure 6-Nearest neighbour using Knn.

    CONCLUSION

    The proposed idea works theoretically, and we are working on its implementation. As mentioned earlier, we are creating a dataset of handwritten characters and exploring the Gabor filter to create our own feature extraction filter, which will be used specifically to extract important features from the input image and train the K Nearest Neighbour algorithm. Next, we will research Natural Language Processing to check whether the text converted from the handwritten image forms grammatically correct, meaningful sentences. Later, we will integrate all the modules to reach the final result: evaluating the scanned handwritten image against the contents of the specified resources and outputting its similarity index.

