 Open Access
 Total Downloads : 443
 Authors : Aanal Patel, Dhiren Pandit, Neel Acharya
 Paper ID : IJERTV5IS080475
 Volume & Issue : Volume 05, Issue 08 (August 2016)
 DOI : http://dx.doi.org/10.17577/IJERTV5IS080475
 Published (First Online): 29082016
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Android based Mathematical Expression Evaluation from Images
Aanal Patel
IT Dept., LDRPITR,
Sector15, Gandhinagar, Gujarat, INDIA
Neel Acharya
CE Dept., LDRPITR,
Dhiren Pandit
Science & Humanities Dept., LDRPITR, Sector15, Gandhinagar,
Gujarat, INDIA
Sector15, Gandhinagar, Gujarat, INDIA
AbstractCurrent era is an era of automated systems and machine learning plays vital role in construction of automated systems. In machine learning, neural network is key tool. OCR is used for recognition of characters from printed text and mathematical tool of neural network is used for classification. Proposed system combines OCR and Mathematics, hence a system is designed that can compute mathematical expressions from images which contains equations. OCR systems can be used to convert the image of expression to a string of mathematical
equals, notequals, lessthan, greaterthan, lessthan or equals, greaterthan or equals) Boolean operation (like and, or, not) are supported. Also trigonometric functions (like sine, cosine,
…) and other complex mathematical operations (like ceil, floor, log, …) are supported.

Flow Diagram
expressions and then combined with Mathematical libraries and compiler to obtain solution of such mathematical expressions
Keywords OCR, Expression, Android, Neural Networks, Evaluation, Image Processing, Character detection, Printed Character Recognition

INTRODUCTION
ClosedForm Expression evaluation is one of the easiest task in mathematics. Its Computer application is also one such easy task. With builtin operators many mathematical simple equations can be easily solved using computers. However, with increase in mathematical complexity the solution of the equation becomes increasingly tougher task. This is due to need of more complex techniques of problem solution and more complex algorithmic procedures. Many such systems exist today that help us solve complex mathematical equations. With combination of such systems and an Optical Character Recognition system this problem can be transformed to a new domain of printed expression calculation.

MEXPRESSION FOR MATHEMATICAL EXPRESSION EVALUATION
MExpression is one such android and java based system
Start
Load image
PreProcess image
Image API
Detect characters and separate them
Character detection error?
No Classification task / OCR
(NeuralNetwork)
Error in OCR?
No
Expression in string form
Yes
Yes
under development that does the task of printed expression calculation. The frontend is an android application that allows a user to choose an image of printed expression which is then forwarded to the java based backend that does the task of OCR and then obtains the expression in a form that the java code can calculate it. This is then further processes with a math library to compile and find errors if any. In case of no errors, the expression is evaluated and a response comprising of calculated answer is sent to the frontend. For the task of OCR, a neural network based approach has been taken that uses a java library to train the dataset of fonts which can be used to classify the test image. Along with a basic set of mathematical operations (like addition, subtraction, multiplication, division, power), relational operations (like
Connector 2
Fig. 1. Flowchart part1
Connector 1
Connector 2
Mathematical processing
Compile expression
Error ?
No
Yes
Connector 1
Yes End
Answer
in which an output pixels value is determined as a weighted sum of input pixel values ( ). But most useful filter is Gaussian filter. Gaussian filtering is done by convolving each point in the input array with a Gaussian kernel and then summing them all to produce the output array. Following is an image of 1D Gaussian kernel.
Fig. 4. 1 D Gaussian kernel
Assuming that an image is 1D, it can be noticed that the pixel located in the middle would have the biggest weight. The weight of its neighbors decreases as the spatial distance between them and the center pixel increases.
2D Gaussian can be represented as:


Preprocessing
Evaluate Expression
Evaluate Expression
Fig. 2. Flowchart part2
Where is the mean(peak) and represents the variance.
Once an image of mathematical expression is loaded in the system, the image is preprocessed. Preprocessing step includes enhancing the image and grey scale conversion. This step makes the image ready to be properly processed and removes chances of minor errors in the OCR process [1]. For this purpose, different filters are used. In this work following filters are used for better recognition rate.
1. RGB to Grayscale (Dimension Reduction)
Given input colored image has three dimensional pixel value (RGB). Hence matrix obtained for this input colored image is three dimensional matrix. It is quite difficult to perform image processing technique on three dimensional matrix, hence for suitable and smoother processing the image is converted in gray format. A gray image has two dimensional pixel value which lies between [0, 255] as per the gray value of a pixel [2].
Dimension Reduction
Input Image Grayscale Image
Fig. 3. Dimensional Reduction using RGB to Gray

Gaussian Blur filter
Once grayscale image is obtained, filters are applied for smoothing image. Smoothing, also called blurring, is a simple and frequently used image processing operation. Many reasons are there for use of this filter but here smoothing filters are used in order to reduce noise. There are many filters for smoothing but the most common type of filters are linear,
Fig. 5. Effect of Gaussian filter

Adaptive threshold
Image binarization or thresholding is an important tool in image processing and computer vision, to extract the object pixels in an image from the background pixels. A number of methods have already been proposed for image thresholding. Bilevel image is used as a preprocessing unit in several applications. The use of binary images decreases computational load for the overall application. These applications include document analysis, optical character recognition system, scene matching, quality inspection of materials etc. The thresholding process computes the threshold value that differentiate object and background pixels. Under varying illumination and noise, the thresholding can become a challenging job. A number of factors contribute to complicate the thresholding scheme including ambient illumination, variance of gray levels within the object and the background, inadequate contrast, object shape and size noncommensurate with the scene. A wrong selection of threshold value may misinterpret the background pixel and can classify it as object and vice versa, resulting in overall degradation of system performance. In document analysis, thresholding is sensitive to noise, surrounding illumination, gray Ievel distribution, local shading effects, inadequate contrast, the presence of dense nontext components such as photographs, etc. There are a number of important performance requirements that need
to be considered while thresholding gray level images. These include:
a) Loss of features after thresholding input image should be zero or minimum. b) The features (objects) with similar relative gray levels should have same binary values in the processed output image. c) Thc effect of noise of minor gray level variations should be eliminated.
N1 N2 N3 N4
Nn Input Layer
Hidden
A B C D Layer
W X Y Z
Fig. 6. Effect of Adaptive threshold


Character detection
Next the system will separate the image into individual
M1 M2 M3
Output Layer
Mm
character images. This is done by reading the preprocessed greyscale converted image pixel by pixel. A group of black pixels spaced by a minimum threshold is considered as a character in the image. This can be achieved by moving a variable sized window over the image pixel array and then adjusting the window till a suitable group of pixels with predefined threshold distance has been found [3][4].

Classification / OCR
After detecting each character, the next job is to classify the characters into the clusters of predefined characters. To be straight to the target with the arrow, this step involves OCR technique to detect the character. For this purpose, we use a properly trained neural network that does our job.
Neural network consists of n input neurons. Where n=(height)x(width) of individual character in the detection step [5][6][7].
The number of output neuron is m, where m=the number of total detectable characters.
There could be any amount of hidden neurons but it was experimentally found that 11 to 15 hidden neuron layers give optimal result. (IN our case it is 12 hidden layers) [8][9] [10].
Multilayer perceptrons can easily learn tougher patterns & complex decision boundaries using feed forward and back propagation algorithms.
For any input N = (N1, N2, N3, , Nn) are considered to be input layer. Every connection between 2 neutrons/perceptrons, there is an associated weight, Wab (Weight of a connection from node an in a layer to node b in next layer).
Fig. 7. Neural network: Input layer, Hidden layer, Output layer
Similarly, there are output neurons M = (M1, M2, M3, , Mm) that provide output of the network. In our case the input values, (N1, N2, N3, , Nn) consists of the pixels of the input character detected. The output layer in our case consists of m node (neurons) each representing a character and the output at any output node (neurons) is either 1 or 0, 1 at the place of character detected & 0 for the rest of outputs. This outputs are ideal & may deviate slightly in practical world.
There is a connection between input node (neuron) & output node (neuron) via one or more hidden nodes (neurons).
The value of any node is the weighted sum of all the input values to the node.

Initialization network weights

Until the termination:
{
For each training example:
{
Propagate the input forward to the network and compute the observed outputs.
Propagate the errors backward as follows:
For each network output unit o calculate its error term
Where, Vo is the value of node k

Initialization network weights

Until the termination:
{
For each training example:
{
Propagate the input forward to the network and compute the observed outputs.
Propagate the errors backward as follows:
For each network output unit o calculate its error term
Where, Vo is the value of node k
Back propagation algorithm:
For each hidden unit calculate its error term,
Where, Vk is the value of node k Finally, update each weight
Where,
}
}
For each hidden unit calculate its error term,
Where, Vk is the value of node k Finally, update each weight
Where,
}
}
Fonts:
TABLE I. . FONTS USED FOR TESTING
Important parameters in the above algorithm:
Two most important parameters in BACKPROPAGATION are the learning rate and the momentum.

Learning Rate: in the above algorithm the weights of the nodes are updated by:
Shruti
Candara
Times New Roman
Carlito
Consolas
System
Book Antiqua
Below are a few samples of test dataset
System Font
Canadra Font
BookAntiqua Font
Fig. 8. Image used for testing
,
this parameter is called the learning rate.

Momentum: The weight update part of the equation can be modified in a way that the update in nth iteration also affected by the previous iteration in multiples of what is called a momentum factor. By adding this term to the formula, the update rule will be: . Momentum
takes values in the range 0 < 1.


MathematicalProcessing (Compiling + Evaluation)
For compiling a mathematical expression in we use a classical calculator stack approach that pushes in the operators and operands one by one into a stack and once everything is in the stack, the stack is evaluated one operator at a time by pushing and popping as per the precedence and associativity of the operator. In case of any error, the process is stopped and error is returned.
III. STANDARD DATASET FOR EXPERIMENT
For testing of the above mentioned procedure, a dataset consisting of 7 different fonts was used.
All the mentioned fonts were tested for characters in ASCII sequence of 32 (space) to 126 (~)
With the setup as mentioned above and 7 datasets, the results
were quite impressive. Efficiency pars the OCR process at approximately 91% and the Mathematical evaluation is quite effective to calculate all the operators thrown at it.
Also it is important to note that for certain fonts with similar looking characters, the error rate was high. For Example, for fonts Broadway & Britannic O (capital O), & 0(Zero) looked almost same, so were erroneous.
Britannic Font confusion for Zero & O
Broadway Font confusion for Zero & O
Fig. 9. Confusion for Zero & O in different fonts
Procedure for the app:
Step 1 + 2 + 3 + 4:

Step 5: Check output
Step 1: Open the application
Step 3: Crop the image to select proper expression
Step 2: Select the expression image (capture via camera or browse in the gallery)
Step 4: Wait for the processing
Fig. 11. App screenshots (2)
TABLE II. . EXPERIMENTAL RESULTS: INPUT IMAGE AND EVALUATED EXPRESSIONS
Image 
Detected Expression 
Evaluated Expression 
2^2 
4 

2^2 
4 

abs(90) 
90 

loglo( 100 ) 
Error: no operator loglo 

log10(100) 
2 
Image 
Detected Expression 
Evaluated Expression 
2^2 
4 

2^2 
4 

abs(90) 
90 

loglo( 100 ) 
Error: no operator loglo 

log10(100) 
2 
Fig. 10. App screenshots (1)
max(100,56) 
100 

PI*10*10 
314 

Pi*10*10 
314 

sin(90) 
1 

tan(90) 
16331239350 000000 infinite 

tanh(90) 
1 
Efficiency
100.00
90.00
80.00
70.00
60.00
50.00
40.00
30.00
20.00
10.00
0.00
5 10 20 30 40 50 60 70 78
Number of Fonts in dataset
Efficiency
Efficiency
100.00
90.00
80.00
70.00
60.00
50.00
40.00
30.00
20.00
10.00
0.00
5 10 20 30 40 50 60 70 78
Number of Fonts in dataset
Efficiency
Efficiency
Efficiency
Fig. 12. Experimental results: Input image and Evaluated Expression
With increase in the fonts in the dataset, we are able to achieve a stable efficiency rate of about 90% to 92%. With increase in font for a certain amount, the efficiency went on to plateau and further more increase led to decline of efficiency due to ambiguity in classification.
REFERENCES

P. P. Roy, J. Llados, U. Pal, Text/graphics separation in color maps, in: Computing: Theory and Applications, 2007. ICCTA' 07. International Conference on, IEEE, 2007, pp. 545551.

Li Peng, Junhua Li ,A Facial Expression Recognition Method Based on Quantum Neural Networks, Publication: ISKE2007, October 2007

R. Singh, C. Yadav, P. Verma, V. Yadav, Optical character recognition (ocr) for printed devnagari script using artifcial neural network, international Journal of Computer Science & Communication 1 (1) (2010) 9195.

M. S. Uddin, T. Rahman, U. S. Busra, M. Sultana, Automated extraction of text from images using morphology based approach, proceeding of international journal of electronics & informatics 1 (1).

F. Gounther, S. Fritsch, neuralnet: Training of neural networks, The R journal 2 (1) (2010) 3038.

C. M. Bishop, Neural networks for pattern recognition, Oxford university press, 1995.

S. B. Maind, P. Wankar, Research paper on basic of artificial neural network, International Journal on Recent and Innovation Trends in Computing and Communication 2 (1) (2014) 96100.

C. Peterson, T. Rognvaldsson, L. Lonnblad, Jetnet 3.0a versatile artifcial neural network package, Computer Physics Communications 81 (1) (1994) 185220.

K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural networks 4 (2) (1991) 251257.

R. Lippmann, An introduction to computing with neural nets, IEEE Assp magazine 4 (2) (1987) 422.