Traditional Machine Learning Method vs Neural Networks (on the MNIST Handwritten Dataset)

Arish Izhar

doi:10.17577/IJERTV10IS090122

Volume 10, Issue 09 (September 2021)


Open Access
Authors : Arish Izhar
Paper ID : IJERTV10IS090122
Volume & Issue : Volume 10, Issue 09 (September 2021)
Published (First Online): 22-09-2021
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License



Computer Science

Birla Institute of Technology Mesra, Ranchi New Delhi, India

Abstract: In this paper we compare the accuracy of two modelling approaches on the task of digit identification for the MNIST handwritten dataset. On the one hand, we use a well-known traditional machine learning classifier, Logistic Regression; on the other hand, we use Artificial Neural Networks (ANNs) for the same task. Traditional machine learning algorithms tend to perform at the same level as the data size increases, whereas ANNs outperform them. Three groups of models are trained: one for the entire dataset, one to detect the digit 2 and one to detect digits that are not 2. In addition, we add a hidden layer to the neural networks to offer more insight.

INTRODUCTION

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that NIST's data was not well suited for machine learning experiments, because the training dataset was taken from American Census Bureau employees while the testing dataset was taken from American high school students. Furthermore, the black and white images from NIST were normalized to fit into a 28×28 pixel bounding box and anti-aliased, which introduced grayscale levels.

The MNIST database contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of the training set and the other half of the test set were taken from NIST's testing dataset.
METHOD
1. Methodology Overview
  
  The MNIST data is freely available online. It is split into 60,000 training images and 10,000 testing images. The images were flattened and shuffled before being used to train both the Logistic Regression classifier and the neural network. The code was written in Python using Sklearn, TensorFlow and Keras.
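As a rough illustration of the flattening and shuffling step, the sketch below uses NumPy on a small stand-in array rather than the real MNIST images. The helper name `flatten_and_shuffle` and the fixed seed are assumptions made for this example and are not taken from the original code.

```python
import numpy as np

def flatten_and_shuffle(images, labels, seed=0):
    """Flatten each 2-D image to a 1-D vector and shuffle images and labels together."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    flat = images.reshape(len(images), -1)  # e.g. (n, 28, 28) -> (n, 784)
    order = np.random.default_rng(seed).permutation(len(flat))
    return flat[order], labels[order]

# Stand-in data with MNIST-like shapes (not the real dataset).
fake_images = np.arange(6 * 28 * 28).reshape(6, 28, 28)
fake_labels = np.arange(6)
X, y = flatten_and_shuffle(fake_images, fake_labels)
print(X.shape)  # (6, 784)
```

Applying one permutation to both arrays keeps every image paired with its label after shuffling.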
  
  In order to gain a better insight into the difference between the methods, three different sets of models were trained for the dataset. One set was a general model for the entire dataset, the second was to identify just the number 2 and the third was a model to identify not 2. In addition, another neural network with a hidden layer was trained in each set.
  
  The sets are as follows:
  1. Set 1 –
    1. Logistic Regression classifier on the entire dataset
    2. Neural Network trained on the entire dataset
  2. Set 2 –
    1. Logistic Regression classifier to find 2
    2. Neural Network trained to identify 2
  3. Set 3 –
    1. Logistic Regression classifier to find not 2
    2. Neural Network trained to find not 2
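The "2" and "not 2" detectors in Sets 2 and 3 need binary targets derived from the digit labels. One plausible way to build them is sketched below; the function name `make_binary_targets` is hypothetical, and the paper does not say how the targets were actually constructed.

```python
import numpy as np

def make_binary_targets(labels, digit=2):
    """Return (is_digit, is_not_digit) boolean target arrays for one digit."""
    labels = np.asarray(labels)
    is_digit = labels == digit
    return is_digit, ~is_digit

labels = np.array([2, 7, 2, 0, 9])
y_is_2, y_not_2 = make_binary_targets(labels)
print(y_is_2.astype(int))   # [1 0 1 0 0]
print(y_not_2.astype(int))  # [0 1 0 1 1]
```

The two targets are exact complements, which is consistent with the regression model reporting the same accuracy for both detectors.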
      
  Additionally, each model was evaluated with K-fold cross-validation using three folds. To evaluate the performance of any machine learning model, we need to test it on unseen data. Based on the model's performance on unseen data, we can say whether it is under-fitting, over-fitting or well generalized. This is why we used K-fold cross-validation.
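To make the three-fold idea concrete, here is a minimal pure-Python sketch of how sample indices can be split into K folds, with each fold serving once as the held-out set. The helper `k_fold_indices` is illustrative only; the original work most likely relied on library routines for this.

```python
def k_fold_indices(n_samples, k=3):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(7, k=3))
for train_idx, val_idx in folds:
    print(val_idx)
# [0, 1, 2]
# [3, 4]
# [5, 6]
```

Every sample is used for validation exactly once and for training in the other two folds.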
      
  Once the models were trained, we computed their accuracy and the confusion matrix, along with performance metrics such as precision, recall and F1 score. The same was done for the cross-validation scores and cross-validation predictions.
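For reference, the metrics named above can be derived from the four cells of a binary confusion matrix. The sketch below uses only the standard library and made-up labels; the function name `binary_metrics` is an assumption for this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels: 2 true positives, 1 false positive, 1 false negative, 2 true negatives.
metrics = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(metrics)
```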
2. Logistic Regression Model
  
  The logistic regression classifier was trained using the lbfgs solver and a tolerance of 0.1.
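A minimal sketch of this configuration in scikit-learn is shown below. It runs on small synthetic clusters standing in for flattened images, so the scores it prints say nothing about MNIST; the synthetic data, the seed and the use of `cross_val_score` with `cv=3` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Two well-separated synthetic clusters standing in for flattened images.
X = np.vstack([
    rng.normal(-1.5, 1.0, size=(60, 20)),
    rng.normal(1.5, 1.0, size=(60, 20)),
])
y = np.array([0] * 60 + [1] * 60)

# Solver and tolerance as stated in the text; other settings are library defaults.
clf = LogisticRegression(solver="lbfgs", tol=0.1)
clf.fit(X, y)

# Three-fold cross-validated accuracy, matching the three folds described earlier.
cv_scores = cross_val_score(clf, X, y, cv=3, scoring="accuracy")
print(clf.score(X, y), cv_scores.mean())
```

A tolerance of 0.1 is much looser than the library default, so the solver stops early; this trades some precision in the fitted weights for shorter training time.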
3. Neural Network Model
The neural networks used the sigmoid activation function, the adam optimizer and the sparse_categorical_crossentropy loss function, and were trained for five epochs. Each network had a dense layer of 10 units (neurons).

The neural network with the hidden layer had an additional layer with 100 dense units using the relu activation function.
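The two architectures described above could be expressed in Keras roughly as follows. This is a sketch under assumptions: the input size of 784 (flattened 28×28 images), the placement of the 100-unit relu layer before the 10-unit sigmoid layer, and the helper name `build_network` are not stated in the paper.

```python
from tensorflow import keras

def build_network(hidden_units=None, n_inputs=784, n_outputs=10):
    """Dense network with a sigmoid output layer and an optional relu hidden layer."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_inputs,)))
    if hidden_units is not None:
        # Assumed placement: hidden layer sits between the input and the output layer.
        model.add(keras.layers.Dense(hidden_units, activation="relu"))
    model.add(keras.layers.Dense(n_outputs, activation="sigmoid"))
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

plain = build_network()
with_hidden = build_network(hidden_units=100)
print(plain.count_params(), with_hidden.count_params())  # 7850 79510
```

Training would then be a call such as `model.fit(X_train, y_train, epochs=5)`, where `X_train` and `y_train` are placeholder names for the flattened images and their integer labels.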
RESULTS

1. Results for the entire dataset

  Regression Model: Accuracy = 0.9255

  Neural Network: Accuracy = 0.9258999824523926

  Neural Network with a hidden layer: Accuracy = 0.9739999771118164

  Regression Model Cross-Validation Scores: Accuracy = 0.9172333333333

  Neural Network Cross-Validation Scores: Accuracy = 0.9191500147183737

  Neural Network with a hidden layer Cross-Validation Scores: Accuracy = 0.9242333372433981

  [Classification report figures for these models are not reproduced in this text version.]
2. Results for the 2 Detector

  Regression Model: Accuracy = 0.9802

  Neural Network: Accuracy = 0.9811999797821045

  Neural Network with a hidden layer: Accuracy = 0.9954000115394592

  Regression Model Cross-Validation Scores: Accuracy = 0.9781

  Neural Network Model Cross-Validation Scores: Accuracy = 0.9799000024795532

  Neural Network with a hidden layer Model Cross-Validation Scores: Accuracy = 0.9799166719118754

  [Classification report figures and a precision-recall curve for these models are not reproduced in this text version.]
3. Results for the not 2 Detector

  Regression Model: Accuracy = 0.9802

  Neural Network Model: Accuracy = 0.9810000061988831

  Neural Network with a hidden layer Model: Accuracy = 0.9948999881744385

  Regression Model Cross-Validation Scores: Accuracy = 0.9781

  Neural Network Model Cross-Validation Scores: Accuracy = 0.9802666505177816

  Neural Network with a hidden layer Model Cross-Validation Scores: Accuracy = 0.9976666569709778

  [Classification report figures and a precision-recall curve for these models are not reproduced in this text version.]
CONCLUSION


For the full dataset classifiers

Looking at the accuracy scores (all percentages are relative improvements in accuracy):

The neural network is 0.0432177% better than the regression classifier.

The neural network with a hidden layer is 5.2403% better than the regression classifier.

The neural network with a hidden layer is 5.19484% better than the neural network with no hidden layers.

Now looking at the cross-validation accuracies:

The neural network is 0.208965% better than the regression classifier.

The neural network with a hidden layer is 0.763165% better than the regression classifier.

The neural network with a hidden layer is 0.553044% better than the neural network with no hidden layers.
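The percentages quoted for the full dataset are consistent with a relative improvement computed from the accuracies in the Results section. That reading is inferred from the numbers rather than stated in the paper, and the function name `relative_improvement` is an assumption for this sketch.

```python
def relative_improvement(baseline, candidate):
    """Percentage by which `candidate` exceeds `baseline`, relative to `baseline`."""
    return (candidate - baseline) / baseline * 100.0

# Full-dataset accuracies taken from the Results section.
regression = 0.9255
neural_net = 0.9258999824523926
hidden_net = 0.9739999771118164

print(round(relative_improvement(regression, neural_net), 4))  # 0.0432
print(round(relative_improvement(regression, hidden_net), 2))  # 5.24
```

In absolute terms the hidden-layer network gains about 4.85 percentage points over the regression classifier, so relative and absolute differences should not be confused when reading these figures.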
For the detect 2 classifiers

Looking at the accuracy scores:

The neural network is 0.101918% better than the regression classifier.

The neural network with a hidden layer is 1.5507% better than the regression classifier.

The neural network with a hidden layer is 1.44731% better than the neural network with no hidden layers.

Now looking at the cross-validation accuracies:

The neural network is 0.18403% better than the regression classifier.

The neural network with a hidden layer is 0.185666% better than the regression classifier.

The neural network with a hidden layer is 0.00163282% better than the neural network with no hidden layers.
For the detect not 2 classifiers

Looking at the accuracy scores:

The neural network is 0.081616% better than the regression classifier.

The neural network with a hidden layer is 1.49968% better than the regression classifier.

The neural network with a hidden layer is 1.41691% better than the neural network with no hidden layers.

Now looking at the cross-validation accuracies:

The neural network is 0.221511% better than the regression classifier.

The neural network with a hidden layer is 2.00047% better than the regression classifier.

The neural network with a hidden layer is 1.77503% better than the neural network with no hidden layers.

In each case, the neural network with a single hidden layer appears to perform better than the regression classifier, while the neural network without a hidden layer is also better, but only by a very small margin.
