Traditional Machine Learning Method vs Neural Networks (on the MNIST Handwritten Dataset)

Arish Izhar

Computer Science

Birla Institute of Technology Mesra, Ranchi New Delhi, India

Abstract In this paper we compare the accuracy of two modelling approaches on the task of identifying digits in the MNIST handwritten dataset. On the one hand, we use a well-known traditional machine learning classifier, Logistic Regression; on the other hand, we use Artificial Neural Networks (ANNs) for the same task. Traditional machine learning algorithms tend to plateau in performance as the data size increases, whereas ANNs outperform them. Three groups of models are trained: one for the entire dataset, one to detect the digit 2, and one to detect digits that are not 2. In addition, we add a hidden layer to the neural networks to offer more insight.

  1. INTRODUCTION

    The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. The database is also widely used for training and testing in the field of machine learning. It was created by "re-mixing" the samples from NIST's original datasets. The creators felt that since NIST's training dataset was taken from American Census Bureau employees, while the testing dataset was taken from American high school students, it was not well-suited for machine learning experiments. Furthermore, the black and white images from NIST were normalized to fit into a 28×28 pixel bounding box and anti-aliased, which introduced grayscale levels.

    The MNIST database contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other half of the training set and the other half of the test set were taken from NIST's testing dataset.

  2. METHOD

    1. Methodology Overview

      The MNIST data is freely available online. It is split into 60,000 training images and 10,000 testing images. The data was flattened and shuffled before it was used to train both the Logistic Regression classifier and the neural network. The code was written in Python using scikit-learn (sklearn), TensorFlow and Keras.
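
The flatten-and-shuffle step can be illustrated with a minimal sketch. It uses only the Python standard library and a toy stand-in for the images; the function names, the seed and the toy data are assumptions made for illustration and are not taken from the paper's code, which would more likely operate on arrays loaded through Keras.

```python
import random


def flatten_image(image):
    """Turn a 2-D image (a list of pixel rows) into one flat list of pixels."""
    return [pixel for row in image for pixel in row]


def flatten_and_shuffle(images, labels, seed=0):
    """Flatten every image and shuffle images and labels with one shared permutation."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    order = list(range(len(images)))
    random.Random(seed).shuffle(order)  # seed is a hypothetical choice for repeatability
    flat = [flatten_image(images[i]) for i in order]
    shuffled_labels = [labels[i] for i in order]
    return flat, shuffled_labels


# Toy stand-in for MNIST: four 2x2 "images" whose pixels all equal their label.
toy_images = [[[d, d], [d, d]] for d in range(4)]
toy_labels = [0, 1, 2, 3]
flat_images, shuffled = flatten_and_shuffle(toy_images, toy_labels, seed=42)
```

Using one permutation for both lists keeps every image paired with its label. A real 28×28 MNIST image would flatten to 784 values in the same way.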

      In order to gain a better insight into the difference between the methods, three different sets of models were trained for the dataset. One set was a general model for the entire dataset, the second was to identify just the number 2 and the third was a model to identify not 2. In addition, another neural network with a hidden layer was trained in each set.

      The sets are as follows:

      1. Set 1 –

        1. Logistic Regression classifier on the entire dataset

        2. Neural Network trained on the entire dataset

      2. Set 2 –

        1. Logistic Regression classifier to find 2

        2. Neural Network trained to identify 2

      3. Set 3 –

        1. Logistic Regression classifier to find not 2

        2. Neural Network trained to find not 2
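
The paper does not show how the targets for Sets 2 and 3 were built. One plausible way, sketched here under that assumption, is to recode the ten digit labels as 0/1 targets. The helper name `binary_targets` and its parameters are hypothetical.

```python
def binary_targets(labels, digit=2, invert=False):
    """Return 1 where the label equals `digit` (or differs from it if invert=True), else 0."""
    if invert:
        return [int(label != digit) for label in labels]
    return [int(label == digit) for label in labels]


labels = [5, 2, 0, 2, 9]
is_two = binary_targets(labels)                    # Set 2: positive class is "2"
is_not_two = binary_targets(labels, invert=True)   # Set 3: positive class is "not 2"
```

The two target vectors are exact complements of each other, so the Set 2 and Set 3 models see the same split of the data with the positive class swapped.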

      Additionally, each model was evaluated with K-fold cross-validation using three folds. To evaluate the performance of any machine learning model, we need to test it on unseen data. Based on the model's performance on unseen data, we can say whether it is under-fitting, over-fitting or well generalized. This is why we used K-fold cross-validation.
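
To make the three-fold procedure concrete, the sketch below splits sample indices into contiguous folds, each used once as the held-out validation set. It is a standard-library illustration only: the function name is hypothetical, and the paper does not say how its folds were formed.

```python
def k_fold_indices(n_samples, n_folds=3):
    """Yield (train_indices, validation_indices) for each of n_folds contiguous folds."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds get one additional sample so every sample is used.
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(10, n_folds=3))
for train, validation in folds:
    print(len(train), "training samples,", len(validation), "validation samples")
```

Every sample is held out exactly once, so each prediction used for the cross-validation metrics comes from a model that did not see that sample during training.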

      Once the models were trained, we computed their accuracy and confusion matrix, along with performance metrics such as precision, recall and F1 score. The same was done for the cross-validation scores and cross-validation predictions.
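
For a binary detector, these metrics follow directly from the confusion-matrix counts. The sketch below shows the arithmetic on made-up predictions. The function names and toy data are illustrative assumptions, and the paper's own figures were presumably produced with library routines.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives and true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the positive class (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Precision is the share of predicted positives that are correct, recall is the share of actual positives that are found, and F1 is their harmonic mean.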

    2. Logistic Regression Model

      The logistic regression classifier was trained using the lbfgs solver and a tolerance of 0.1.
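
A minimal sketch of how such a classifier could be configured in scikit-learn is given below. Only the solver and tolerance come from the text. Every other setting is left at the library default, which is an assumption, and the helper names are hypothetical. The import is done inside the function so the settings can be inspected without scikit-learn installed.

```python
def logistic_regression_settings():
    """Settings stated in the text; anything not listed is left at the library default."""
    return {"solver": "lbfgs", "tol": 0.1}


def make_logistic_regression():
    """Build an unfitted scikit-learn classifier with the settings above."""
    from sklearn.linear_model import LogisticRegression  # imported lazily

    return LogisticRegression(**logistic_regression_settings())
```

A classifier built this way would then be fitted with `make_logistic_regression().fit(X_train, y_train)`, where `X_train` holds the flattened images and `y_train` the labels.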

    3. Neural Network Model

      The neural networks were trained for five epochs using the sigmoid activation function, the adam optimizer and the sparse_categorical_crossentropy loss function. Each had 10 dense units (neurons).

      The neural network with the hidden layer had an additional dense layer of 100 units using the relu activation function.
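
The two architectures can be sketched in Keras as follows. The unit counts, activations, optimizer and loss come from the text. The input size of 784 (a flattened 28×28 image), the placement of the 100-unit layer before the 10-unit output layer, and the function names are assumptions. TensorFlow is imported inside the builder so the layer plan can be inspected on its own.

```python
def layer_plan(hidden_layer=False):
    """Return the (units, activation) pairs for the dense layers, input side first."""
    plan = []
    if hidden_layer:
        plan.append((100, "relu"))   # assumed to sit before the output layer
    plan.append((10, "sigmoid"))     # 10-unit layer described in the text
    return plan


def build_network(hidden_layer=False, n_inputs=784):
    """Build and compile a Keras model following layer_plan()."""
    import tensorflow as tf  # imported lazily

    model = tf.keras.Sequential([tf.keras.Input(shape=(n_inputs,))])
    for units, activation in layer_plan(hidden_layer):
        model.add(tf.keras.layers.Dense(units, activation=activation))
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Training for five epochs would then be `build_network(hidden_layer=True).fit(X_train, y_train, epochs=5)`, with `X_train` and `y_train` standing for the flattened images and their integer labels.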

  3. RESULTS

    1. Results for the entire dataset

      Regression Model: Accuracy = 0.9255

      Neural Network: Accuracy = 0.9258999824523926

      Neural Network with a hidden layer: Accuracy = 0.9739999771118164

      Regression Model Cross-Validation Scores: Accuracy = 0.9172333333333

      Neural Network Cross-Validation Scores: Accuracy = 0.9191500147183737

      Neural Network with a hidden layer Cross-Validation Scores: Accuracy = 0.9242333372433981

      [Figures: classification reports for the models above]

    2. Results for the 2 Detector

      Regression Model: Accuracy = 0.9802

      Neural Network: Accuracy = 0.9811999797821045

      Neural Network with a hidden layer: Accuracy = 0.9954000115394592

      Regression Model Cross-Validation Scores: Accuracy = 0.9781

      Neural Network Model Cross-Validation Scores: Accuracy = 0.9799000024795532

      Neural Network with a hidden layer Model Cross-Validation Scores: Accuracy = 0.9799166719118754

      [Figures: classification reports for the models above; precision-recall curve]

    3. Results for the not 2 Detector

      Regression Model: Accuracy = 0.9802

      Neural Network Model: Accuracy = 0.9810000061988831

      Neural Network with a hidden layer Model: Accuracy = 0.9948999881744385

      Regression Model Cross-Validation Scores: Accuracy = 0.9781

      Neural Network Model Cross-Validation Scores: Accuracy = 0.9802666505177816

      Neural Network with a hidden layer Model Cross-Validation Scores: Accuracy = 0.9976666569709778

      [Figures: classification reports for the models above; precision-recall curve]

  4. CONCLUSION

  1. For the full dataset classifiers

    Looking at the accuracy scores:

    The neural network is 0.0432177% better than the regression classifier.

    The neural network with a hidden layer is 5.2403% better than the regression classifier.

    The neural network with a hidden layer is 5.19484% better than the neural network with no hidden layers.

    Now looking at the cross-validation accuracies:

    The neural network is 0.208965% better than the regression classifier.

    The neural network with a hidden layer is 0.763165% better than the regression classifier.

    The neural network with a hidden layer is 0.553044% better than the neural network with no hidden layers.

  2. For the detect 2 classifiers

    Looking at the accuracy scores:

    The neural network is 0.101918% better than the regression classifier.

    The neural network with a hidden layer is 1.5507% better than the regression classifier.

    The neural network with a hidden layer is 1.44731% better than the neural network with no hidden layers.

    Now looking at the cross-validation accuracies:

    The neural network is 0.18403% better than the regression classifier.

    The neural network with a hidden layer is 0.185666% better than the regression classifier.

    The neural network with a hidden layer is 0.00163282% better than the neural network with no hidden layers.

  3. For the detect not 2 classifiers

    Looking at the accuracy scores:

    The neural network is 0.081616% better than the regression classifier.

    The neural network with a hidden layer is 1.49968% better than the regression classifier.

    The neural network with a hidden layer is 1.41691% better than the neural network with no hidden layers.

    Now looking at the cross-validation accuracies:

    The neural network is 0.221511% better than the regression classifier.

    The neural network with a hidden layer is 2.00047% better than the regression classifier.

    The neural network with a hidden layer is 1.77503% better than the neural network with no hidden layers.

In each case, the neural network with a single hidden layer outperforms the regression classifier, while the neural network without a hidden layer is also better, but only by a very small margin.
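
The percentages above appear to be relative differences in accuracy, although the paper does not state the formula. The sketch below computes them on that assumption for the full-dataset test accuracies reported in the Results section. The function and variable names are hypothetical.

```python
def relative_improvement_percent(new_accuracy, baseline_accuracy):
    """Relative gain of new_accuracy over baseline_accuracy, in percent."""
    return (new_accuracy - baseline_accuracy) / baseline_accuracy * 100.0


# Full-dataset test accuracies as reported in the Results section.
regression = 0.9255
network = 0.9258999824523926
network_hidden = 0.9739999771118164

comparisons = {
    "network vs regression": relative_improvement_percent(network, regression),
    "hidden network vs regression": relative_improvement_percent(network_hidden, regression),
    "hidden network vs network": relative_improvement_percent(network_hidden, network),
}

for name, value in comparisons.items():
    print(f"{name}: {value:.4f}% better")
```

Under this reading, the values for the full dataset agree with the reported figures to within rounding.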
