Deep Learning Algorithms for Breast Cancer Image Classification

DOI : 10.17577/IJERTCONV8IS03011

T. Sathyapriya

Research Scholar, Department of Computer Science

Vivekanandha College of Arts and Science for Women (Autonomous), Namakkal, India.

Dr. T. Ramaprabha

Professor, Department of Computer Science

Vivekanandha College of Arts and Science for Women (Autonomous), Namakkal, India.

Abstract – Breast cancer is one of the leading causes of death among women between the ages of 30 and 45. Mammography is the standard method for early detection of breast cancer. Much research has been done on the diagnosis and detection of breast cancer using various image processing and classification techniques [1]. Since the cause of breast cancer remains unclear, prevention is difficult, so early detection offers the best chance of a cure. Applying Computer Aided Diagnosis (CAD) to mammographic images is among the most appropriate and accessible ways to diagnose breast cancer, and accurate detection can effectively reduce the death rate. Masses and clusters of microcalcifications are important early signs of possible breast cancer and can help predict it at an early stage. Radiologists increasingly use and request CAD because it helps them make accurate diagnoses and improves outcome predictions. Digital image classification gives doctors and physicians a second opinion and saves them time. This paper emphasizes the Convolutional Neural Network (CNN) method for breast image classification. Along with CNN, it covers the conventional Neural Network (NN), logic-based classifiers such as the Random Forest (RF) algorithm, Support Vector Machines (SVM), Bayesian methods, and a few semi-supervised and unsupervised methods that have been used for breast image classification [2].

  1. INTRODUCTION

    According to the World Health Organization, breast cancer is the most common cancer in women worldwide. Many types of cancer can develop in the human body; among them, breast cancer poses a serious health problem. Women are at far greater risk of breast cancer than men. Risk factors include age, family history, breast density, obesity, and alcohol intake. Statistics indicate that the incidence of breast cancer is rising sharply. Figure 1 shows the number of women newly diagnosed with breast cancer and the number of women dying from it in Australia since 2007. These figures describe Australia (population 20-25 million), but they are indicative of the breast cancer situation worldwide.

    Figure 1. New cases of breast cancer for women and number of women dying in the last twelve years.

    Early diagnosis is the first step in the proper treatment of any disease. There is an urgent need for a good pre-screening tool that can spot abnormalities in mammogram images at an early stage in order to identify breast cancer. Manual investigation of such images depends largely on the expertise of the doctors and physicians. Because humans are error-prone, even a specialist can misinterpret diagnostic images. Computer Aided Diagnosis (CAD) techniques are widely used for biomedical image analysis tasks such as cancer identification and classification. The use of CAD gives the patient and the doctor a second opinion.

    Breast cancer tumors can be categorized into two broad categories:

    1. Benign (Noncancerous): Benign tumors are considered noncancerous, that is, non-life-threatening, although on a few occasions a benign tumor can turn cancerous. The immune system normally segregates a benign tumor from other cells inside a sac, so the tumor can be easily removed from the body [1].
    2. Malignant (Cancerous): Malignant cancer starts from abnormal cell growth and may rapidly spread to or invade nearby tissue. The nuclei of malignant tissue are normally much bigger than those of normal tissue, and the condition can become life-threatening in later stages. Cancer is always a life-threatening disease, and proper treatment saves people's lives. Identifying normal, benign, and malignant tissue is therefore a very important step toward further treatment. To distinguish benign from malignant conditions, imaging of the targeted area of the body helps the doctor and the physician in further diagnosis.

    Figure 2. Benign (left) and malignant (right) images.

  2. BREAST IMAGE CLASSIFICATION

    Researchers have used various algorithms and investigation methods to study breast images from different perspectives, depending on the demands of the disease, the status of the disease, and the quality of the images. For breast image classification in particular, machine learning (ML) and artificial intelligence (AI) are heavily utilized. A general breast image classifier consists of five stages:

    1. Selection of a breast database
    2. Feature extraction and selection
    3. Classifier model
    4. Performance measuring parameter
    5. Classifier output.
        1. Available Breast Image Databases.

          Doctors and physicians rely heavily on ultrasound, MRI, X-ray, and similar images to determine the present status of breast cancer. To ease the doctors' work, some research groups are investigating how to use computers more reliably for breast cancer diagnostics. To make a reliable decision about the cancer outcome, researchers base their investigations on well-established image databases. Various organizations have introduced image databases that are available to researchers for further investigation.

          The MIAS, DDSM, and INbreast databases contain mammogram images. According to the Springer (http://www.springer.com), Elsevier (https://www.elsevier.com), and IEEE (http://www.ieeexplore.ieee.org) web sites, researchers have mostly used the MIAS and DDSM databases for breast image classification research [1].

        2. Feature Extraction and Selection.

          Feature extraction is vital to overall system performance in the classification of microcalcifications. The extracted features are distinguished according to the extraction method and the image characteristics. The features considered here are texture features and statistical measures such as mean, standard deviation, variance, smoothness, skewness, uniformity, entropy, and kurtosis [1].
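As an illustration of the statistical measures listed above, the following minimal sketch computes them from the gray-level histogram of an image region. The function name `histogram_features` and the exact formulas (a commonly used first-order histogram formulation, with smoothness defined as 1 - 1/(1 + normalized variance)) are assumptions made for this example; the paper does not specify which definitions were used.

```python
import math
from collections import Counter


def histogram_features(pixels, levels=256):
    """First-order statistics of a flat list of gray levels in [0, levels - 1].

    Hypothetical helper: the formulas are one common convention, not
    necessarily the ones used in the cited work.
    """
    n = len(pixels)
    counts = Counter(pixels)
    # Normalized histogram p(z): probability of each gray level that occurs.
    p = {z: c / n for z, c in counts.items()}

    mean = sum(z * pz for z, pz in p.items())
    variance = sum((z - mean) ** 2 * pz for z, pz in p.items())
    std = math.sqrt(variance)

    # Smoothness uses the variance normalized to [0, 1] by (levels - 1)^2.
    norm_var = variance / (levels - 1) ** 2
    smoothness = 1 - 1 / (1 + norm_var)

    # Standardized third and fourth central moments (set to 0 for a flat region).
    m3 = sum((z - mean) ** 3 * pz for z, pz in p.items())
    m4 = sum((z - mean) ** 4 * pz for z, pz in p.items())
    skewness = m3 / std ** 3 if std > 0 else 0.0
    kurtosis = m4 / std ** 4 if std > 0 else 0.0

    uniformity = sum(pz ** 2 for pz in p.values())
    entropy = -sum(pz * math.log2(pz) for pz in p.values())

    return {
        "mean": mean,
        "std": std,
        "variance": variance,
        "smoothness": smoothness,
        "skewness": skewness,
        "uniformity": uniformity,
        "entropy": entropy,
        "kurtosis": kurtosis,
    }


# A toy 2x2 region, flattened: half black, half white.
features = histogram_features([0, 0, 255, 255])
```

In practice the pixel list would come from a region of interest in a mammogram; the resulting dictionary is one feature vector per region.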

          Features which are extracted for classification do not always carry the same importance. Some features may even contribute to degrading the classifier performance. Prioritization of the feature set can reduce the classifier model complexity and so it can reduce the computational time. Feature set selection and prioritization can be classified into three broad categories:

          1. Filter: the filter method selects features without evaluating any classifier algorithm.
          2. Wrapper: the wrapper method selects the feature set based on the evaluation performance of a particular classifier.
          3. Embedded: the embedded method takes advantage of the filter and wrapper methods for classifier construction.
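To make the filter category concrete, here is a minimal sketch that ranks features by the absolute Pearson correlation between each feature column and the class label, without training any classifier. The scoring criterion and the names `pearson` and `filter_select` are assumptions chosen for illustration; many other filter criteria exist.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def filter_select(rows, labels, k):
    """Return the indices of the k features most correlated with the labels.

    This is a filter method: features are scored from the data alone,
    and no classifier is evaluated during selection.
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Toy data: feature 0 tracks the label, feature 1 is constant, feature 2 is noisy.
rows = [[1.0, 5.0, 0.3], [2.0, 5.0, 0.1], [3.0, 5.0, 0.4], [4.0, 5.0, 0.2]]
labels = [0, 0, 1, 1]
selected = filter_select(rows, labels, k=2)
```

A wrapper method would instead retrain a chosen classifier on candidate subsets and keep the subset with the best validation score, which is more expensive but tailored to that classifier.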
        3. Classifier Model.

          Based on the learning point of view, breast image classification techniques can be categorized into the following three classes [6]:

          1. Supervised
          2. Unsupervised
          3. Semisupervised.

            Each of these three classes can be split into Deep Neural Network (DNN) classifiers and conventional classifiers (without DNN).

        4. Performance Measuring Parameter.

      A Confusion Matrix is a two-dimensional table that gives a visual summary of classification experiments [7]. The (i, j)th position of the confusion table indicates the number of times that an object of the ith class is classified as belonging to the jth class.

                               Hypothesized class
                               Benign                Malignant
      True class   Benign      True positive (A)     False negative (B)
                   Malignant   False positive (C)    True negative (D)

      Figure 3: Confusion Matrix.

      The diagonal of this matrix indicates the number of times the objects are correctly classified. Figure 3 shows a graphical representation of a Confusion Matrix for the binary classification case.
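The counting rule behind the matrix can be sketched in a few lines. The function name `confusion_matrix` and the label strings are assumptions made for this example; rows index the true class and columns the hypothesized class, matching the (i, j) convention described above.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


# Toy example with five samples (hypothetical labels).
truth = ["benign", "benign", "benign", "malignant", "malignant"]
predicted = ["benign", "malignant", "benign", "malignant", "benign"]
cm = confusion_matrix(truth, predicted, classes=["benign", "malignant"])

# The diagonal holds the correctly classified samples.
correct = sum(cm[i][i] for i in range(len(cm)))
```

Here `cm[0][0]` corresponds to cell A of Figure 3, `cm[0][1]` to B, `cm[1][0]` to C, and `cm[1][1]` to D.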

      Among the different classification performance measures, this matrix provides the following parameters, where TP, FN, FP, and TN denote the true positive (A), false negative (B), false positive (C), and true negative (D) counts of Figure 3:

      1. Recall is defined as Recall = TP/(TP + FN).
      2. Precision is defined as Precision = TP/(TP + FP).
      3. Specificity is defined as Specificity = TN/(TN + FP).
      4. Accuracy is defined as ACC = (TP + TN)/(TP + TN + FP + FN).
      5. F1 score is defined as F1 = (2 × TP)/(2 × TP + FP + FN).
      6. Matthews Correlation Coefficient (MCC): MCC is a performance parameter of a binary classifier, in the range [-1, +1]. The closer the MCC is to +1, the more accurate the classifier; the closer it is to -1, the less accurate. MCC can be defined as

      $$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$
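The following sketch evaluates these parameters from the four confusion-matrix counts, using the standard definitions (F1 as 2TP/(2TP + FP + FN), and MCC with the square root of the product of the four marginal sums in the denominator). The function name `classification_metrics` and the example counts are illustrative assumptions, not results from the paper.

```python
import math


def classification_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes every ratio has a non-zero denominator, except MCC,
    which is reported as 0.0 when its denominator vanishes.
    """
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
```

MCC uses all four cells of the matrix, which is why it stays informative when the benign and malignant classes are heavily imbalanced, a case where accuracy alone can be misleading.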

  3. CLASSIFIER MODEL ON BREAST IMAGES

    In supervised learning, a general hypothesis is built from externally supplied instances in order to make predictions about future instances. For the supervised classification task, features are identified or automatically created from the available dataset, and each sample is mapped to a dedicated class. A hypothesis is created from the features and their labels, and unknown data are then classified on the basis of that hypothesis [10].

    In general, the whole dataset is separated into training and testing parts. Sometimes part of the data is also set aside as a validation set. After the data separation, the most important step is to find the appropriate features to classify the data with the utmost accuracy.
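The data separation step can be sketched as a shuffled three-way split. The 70/15/15 proportions, the fixed seed, and the name `split_dataset` are assumptions chosen for this example; the paper does not prescribe particular ratios.

```python
import random


def split_dataset(samples, train=0.7, val=0.15, seed=0):
    """Shuffle the samples and split them into train, validation, and test parts.

    The test part receives whatever remains after the train and
    validation shares have been taken.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    train_part = shuffled[:n_train]
    val_part = shuffled[n_train:n_train + n_val]
    test_part = shuffled[n_train + n_val:]
    return train_part, val_part, test_part


# 100 hypothetical sample identifiers.
train_set, val_set, test_set = split_dataset(range(100))
```

For medical images it is usually advisable to split by patient rather than by image, so that images of the same patient never appear in both the training and the testing parts.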

      1. Convolutional Neural Networks

        Convolutional neural networks are deep artificial neural networks inspired by the working principles of the human brain. CNNs are used to classify images: they can identify faces, individuals, street signs, tumors, platypuses, and many other aspects of visual data. The convolutional layer is the major building block of a CNN. By contrast, a single-layer perceptron linearly combines the input signals and gives a result based on a threshold function. Building on this working principle, with more advanced mechanisms and engineering, NN methods have established a strong footprint in many problem-solving areas.

        During the forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the filter and the input at each position and producing a 2-dimensional activation map for that filter. As a result, the network learns filters that activate when they see a specific type of feature at some spatial position in the input. A CNN also has a fully connected layer that classifies the output with one label per node.
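The forward pass of a single filter can be sketched as follows. This is a minimal illustration assuming a single-channel image, stride 1, and no padding; like most deep learning frameworks it computes a cross-correlation (the filter is not flipped). The function name `conv2d_valid` and the edge-detecting filter are assumptions for the example.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Returns the 2-dimensional activation map as a list of rows.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            # Dot product between the filter and the image patch beneath it.
            acc = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(acc)
        out.append(row)
    return out


# A 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
# A filter that responds to dark-to-bright transitions from left to right.
edge_filter = [[-1, 1], [-1, 1]]
activation = conv2d_valid(image, edge_filter)
```

The activation map is non-zero only in the column where the filter straddles the edge, which is the sense in which a filter "activates" on a specific feature at a specific spatial position.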

        Figure 4: A generalized supervised classifier model

      2. Deep Neural Network

    Deep neural networks follow the structure of a typical artificial neural network but with a more complex network model. A DNN has n hidden layers, each of which processes the data passed on from the previous layer, beginning with the input layer. After every epoch, the error on the input data is gradually reduced by adjusting the weights of every node through backpropagation, and training continues until satisfactory results are reached. Any number of inputs can be assigned as input nodes in the input layer. Normally, the hidden layers of a DNN contain more nodes than the input layer in order to strengthen the learning process. The outputs are defined individually as unique output nodes in the output layer.
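The weight-adjustment loop described above can be sketched with a very small network. This is only an illustration under stated assumptions: a single hidden layer (the simplest case of the n hidden layers mentioned), sigmoid activations, squared error, per-sample gradient descent, and a toy logical-AND dataset. The function names and hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_network(data, n_hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer network by backpropagation.

    Returns the mean squared error recorded in each epoch.
    """
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Each hidden node has n_in weights plus a bias (last entry).
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    # The output node has n_hidden weights plus a bias (last entry).
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            # Forward pass through the hidden layer and the output node.
            h = [sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1])
                 for ws in w_hidden]
            out = sigmoid(sum(w * hi for w, hi in zip(w_out[:-1], h)) + w_out[-1])
            total += (out - y) ** 2
            # Backward pass: error signal at the output, then at each hidden node
            # (the constant factor 2 of the squared error is absorbed into lr).
            d_out = (out - y) * out * (1 - out)
            d_hidden = [d_out * w_out[j] * h[j] * (1 - h[j])
                        for j in range(n_hidden)]
            # Gradient-descent weight updates.
            for j in range(n_hidden):
                w_out[j] -= lr * d_out * h[j]
                for i in range(n_in):
                    w_hidden[j][i] -= lr * d_hidden[j] * x[i]
                w_hidden[j][-1] -= lr * d_hidden[j]
            w_out[-1] -= lr * d_out
        losses.append(total / len(data))
    return losses


# Toy dataset: logical AND of two binary inputs.
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
loss_history = train_network(and_data)
```

Comparing `loss_history[0]` with `loss_history[-1]` shows the error shrinking over epochs; a practical DNN would add more hidden layers and rely on an established framework rather than hand-written loops.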

      1. Logic Based Algorithm

        Logic-based algorithms allow us to create more than one tree and merge the decisions of those trees for an improved result; this mechanism is known as an ensemble method.

        The Random Forest (RF) classifier is a significant tree-based ensemble model. It is a combination of tree predictors in which each tree depends on the values of a random vector that is sampled independently and with the same distribution for all trees. In other words, RF integrates separate base classifiers, where every tree is grown using a random vector sampled independently from the classifier input, which allows the trees to be produced quickly. To classify a sample, the individual votes of all trees are combined according to a voting rule.
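The ensemble idea can be sketched with a deliberately simplified forest. Each "tree" here is a one-level decision stump trained on a bootstrap sample and a single randomly chosen feature, and the final label is the majority vote. A real Random Forest grows full decision trees and draws a random feature subset at every split; the function names and toy data below are assumptions for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature.

    Returns (feature, threshold, left_label, right_label), where
    'left' means value <= threshold.
    """
    best = None
    for t in sorted(set(r[feature] for r in rows)):
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, (feature, t, l_lab, r_lab))
    return best[1]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw len(rows) indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        feature = rng.randrange(len(rows[0]))
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature)
        )
    return forest


def predict(forest, row):
    """Combine the individual votes of all trees by majority rule."""
    votes = [l if row[f] <= t else r for f, t, l, r in forest]
    return Counter(votes).most_common(1)[0][0]


# Toy feature vectors (hypothetical values) for two well-separated classes.
rows = [[1, 10], [2, 11], [3, 12], [8, 20], [9, 21], [10, 22]]
labels = ["benign"] * 3 + ["malignant"] * 3
forest = fit_forest(rows, labels)
```

Because every stump sees a different bootstrap sample, individual stumps may place their thresholds differently, and the vote averages out those differences.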

      2. Support Vector Machine (SVM)

    Support vector machines (SVMs), first introduced by Vapnik, have shown their effectiveness in many pattern recognition problems [15], and they can provide better classification performance than many other classification techniques.

    An SVM classifier performs binary classification, i.e., it separates a set of training vectors for two different classes $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$, where $x_i \in \mathbb{R}^d$ denotes a vector in a $d$-dimensional feature space and $y_i \in \{-1, +1\}$ is a class label. The SVM model is generated by mapping the input vectors onto a new higher-dimensional feature space, denoted $\Phi : \mathbb{R}^d \rightarrow H^f$, where $d < f$. Then an optimal separating hyperplane in the new feature space is constructed using a kernel function $K(x_i, x_j)$, which is the inner product of the mapped input vectors: $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$ [16].
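The relation between the kernel and the mapping can be checked numerically for one specific kernel. The sketch below assumes the homogeneous polynomial kernel of degree 2 on two-dimensional inputs (d = 2), for which an explicit map into a three-dimensional space (f = 3) is known; the choice of kernel and the function names are assumptions for illustration, since the text does not fix a particular kernel.

```python
import math


def phi(x):
    """Explicit map from R^2 to R^3 for the degree-2 polynomial kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, computed directly in the input space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, z)     # computed without ever forming phi
mapped_value = dot(phi(x), phi(z))   # same quantity via the explicit map
```

Both values agree up to floating-point rounding, which illustrates why the hyperplane can be constructed in the higher-dimensional space while only kernel evaluations on the original vectors are needed.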

  4. CONCLUSION

In the current era, many people suffer from modern-day diseases. Breast cancer is one of the most common dangerous diseases, and its incidence is increasing over time across countries. Lack of awareness and late identification of the disease are major reasons for high death rates. Computer-aided diagnosis can help people of all kinds obtain an accurate diagnosis.

A CAD system is not a perfect substitute for professional doctors, but it helps practitioners make correct decisions when studying patient reports. Practitioners may sometimes make mistakes because of a lack of practice or poor analysis of reports, so CAD brings greater clarity to the current medical environment. It can also help patients receive timely feedback about the disease, which can improve patient management.

REFERENCES

  1. Prannoy Giri, K. Saravanakumar, Breast Cancer Detection using Image Processing Techniques, Orient. J. Comp. Sci. & Technol., Vol. 10(2), 391-399 (2017).
  2. Abdullah-Al Nahid, Yinan Kong, Involvement of Machine Learning for Breast Cancer Image Classification: A Survey, Hindawi, Computational and Mathematical Methods in Medicine, Volume 2017.
  3. Abdullah-Al Nahid, Yinan Kong, Histopathological Breast-Image Classification Using Local and Frequency Domains by Convolutional Neural Network, Information 2018, 9, 19; doi:10.3390/info9010019.
  4. Duraisamy S, Emperumal S, Computer-aided mammogram diagnosis system using deep learning convolutional fully complex-valued relaxation neural network classifier. Deep Learning in Computer Vision, 2017;11(8):656-662. https://doi.org/10.1049/iet-cvi.2016.0425.
  5. Samala RK. Multi-task transfer learning deep convolutional neural network: application to computer-aided diagnosis of breast cancer on mammograms. Physics in Medicine & Biology 2017;62(23):8894-8908. https://doi.org/10.1088/1361-6560/aa93d4.
  6. S. B. Kotsiantis, I. D. Zaharakis, and P. E. Pintelas, Machine learning: A review of classification and combining techniques, Artificial Intelligence Review, vol. 26, no. 3, pp. 159-190, 2006.
  7. N. D. Marom, L. Rokach, and A. Shmilovici, Using the confusion matrix for improving ensemble classifiers, in Proceedings of the 2010 IEEE 26th Convention of Electrical and Electronics Engineers in Israel, IEEEI 2010, pp. 555-559, Israel, November 2010.
  8. Rangaraj M. Rangayyan, Fabio J. Ayres, and J. E. Leo Desautels, A review of computer-aided diagnosis of breast cancer: Toward the detection of subtle signs, Journal of the Franklin Institute, vol. 344, no. 3, pp. 312-348, 2007.
  9. S. Naresh, S. Vani Kumari, Breast Cancer Detection using Local Binary Patterns, International Journal of Computer Applications (0975-8887), Volume 123, No. 16, August 2015, pp. 6-9.
  10. S. B. Kotsiantis, Supervised machine learning: a review of classification techniques, in Proceedings of the 2007 Conference on Emerging Artificial Intelligence Applications in Computer Engineering: Real Word AI Systems with Applications in eHealth, HCI, Information Retrieval and Pervasive Technologies, pp. 3-24, 2007.
  11. S. C. Tai, Z. S. Chen, and W. T. Tsai, An automatic mass detection system in mammograms based on complex texture features, Journal of Biomedical and Health Informatics, vol. 18, no. 2, pp. 618-627, 2014.
  12. Z. Yang, M. Dong, Y. Guo et al., A new method of microcalcifications detection in digitized mammograms based on improved simplified PCNN, Neurocomputing, vol. 218, pp. 79-90, 2016.
  13. J. N. Silva, A. O. C. Filho, A. C. Silva, A. C. Paiva, and M. Gattass, Automatic detection of masses in mammograms using quality threshold clustering, correlogram function, and SVM, Journal of Digital Imaging, vol. 28, no. 3, pp. 323-337, 2015.
  14. Karen Simonyan and Andrew Zisserman, Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv:1409.1556v5 [cs.CV], 23 Dec 2014.
  15. Byun H., Lee S.-W., A survey on pattern recognition applications of support vector machines. International Journal of Pattern Recognition and Artificial Intelligence, 2003; 17(3): 459-486.
  16. Min-Wei Huang, Chih-Wen Chen, Wei-Chao Lin, Shih-Wen Ke, and Chih-Fong Tsai, SVM and SVM Ensembles in Breast Cancer Prediction, PLoS One. 2017; 12(1): e0161501.
  17. Y. Chen, L. Ling, and Q. Huang, Classification of breast tumors in ultrasound using biclustering mining and neural network, in Proceedings of the 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2016, pp. 1787-1791, China, October 2016.
