COVID-AI: An Artificial Intelligence System to Diagnose COVID-19 Disease


Somil Vasal

Department of Computer Science & Engineering,

Gyan Ganga Institute of Technology and Science, Jabalpur, India

Sourabh Kumar Jain

Department of Computer Science & Engineering,

Gyan Ganga Institute of Technology and Science, Jabalpur, India

Ashok Verma

Department of Computer Science & Engineering,

Gyan Ganga Institute of Technology and Science, Jabalpur, India

Abstract: The COVID-19 pandemic is considered the most important global health disaster of the century and the greatest challenge to mankind since World War II. A novel coronavirus, SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), is responsible for the disease. COVID-19 resembles pneumonia, which makes it difficult to diagnose. Deep learning techniques are gaining importance in medical diagnosis from X-ray images. In this research, predictive models are proposed to help diagnose COVID-19 and pneumonia from chest X-ray images. For ease of use, a graphical user interface displays the severity of the disease for the X-ray image supplied by the user. Data augmentation is used to increase the diversity of the data without collecting new data. We used the deep learning models VGG16, DenseNet121, ResNet50, Inception V3 and Xception for disease classification, compared their results, and found that DenseNet121 gives the highest accuracy, 98.8%.

Keywords: AI; Deep Learning; COVID-19; Classification; Graphical User Interface; Chest X-Ray; Transfer Learning


    According to the World Health Organization (WHO), as of 22 July 2020 the COVID-19 outbreak had affected more than 14,731,563 people and killed more than 611,284 people in more than 216 countries worldwide. There are no reports of any medically approved antiviral drugs or vaccines that are effective against COVID-19. The disease has spread rapidly throughout the world, posing enormous health, economic, environmental and social challenges for the entire human population, and outbreaks are severely disrupting the global economy. Almost all nations are struggling to slow transmission by testing and treating patients, quarantining suspected cases through contact tracing, restricting large gatherings, maintaining full or partial lockdowns, and so on.

    The symptoms of COVID-19 pneumonia may be the same as those of other types of viral pneumonia. Because of this, it can be difficult to identify the condition without testing for COVID-19 or other respiratory infections. Research is underway to determine how COVID-19 pneumonia differs from other types of pneumonia. Information from these studies can potentially help in diagnosis and advance our understanding of how SARS-CoV-2 affects the lungs. One study used CT scans and laboratory tests to compare the diagnostic features of COVID-19 pneumonia with those of other types of pneumonia.

    Artificial Intelligence (AI) can improve traditional medical imaging methods such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and X-ray by offering computational capabilities that process images with greater speed and accuracy, at scale. AI has the potential to improve medical imaging through higher automation, increased productivity, standardized processes and more accurate diagnosis. In this work, computer vision and AI are used to analyze chest X-ray images and identify the disease, picking up patterns that would not normally be recognized with the naked eye. Both conventional machine learning algorithms and deep learning frameworks (deep learning being a machine learning technique that teaches computers to learn by example) can be applied to this task. The AI model was able to predict results with high accuracy, but it could be improved further with the development of new algorithms. The research carried out in this project has led to promising results, and we aim to build on them rapidly to help in the fight against COVID-19.

    In this research paper, because few chest X-ray images of COVID-19 patients are available, we used data augmentation to increase the amount of image data. Data augmentation helps to train the model and achieve better accuracy. Chest X-ray images are used for disease diagnosis in the healthcare system. For the classification of chest X-ray images, we obtained good results using the VGG16, DenseNet121, ResNet50, Inception V3, and Xception models.


    Deep learning requires a lot of training data because of the large number of parameters that the learning algorithm must tune. In the deep learning approach, the image data set is split into training, validation and testing data. The training data is used to train the networks, and after a network has been trained, the test data is used to evaluate its performance. Not all images available in the data sets were used: poor-quality images and non-X-ray images were removed before training, and the networks perform well after this removal. In this research paper, we acquired 4650 chest X-ray images from two different sources. After image data augmentation we have 5000 chest X-ray images: 700 COVID (350 before augmentation), 2150 normal and 2150 pneumonia images.

    1. Covid-chestxray-dataset [1]

      The covid-chestxray-dataset [1] on GitHub is the first public COVID-19 chest X-ray image data collection. It contains 542 frontal chest X-ray images from 262 people in 26 countries, including 434 images of COVID-19 (SARS-CoV-2) patients. Currently, all images and data are released in the GitHub repository [2].

    2. Chest-xray-pneumonia [3]

    The chest-xray-pneumonia dataset has 5863 X-ray images (JPEG) in 2 categories (pneumonia / normal). The dataset is publicly available on Kaggle, where it was proposed for a Kaggle competition. Its chest X-ray images were collected from a retrospective cohort of pediatric patients aged one to five years at the Guangzhou Women and Children's Medical Center, Guangzhou. All chest X-ray imaging was performed as part of the patients' routine clinical care.


    1. Data Preprocessing

      Data preprocessing is a collection of techniques for improving training data by converting raw data into efficient and useful data. In this research, data was collected from 2 different sources and combined into a single collection. The collection consists of 3 classes of chest X-ray (CRX) images (COVID, pneumonia, and normal).

      First, we filtered out non-X-ray images (CT images) from the collection. All good-quality CRX images were resized to 224 × 224 pixels so that the model can be trained with a small number of parameters. Then, using data augmentation [4], the 350 CRX images of COVID patients were expanded to 700 CRX images with a flipping algorithm. A total of 5000 CRX images was obtained after augmentation. These were split into training, validation and test data in the ratio 8:1:1.

      Fig. 1. Data Preprocessing
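The flipping and splitting steps described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' code: the function names, the use of a horizontal (left-to-right) flip, the shuffling seed and the toy 2 × 3 "image" are all assumptions; only the class counts and the 8:1:1 ratio come from the text.

```python
import random

def horizontal_flip(image):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [list(reversed(row)) for row in image]

def split_8_1_1(items, seed=0):
    """Shuffle items and split them into train/validation/test at 8:1:1."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

# Toy 2 x 3 "image" standing in for a 224 x 224 chest X-ray.
tiny = [[1, 2, 3],
        [4, 5, 6]]
flipped = horizontal_flip(tiny)

# Counts reported in the text: 350 COVID images doubled by flipping,
# plus 2150 normal and 2150 pneumonia images.
covid = 350 * 2
total = covid + 2150 + 2150
train, val, test = split_8_1_1(range(total))
print(flipped, total, len(train), len(val), len(test))
```

With the reported counts, the split yields 4000 training, 500 validation and 500 test images.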

    2. Deep Learning Architectures

      Convolutional neural networks (CNNs) are a class of deep neural networks. They can perform complex tasks on images, sounds, text, videos, etc. For image classification problems, CNNs are the most widely used models and provide very good results, while the training time remains reasonable. In this research, we do not have millions of labeled images to train such complex models, so transfer learning has been used to solve this problem. Transfer learning [5] is an approach that stores the knowledge gained while solving one problem and applies it to a different but related problem. It makes it possible to train deep neural networks with a comparatively small amount of data. We used several CNN models and analyzed their performance to get the best results.

      1. VGG16: VGG16 is a convolutional neural network with 16 weight layers, organized as five blocks of small 3×3 convolutional filters. Karen Simonyan and Andrew Zisserman [6] introduced the VGG16 architecture in their 2014 paper Very Deep Convolutional Networks for Large-Scale Image Recognition, and it was among the top entries in the ILSVRC 2014 competition. Evaluated on the ImageNet dataset, which consists of 1000 classes and 14 million images, VGG16 achieves 92.7% top-5 test accuracy. The input is an RGB image of fixed size 224×224. The convolution layers use 3×3 filters, the five max-pooling layers use 2×2 windows, and they are followed by 3 fully connected layers and a final softmax layer. All hidden layers use the ReLU activation.

        Fig. 2. VGG16 Architecture
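The layer arithmetic in the VGG16 description can be checked with a short script. This is a sketch under stated assumptions: the channel widths in `VGG16_CFG` follow the commonly published VGG16 configuration rather than anything listed in this paper, the 3×3 convolutions are assumed to use padding so that they preserve spatial size, and the helper name `summarize` is hypothetical.

```python
# Commonly published VGG16 layout: numbers are output channels of 3x3
# convolutions, "M" marks a 2x2 max-pooling layer with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
NUM_FC_LAYERS = 3

def summarize(cfg, input_size=224):
    """Return (conv layers, pooling layers, final spatial size, final channels)."""
    size, channels, convs, pools = input_size, 3, 0, 0
    for item in cfg:
        if item == "M":
            size //= 2        # each 2x2 pooling halves height and width
            pools += 1
        else:
            channels = item   # padded 3x3 convolution keeps the spatial size
            convs += 1
    return convs, pools, size, channels

convs, pools, size, channels = summarize(VGG16_CFG)
weight_layers = convs + NUM_FC_LAYERS
flattened = size * size * channels
print(weight_layers, pools, size, flattened)
```

The 13 convolutional layers plus 3 fully connected layers give the 16 weight layers in the name, and the five pooling steps reduce a 224×224 input to a 7×7 feature map before the fully connected layers.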

      2. ResNet50: The deep residual network, usually abbreviated ResNet [8], has been one of the most groundbreaking works in the deep learning community in recent years. The architecture makes it possible to train networks with hundreds or even thousands of layers while still maintaining performance. ResNet50 is a 50-layer residual network; deeper versions of the architecture include ResNet101 and ResNet152. ResNet is a CNN architecture from a Microsoft team that won the ILSVRC competition in 2015 and surpassed human performance on the ImageNet dataset. ResNet50 is widely used for transfer learning because it gives promising results. This powerful backbone model is used in many computer vision tasks, mainly because of its skip connections, which add the output of an earlier layer to the output of a later layer. This reduces the vanishing gradient problem when training deep neural networks.

        Fig. 3. ResNet Block
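The skip connection can be illustrated on plain lists of numbers. This is a toy sketch, not the ResNet50 implementation: `residual_block`, `zero_transform` and `halve` are hypothetical names, and the "layers" are simple element-wise functions standing in for convolutions.

```python
def residual_block(x, transform):
    """Return transform(x) + x, i.e. the learned branch plus the skip connection."""
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]

def zero_transform(x):
    """Stand-in for a layer whose weights are all zero: it outputs zeros."""
    return [0.0 for _ in x]

def halve(x):
    """Stand-in for a learned layer that scales its input by 0.5."""
    return [0.5 * xi for xi in x]

x = [1.0, -2.0, 3.0]
identity_out = residual_block(x, zero_transform)  # the block falls back to identity
scaled_out = residual_block(x, halve)             # branch output added to the input
print(identity_out, scaled_out)
```

Because the output is F(x) + x, its derivative with respect to x is F'(x) + 1, so the gradient has a direct path through the block even when F'(x) is small. This is one way to read the claim that skip connections reduce the vanishing gradient problem.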

      3. InceptionV3: Inception-V3 is a convolutional neural network introduced by Szegedy et al. [10], building on the GoogLeNet (Inception) architecture. The network is 48 layers deep and can classify images into 1000 classes. The input size of this model is fixed at 299 × 299. InceptionV3 uses batch normalization, image distortions, and RMSProp, and is built from several small convolutions to reduce the number of parameters. It follows a multi-scale approach, with a multiple-classifier structure that provides multiple sources for backpropagation. The model increases both the width and the depth of the network without a significant computational penalty. In this model, multiple Inception layers apply convolutions to the input feature map at different scales to allow more complex decision making.

        Fig. 4. Inception-V3
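One way small convolutions reduce the parameter count is by replacing a single large filter with a stack of smaller ones, for example one 5×5 convolution with two 3×3 convolutions covering the same receptive field. The sketch below only counts weights; the channel count of 64 is a hypothetical value, biases are ignored, and input and output channel counts are assumed equal.

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

channels = 64  # hypothetical channel count, the same for input and output

single_5x5 = conv_weights(5, channels, channels)
stacked_3x3 = 2 * conv_weights(3, channels, channels)
saving = 1 - stacked_3x3 / single_5x5

print(single_5x5, stacked_3x3, round(saving, 2))
```

Under these assumptions the two stacked 3×3 layers need 18 weights per channel pair instead of 25, a saving of 28%, independent of the channel count chosen.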

      4. Xception: The Xception architecture, introduced by François Chollet [7] in 2017, is an extension of the Inception architecture. Xception is a linear stack of depthwise separable convolution layers with residual connections. The depthwise separable convolution is intended to reduce computational cost and memory requirements. Xception has 36 convolutional layers structured into 14 modules, all of which have linear residual connections around them except for the first and last modules. The separable convolution in Xception separates the learning of channel-wise and space-wise features. In addition, the residual connection helps to solve the issues of vanishing gradients and representational bottlenecks by creating a shortcut in the sequential network. This shortcut makes the output of an earlier layer available as input to a later layer, using summation rather than concatenation.

        Fig. 5. Xception Architecture
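The cost reduction from depthwise separable convolutions can be made concrete by counting weights. This is a back-of-the-envelope sketch: the layer size (3×3 kernel, 128 input channels, 256 output channels) is a hypothetical example, biases are ignored, and the function names are illustrative.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights of a regular k x k convolution: every filter spans all input channels."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights of a depthwise separable convolution (biases ignored)."""
    depthwise = k * k * c_in   # one k x k spatial filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise

standard = standard_conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 1))
```

For this hypothetical layer the separable version uses roughly 8.7 times fewer weights, with the spatial (depthwise) and channel (pointwise) parts learned separately, as the description above states.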

      5. DenseNet121: The densely connected convolutional network (DenseNet) is a convolutional neural network architecture that achieved state-of-the-art classification results on the ImageNet validation dataset. Huang et al. [9] connect each layer directly to every subsequent layer in a feed-forward fashion. Each layer receives the concatenation of the feature maps produced by all preceding layers as its input and applies nonlinear functions such as batch normalization, ReLU, and convolution or pooling. The resulting feature maps of each layer are in turn used as inputs to all subsequent layers. Concatenation is not possible when the sizes of the feature maps differ, so pooling operations, which change the size of the feature maps, play an important role in the architecture.

        Fig. 6. DenseNet Model
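Because each layer's output is concatenated onto everything before it, the channel count grows linearly inside a dense block. The sketch below traces that growth. The constants (growth rate 32, block sizes 6/12/24/16, 64 initial channels, channel halving in the transition layers) are taken from the commonly published DenseNet121 configuration and are assumptions here, since this paper does not list them.

```python
GROWTH_RATE = 32                # feature maps added by each layer (assumed)
BLOCK_LAYERS = (6, 12, 24, 16)  # layers per dense block in DenseNet121 (assumed)
INITIAL_CHANNELS = 64           # channels entering the first dense block (assumed)

def channel_trace(initial, growth, blocks):
    """Return the channel count at the end of each dense block."""
    channels = initial
    trace = []
    for i, n_layers in enumerate(blocks):
        channels += n_layers * growth  # concatenation: each layer appends its maps
        trace.append(channels)
        if i < len(blocks) - 1:
            channels //= 2             # transition layer halves the channels
    return trace

trace = channel_trace(INITIAL_CHANNELS, GROWTH_RATE, BLOCK_LAYERS)
print(trace)
```

The transition layers between blocks are also where pooling changes the feature-map size, which is why concatenation only happens among layers inside the same block.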

    3. Graphical User Interface (GUI)

    In deep learning with Python, APIs such as TensorFlow and Keras allow us to develop predictive models programmatically, but such models are difficult for many users to use directly. A Graphical User Interface (GUI) is a system that allows users to interact visually with computer programs or software. In this research, we used the Flask web framework to create a GUI and hosted it on the cloud. The GUI lets any user apply these models to predict COVID-19 or pneumonia by uploading a chest X-ray image. The GUI is available online, and snapshots of it are shown in Fig. 7.

    Fig. 7. Snapshots of the Graphical User Interface
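A Flask upload endpoint of the kind described above might look roughly like the following. This is a hedged sketch and not the authors' application: the `/predict` route, the `image` form field, the class order in `CLASS_NAMES`, the allowed extensions and the `predict_fn` callback (which is assumed to return a list of three float probabilities) are all hypothetical. Flask is imported inside `create_app` so that the two helper functions can be used on their own.

```python
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg"}
CLASS_NAMES = ("COVID", "normal", "pneumonia")  # assumed label order

def allowed_file(filename):
    """Accept only file names with an image extension."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS

def top_class(probabilities):
    """Return the most probable class name and its probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[best], probabilities[best]

def create_app(predict_fn):
    """Build the web app; predict_fn maps uploaded image bytes to class probabilities."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        upload = request.files.get("image")
        if upload is None or not allowed_file(upload.filename or ""):
            return jsonify(error="please upload a PNG or JPEG chest X-ray"), 400
        label, confidence = top_class(predict_fn(upload.read()))
        return jsonify(label=label, confidence=confidence)

    return app
```

In use, `create_app` would be given a function that decodes the image, resizes it to the network's input size and calls the trained model; the returned app can then be started with `app.run()` or served by a production WSGI server.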


    In the present study, we first removed the non-X-ray images from the data. Then we applied data augmentation with the flip algorithm to the COVID chest X-ray images. After this step, we had 5000 CRX images. We used the five CNN models pre-trained with ImageNet weights and fine-tuned each model for 50 epochs with transfer learning. The batch size is set to 32, and the Adam optimizer is used to minimize the loss, with a learning rate of 0.001. All images are down-sampled to 224×224 before being fed to the neural network (as these pre-trained models are usually trained with a specific image resolution). All our implementations use the TensorFlow API. The accuracy of the predictive models on the test data is shown in Table 1.
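The training setup described above could be expressed with the Keras API roughly as follows. This is a minimal sketch under assumptions, not the authors' script: the classification head (global average pooling followed by a 3-way softmax), the categorical cross-entropy loss and the function names are hypothetical choices, while the hyperparameters come from the text. TensorFlow is imported inside `build_model` so the configuration constants can be inspected without loading it.

```python
IMAGE_SIZE = (224, 224)
BATCH_SIZE = 32
EPOCHS = 50
LEARNING_RATE = 0.001
NUM_CLASSES = 3  # COVID, normal, pneumonia

def build_model(weights="imagenet"):
    """DenseNet121 backbone with a new 3-class head (the head design is an assumption)."""
    import tensorflow as tf

    base = tf.keras.applications.DenseNet121(
        include_top=False, weights=weights, input_shape=IMAGE_SIZE + (3,))
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        loss="categorical_crossentropy",
        metrics=["accuracy"])
    return model

def steps_per_epoch(num_train_images, batch_size=BATCH_SIZE):
    """Number of batches needed to see every training image once."""
    return -(-num_train_images // batch_size)  # ceiling division

print(steps_per_epoch(4000))
```

Training would then be a call such as `build_model().fit(train_data, validation_data=val_data, epochs=EPOCHS)`, where `train_data` and `val_data` are assumed to yield batches of 32 resized images with one-hot labels. With 4000 training images, each epoch takes 125 steps.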


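    Table 1. Accuracy of the predictive models on the test data (surviving column headers: S. No., CNN Architecture; surviving row label: Inception V3)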

    We obtained the final results of the trained models in terms of the confusion matrix (a table used to describe the performance of a classification model) and the classification report (used to measure the quality of the models' predictions). The confusion matrices and classification reports of the proposed models are shown in Fig. 8, Fig. 9, Fig. 10, Fig. 11 and Fig. 12.

    Fig. 8. Confusion matrix and classification report of VGG16 model

    Fig. 9. Confusion matrix and classification report of ResNet50 model

    Fig. 10. Confusion matrix and classification report of Inception V3 model

    Fig. 11. Confusion matrix and classification report of Xception model

    Fig. 12. Confusion matrix and classification report of DenseNet121 model
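For readers unfamiliar with these two summaries, the sketch below computes a confusion matrix and the per-class precision, recall and F1 values of a classification report in plain Python. The eight labels in `y_true` and `y_pred` are made-up toy data for illustration only and are not results from this study; the function names are likewise illustrative.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def classification_report(matrix, labels):
    """Per-class precision, recall and F1 derived from the confusion matrix."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report

def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

labels = ["COVID", "normal", "pneumonia"]
# Toy labels for illustration only.
y_true = ["COVID", "COVID", "normal", "normal", "normal",
          "pneumonia", "pneumonia", "pneumonia"]
y_pred = ["COVID", "pneumonia", "normal", "normal", "normal",
          "pneumonia", "pneumonia", "normal"]

matrix = confusion_matrix(y_true, y_pred, labels)
report = classification_report(matrix, labels)
print(matrix, accuracy(matrix), report["COVID"])
```

Precision answers "of the images predicted as this class, how many were right", while recall answers "of the images that truly belong to this class, how many were found". Reporting both per class matters here because the COVID class is much smaller than the other two.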


The coronavirus affects different parts of the body, such as the respiratory system (including the lungs), the heart and blood vessels, the kidneys, the gut, and the brain. In some patients, the virus spreads to many organs. If the virus is not detected at an early stage, it invades the lower respiratory tract. The mechanism proposed in this paper is an automated system based on different deep learning models. The accuracy of each model is assessed with the confusion matrix and classification report. From Table 1, the best overall accuracy of the proposed system is 98.8%, obtained with DenseNet121. The proposed model can detect not only COVID-19 but also other potentially fatal diseases such as pneumonia. It can be useful for rapidly analyzing medical data in the field of medicine.


  1. J. Cohen et al., COVID-19 Image Data Collection: Prospective Predictions Are the Future, arXiv:2006.11988v1, Jun 2020.

  2. covid-chestxray-dataset, GitHub repository.

  3. chest-xray-pneumonia, Kaggle dataset.

  4. A. Mikołajczyk and M. Grochowski, Data augmentation for improving deep learning in image classification problem, International Interdisciplinary PhD Workshop (IIPhDW).

  5. L. Torrey and J. Shavlik, Transfer Learning, DOI: 10.4018/978-1-60566-766-9.ch011.

  6. K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556, 2014.

  7. F. Chollet, Xception: Deep learning with depthwise separable convolutions, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.

  8. K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, Dec 2015.

  9. G. Huang, Z. Liu, L. van der Maaten, K. Q. Weinberger, Densely Connected Convolutional Networks, CVPR 2017.

  10. C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the Inception Architecture for Computer Vision, 2015.
