COVID-AI: An Artificial Intelligence System to Diagnose COVID-19 Disease

Abstract—The COVID-19 pandemic is considered the most significant global health disaster of the century and the greatest challenge to mankind since World War II. A new class of coronavirus, known as SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), is responsible for this disease. COVID-19 presents similarly to pneumonia, which makes it difficult to diagnose. Deep learning techniques are gaining increasing importance in medical diagnosis through the analysis of X-ray scan images. In this research, predictive models are proposed to help diagnose COVID-19 and pneumonia from chest X-ray images. For ease of use, a graphical user interface displays the severity of the disease on the X-ray image supplied by the user. Data augmentation has been used to increase the diversity of the data without actually collecting new data. We trained several deep learning models, namely VGG16, DenseNet121, ResNet50, InceptionV3 and Xception, for disease classification, compared their results, and found that the highest accuracy, 98.8%, is obtained from the DenseNet121 model.


INTRODUCTION
The current outbreak of COVID-19, according to the report of the World Health Organization (WHO, as of 22 July 2020), has affected more than 14,731,563 people and killed more than 611,284 people in more than 216 countries and territories worldwide. There are no reports of any medically approved antiviral drugs or vaccines that are effective against COVID-19. The disease has spread rapidly throughout the world, posing enormous health, economic, environmental and social challenges for the entire human population, and the outbreak is severely disrupting the global economy. Almost all nations are struggling to slow transmission by testing and treating patients, quarantining suspected persons through contact tracing, restricting large gatherings, and maintaining full or partial lockdowns.
The symptoms of COVID-19 pneumonia may be the same as those of other types of viral pneumonia. Because of this, it can be difficult to tell what a patient's condition is without testing for COVID-19 or other respiratory infections. Research is underway to determine how COVID-19 pneumonia differs from other types of pneumonia; information from these studies can potentially aid diagnosis and advance our understanding of how SARS-CoV-2 affects the lungs. One study used CT scans and laboratory tests to compare the diagnostic features of COVID-19 pneumonia to those of other types of pneumonia.
Artificial Intelligence (AI) can improve traditional medical imaging methods such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and X-ray by offering computational capabilities that process images with greater speed and accuracy, at scale. AI has the potential to improve medical imaging through higher automation, increased productivity, standardized processes and more accurate diagnosis. Computer vision can be used to analyze chest X-ray images, picking up patterns that would not normally be recognized with the naked eye. Both conventional machine learning algorithms and deep learning frameworks (deep learning being a machine learning technique that teaches computers to learn by example) can be applied to this task. Our AI model was able to predict results with great accuracy, and could be improved further with the development of new algorithms. The research carried out in this project has led to some extremely promising results, and we are looking to build on this success rapidly to help in the fight against COVID-19.
In this research paper, due to the limited availability of chest X-ray images of COVID-19 patients, we used data augmentation to increase the amount of image data; this helps train the models and achieve better accuracy. Chest X-ray images are widely used in disease diagnosis in the healthcare system. For the classification of chest X-ray images, we obtained good results using the VGG16, DenseNet121, ResNet50, InceptionV3, and Xception models.

II. DATASET
Deep learning requires a lot of training data because of the large number of parameters that must be tuned by the learning algorithm. In the deep learning approach, the image dataset is split into training, validation and testing data. The training set is used to train the networks; after a network is trained, the test set is employed to evaluate its performance. Not all images available in the dataset were used: some poor-quality images and non-X-ray images were removed before training, and the networks perform better after this cleaning. In this research, we acquired 4650 chest X-ray images from two different sources. After image data augmentation we have 5000 chest X-ray images, including 700 COVID-19 (350 before augmentation), 2150 normal and 2150 pneumonia chest X-ray images.

B. Chest-xray-pneumonia [3]
The chest-xray-pneumonia dataset has 5863 X-ray images (JPEG) in 2 categories (pneumonia / normal). This dataset is publicly available on Kaggle, where it was proposed for a Kaggle competition. The chest X-ray images were collected from a retrospective cohort of pediatric patients aged one to five years from Guangzhou Women and Children's Medical Center, Guangzhou. All chest X-ray imaging was performed as part of routine clinical care.

III. METHODOLOGY
A. Data Preprocessing
Data preprocessing is a collection of techniques for improving training data; it converts raw data into an efficient and useful form. In this research, data collected from two different sources were combined into a single collection consisting of three classes of chest X-ray (CRX) images: COVID, pneumonia, and normal.
First, we filtered non-X-ray (CT) images out of the collection. All good-quality CRX images were resized to 224 x 224 pixels to train the models with a small number of parameters. Then, using data augmentation [4], we converted the 350 CRX images of COVID patients into 700 CRX images with the help of a flipping algorithm. A total of 5000 CRX images were obtained after augmentation. These were then split into training, validation and test data with a ratio of 8:1:1.
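The augmentation and splitting steps above can be sketched as follows. This is an illustrative, pure-Python sketch, not the authors' actual pipeline: images are represented here as small lists of pixel rows, while the real pipeline works on 224 x 224 chest X-ray files.

```python
# Sketch of the preprocessing described above: a horizontal flip
# doubles the COVID class (350 -> 700), and the pooled 5000 images
# are split 8:1:1 into train/validation/test sets.
import random

def hflip(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [row[::-1] for row in image]

def augment_flip(images):
    """Double a class by appending a flipped copy of every image."""
    return images + [hflip(im) for im in images]

def split_8_1_1(items, seed=0):
    """Shuffle and split into train/val/test with an 8:1:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train, n_val = int(0.8 * len(items)), int(0.1 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

covid = [[[0, 1], [2, 3]]] * 350          # 350 toy 2x2 "images"
covid = augment_flip(covid)               # -> 700 after flipping
dataset = covid + [None] * (2150 + 2150)  # plus normal and pneumonia
train, val, test = split_8_1_1(dataset)
print(len(covid), len(train), len(val), len(test))  # 700 4000 500 500
```

Shuffling before the split keeps the three classes roughly proportionally represented in each partition.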

B. Deep Learning Architectures
Convolutional Neural Networks (CNNs) are a class of deep learning neural networks that can perform complex tasks with images, sounds, texts, videos, etc. For image classification problems a multilayer perceptron (MLP) is often used and provides good results with reasonable training time, but in this research we do not have millions of labeled examples with which to train such complex models. Transfer learning was used to solve this problem. Transfer learning [5] is an approach that stores the knowledge gained while solving one problem and applies it to a different but related problem, and it makes it possible to train deep neural networks with a comparatively small amount of data. We used several CNN models and analyzed their performance to get the best results.
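A minimal transfer-learning sketch in the spirit of this setup is shown below, with DenseNet121 as the example backbone. In the paper the backbone is initialized with ImageNet weights; here `weights=None` is used only so the sketch runs without downloading pretrained weights, and the 3-way head corresponds to the COVID / pneumonia / normal classes.

```python
# Transfer-learning sketch: frozen CNN backbone + new softmax head.
import tensorflow as tf

def build_model(num_classes=3, weights=None):
    """Pre-trained backbone with a new classification head on top."""
    base = tf.keras.applications.DenseNet121(
        include_top=False, weights=weights, input_shape=(224, 224, 3))
    base.trainable = False  # freeze backbone; train only the new head
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_model()  # pass weights="imagenet" for transfer learning
print(model.output_shape)  # (None, 3)
```

Freezing the backbone keeps the ImageNet features fixed while only the small new head is fitted, which is what makes training feasible on a few thousand images.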

3) InceptionV3: InceptionV3 is a type of convolutional neural network introduced by Szegedy et al. as part of the GoogLeNet (Inception) family [10]. The network is 48 layers deep and can classify images into 1000 classes. The input size of this model is fixed at 299 × 299. InceptionV3 uses batch normalization, image distortion and RMSProp, and is based on several small convolutions to reduce the number of parameters. It follows a multi-scale approach, combining a multiple-classifier structure with multiple sources for backpropagation. This model increases both the width and the depth of the network without incurring a penalty: multiple perception layers are applied to the convolution on the input feature map at different scales to allow more complex decision making.
4) Xception: Xception, proposed in [7] in 2017, is an extension of the Inception architecture. Xception is a linear stack of depthwise separable convolution layers with residual connections; the depthwise separable convolution is intended to reduce computational cost and memory requirements. Xception has 36 convolutional layers, structured into 14 modules, all with linear residual connections except for the first and last modules. The separable convolution in Xception separates the learning of channel-wise and space-wise features. In addition, the residual connections help solve the issues of vanishing gradients and representational bottlenecks by creating a shortcut in the sequential network: the output of an earlier layer is made available as an input to a later layer using a summation operation instead of concatenation.
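The Xception building block described above can be illustrated as follows. This is an educational sketch (not the official Xception code): two depthwise separable convolutions, a downsampling pooling step, and a 1x1-projected residual shortcut merged by summation rather than concatenation.

```python
# Xception-style module: separable convs + residual shortcut by Add.
import tensorflow as tf

def xception_style_block(x, filters):
    """Two separable convs plus a 1x1-projected residual shortcut."""
    # Shortcut: 1x1 conv with stride 2 so shapes match for summation.
    shortcut = tf.keras.layers.Conv2D(filters, 1, strides=2,
                                      padding="same")(x)
    y = tf.keras.layers.SeparableConv2D(filters, 3, padding="same",
                                        activation="relu")(x)
    y = tf.keras.layers.SeparableConv2D(filters, 3, padding="same")(y)
    y = tf.keras.layers.MaxPooling2D(3, strides=2, padding="same")(y)
    return tf.keras.layers.Add()([shortcut, y])  # summation, not concat

inputs = tf.keras.Input(shape=(224, 224, 3))
outputs = xception_style_block(inputs, filters=64)
print(outputs.shape)  # (None, 112, 112, 64)
```

The `SeparableConv2D` layer factors a standard convolution into a per-channel spatial convolution followed by a 1x1 pointwise convolution, which is the parameter saving the text refers to.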

C. Graphical User Interface (GUI)
In deep learning with Python, APIs like TensorFlow and Keras allow us to develop predictive models programmatically, but such models are difficult for most users to operate directly. A Graphical User Interface (GUI) is a system that allows users to interact visually with computer programs or software. In this research, we used the Flask web framework to create a GUI and then hosted it in the cloud. The GUI lets any user easily apply these models and predict COVID-19 or pneumonia by uploading chest X-ray images. The URL of the GUI is https://covid-ai-app.herokuapp.com, and snapshots of the GUI are shown in Fig. 7.
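A Flask front end of this kind can be sketched as below. This is a hypothetical skeleton, not the authors' actual application: the route name, field name, and placeholder prediction are illustrative, and the real app would preprocess the upload to 224 x 224 and call the trained model.

```python
# Minimal Flask endpoint: accept an uploaded chest X-ray, return a
# JSON diagnosis chosen from the three classes used in the paper.
from flask import Flask, request

app = Flask(__name__)
CLASSES = ["COVID", "normal", "pneumonia"]

@app.route("/predict", methods=["POST"])
def predict():
    file = request.files["xray"]  # uploaded chest X-ray image
    # Real app: resize to 224x224, run model.predict(...), and take
    # the argmax over the three class probabilities.
    label = CLASSES[0]            # placeholder prediction
    return {"diagnosis": label}

# app.run()  # uncomment to serve locally; the paper hosts it on Heroku
```

Hosting such an app on a platform like Heroku only requires adding the model file and a small amount of deployment configuration on top of this skeleton.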

IV. EXPERIMENTAL RESULTS
In the present study, we first removed the non-X-ray images from the data. Then, we applied data augmentation with the flip algorithm to the COVID chest X-ray images; after this step, we had 5000 CRX images. We used five pre-trained CNN models with ImageNet weights and fine-tuned each model for 50 epochs with transfer learning. The batch size was set to 32, and the ADAM optimizer was used to optimize the loss, with a learning rate of 0.001. All images were downsampled to 224 x 224 before being fed to the neural networks, as these pre-trained models are usually trained at a specific image resolution. All our implementations were done with the TensorFlow API. The accuracy of the predictive models on the test data is shown in Table 1. We obtained the final results of these trained models in terms of the confusion matrix (a table used to describe the performance of a classification model) and classification report (used to measure the quality of predictions from the models). The confusion matrices and classification reports of the proposed models are shown in Fig. 8, Fig. 9, Fig. 10, Fig. 11 and Fig. 12.

V. CONCLUSION
Coronavirus affects different parts of the body such as the respiratory system (the lungs), the heart and blood vessels, the kidneys, the gut, and the brain. In some patients, the virus spreads its deadly network to many organs; if it is not detected at an early stage, it invades the lower respiratory tract. The mechanism we propose in this paper is an automated system based on different deep learning models. The accuracy of the results obtained by each model is assessed through its confusion matrix and classification report. From Table 1, the overall accuracy of the proposed system is 98.8%, obtained from DenseNet121. Through the proposed model we are able to detect not only COVID-19 but also other serious diseases such as pneumonia.
Such a model can be useful for rapidly analyzing medical imaging data in clinical practice.