- Open Access
- Authors : Anjali Kshirsagar , Shubham More , Saurabh Kale , Rutuj Anturkar, Sushma Vispute
- Paper ID : IJERTV10IS010226
- Volume & Issue : Volume 10, Issue 01 (January 2021)
- Published (First Online): 03-02-2021
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
An Application of Predictive Analytics for Early Detection of Sepsis: An Overview
Pimpri Chinchwad College of Engineering Pune, India
Prof. Sushma Vispute
Pimpri Chinchwad College of Engineering Pune, India
Abstract: Sepsis, commonly called blood poisoning, is the adverse consequence of the body's response to an infection, which results in organ damage. It was recognized as a global health priority by the World Health Organization in 2017. Detecting sepsis at an earlier stage can save lives and cut down financial expenses for the patient. Various studies have been carried out to develop early prediction models for detecting sepsis in patients. Owing to advancements in Machine Learning and Artificial Intelligence, these fields have great application in medicine. Machine learning algorithms are used to predict sepsis in advance and help people get proper medication. Different machine learning algorithms can diagnose or predict sepsis and thus can prevent its progression. This work provides a comparative analysis of different Machine Learning algorithms applied to the dataset from the PhysioNet website, which was made freely available for the 2019 challenge. The algorithms are compared on metrics such as accuracy and F1-score.
Keywords: Sepsis, machine learning, diagnosing, prediction methods, organ failure, Multi Layer Perceptron, Gaussian Naive Bayes, K-Nearest Neighbor, Adaboost Classifier, Extra Tree Classifier, Linear Discriminant Analysis.
Organ dysfunctions associated with an infection are diagnosed as sepsis. Sepsis is the primary cause of death due to infection, especially if it is not noticed and treated immediately, so it requires immediate intervention once recognized. Sepsis is a syndrome shaped by pathogenic factors and host factors that develop over time. What distinguishes sepsis from infection is the presence of an abnormal or irregular host response and organ failure.
With the increased usage of Machine Learning in the field of medicine, these methods now support the early prediction and treatment of many diseases. The learning, reasoning and decision making abilities of Machine Learning can likewise be used to predict the early stages of sepsis and to assess the sepsis level.
We studied different sepsis prediction and detection techniques that use various machine learning methods. This work will help beginners in this research area to get information about the different work associated with sepsis.
Sepsis is a fatal condition resulting from an infection, which affects at least 26 million people in the world every year. Sepsis is seen in 149-240 of every 100,000 people and has a mortality rate of 30%. According to one study, the sepsis death rate in India is 213 for every 100,000 individuals. Sepsis is a condition that is associated with infection and involves a disordered reaction leading to organ system failure. To diagnose sepsis, the presence of infection in the patient is determined first. For this, the presence of infection in the lungs, the detection of bacterial growth or bacterial infection in the hemoculture of the patient during bacterial screening, the presence of intra-abdominal infection, new antibiotic therapy and other infections are investigated.
In 2012, E. Gultepe et al. published work on a clustering technique for extracting features of sepsis. They ran experiments using the extracted features with and without lactate levels and derived the relationship between them using Bayesian networks. They excluded an important parameter, heart rate, from their experiment. The features used for this study were Respiratory Rate (RR), White Blood Cells (WBC), temperature, Mean Arterial Pressure (MAP), lactate, length of hospital stay in days (LOS) and sepsis occurrences (SO). From the electronic medical records of 1492 patients, 233 cases of sepsis were used to construct the Bayesian networks. The first network, BN1, has temperature, RR, WBC, lactate, LOS and SO as parameters, and the second network, BN2, adds the MAP test to these parameters. Because it gave the lowest loss function value, the BGE scoring criterion was chosen for learning BN1. BN2 was learned using both BIC and BGE; BIC, having the lower loss function value, was used for the final network. BN1 had lower estimated goodness of fit compared with BN2. The relationship between lactate levels and sepsis was obtained for the sepsis patients using the Bayesian network. It was concluded that lactate levels may be predictive of the SIRS criteria.
Umut Kaya et al. proposed a model using multi-layered artificial neural networks to support sepsis diagnosis. The artificial neural network models were built with a feed forward back propagation network structure and the Levenberg Marquardt training algorithm. Using data on intensive care patients aged 18-65 years in Istanbul, the model attempted to predict the risk of developing sepsis. The inputs and outputs of the model were derived from the 2017 definition of sepsis and from the algorithms and variables that physicians use to diagnose sepsis and determine its level. The method provided an alternative prediction model for the early detection of sepsis. The model obtained training, test and accuracy values of 99%.
In 2018, Jyoti Thakur et al. published work that used binary logistic regression to develop and compare two prediction models, one using invasive and one using non-invasive parameters. The data for the study was taken from the Medical Information Mart for Intensive Care (MIMIC) III database. An Android application was developed to calculate the probability of sepsis after manually entering the independent parameter values. It was found that the prediction model developed from non-invasive parameters was as efficient as the prediction model made from invasive parameters.
Manmay Nakhashi et al. used Machine Learning algorithms on Electronic Health Records to help doctors detect the onset of sepsis. A Random Forest based ensemble machine learning technique was applied to patient data, also called vital sign inputs, from the Intensive Care Unit. A combined classifier and early predictor approach was followed: a classifier handles the cases where early prediction is not possible due to lack of data, and an early predictor predicts the occurrence of sepsis from the patient's previous recordings of vital sign inputs.
For building the models we used a dataset of the Electronic Health Records of 40336 patients. The dataset is very sparse and contains a lot of missing values. It contains physiological features and demographics of patients admitted to the Intensive Care Unit.
Table 1. List of features available in dataset.
1. Vital Signs
HR: Heart rate (beats per minute)
O2Sat: Pulse oximetry (%)
Temp: Temperature (Deg C)
SBP: Systolic BP (mm Hg)
MAP: Mean arterial pressure (mm Hg)
DBP: Diastolic BP (mm Hg)
Resp: Respiration rate (breaths per minute)
EtCO2: End tidal carbon dioxide (mm Hg)
2. Laboratory values
BaseExcess: Measure of excess bicarbonate (mmol/L)
HCO3: Bicarbonate (mmol/L)
FiO2: Fraction of inspired oxygen (%)
pH: N/A
PaCO2: Partial pressure of carbon dioxide from arterial blood (mm Hg)
SaO2: Oxygen saturation from arterial blood (%)
AST: Aspartate transaminase (IU/L)
Alkalinephos: Alkaline phosphatase (IU/L)
Calcium (mg/dL)
Chloride (mmol/L)
Creatinine (mg/dL)
Bilirubin_direct: Bilirubin direct (mg/dL)
Glucose: Serum glucose (mg/dL)
Lactate: Lactic acid (mg/dL)
Magnesium (mmol/dL)
Phosphate (mg/dL)
Potassium (mmol/L)
Bilirubin_total: Total bilirubin (mg/dL)
TroponinI: Troponin I (ng/mL)
Hct: Hematocrit (%)
Hgb: Hemoglobin (g/dL)
PTT: Partial thromboplastin time (seconds)
WBC: Leukocyte count (count*10^3/µL)
Fibrinogen (mg/dL)
BUN: Blood urea nitrogen (mg/dL)
The dataset also contains demographic values such as age, gender, hospital admission time and ICU length of stay. Most of the values of the vitals and laboratory measurements were recorded as NAs because measurements are taken only as needed. The NAs were filled with mean values from across the dataset since their presence can skew the predictions.
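The mean filling described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' preprocessing code; the function name `impute_column_means` and the three example records are hypothetical.

```python
import math

def impute_column_means(rows):
    """Replace missing values (None or NaN) with the mean of the observed values in each column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None and not math.isnan(r[j])]
        # A column with no observed value at all stays NaN.
        means.append(sum(observed) / len(observed) if observed else math.nan)
    return [
        [means[j] if (v is None or math.isnan(v)) else v for j, v in enumerate(r)]
        for r in rows
    ]

# Hypothetical records; the columns stand for HR, Temp and Lactate.
records = [
    [80.0, 36.5, None],
    [None, 37.5, 2.0],
    [100.0, None, 4.0],
]
filled = impute_column_means(records)
```

Each missing entry takes the mean of the values observed in its own column, so the column means themselves are unchanged by the filling.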
MACHINE LEARNING ALGORITHMS USED
The Multi Layer Perceptron (MLP) consists of layers of perceptrons that are interconnected by a set of weighted connections. The three types of layers are:
Input Layer: This layer receives input from a source, which can be a database or any device.
Hidden Layer: It receives input only from the input layer or another hidden layer. This layer is hidden from the outside world.
Output Layer: This layer provides the final output and also connects the network to the outside world.
A Feed Forward Multilayer Perceptron has no cycles. Perceptrons of two consecutive layers are fully connected. Signals can be propagated in two directions: function signals are propagated forward, i.e. from the input layer to the output layer through the hidden layers, and error signals are propagated backwards, i.e. from the output layer to the input layer through the hidden layers.
The type of learning used is error correction learning, or supervised learning. The algorithm used is Backpropagation of Error Signals, or the Backprop algorithm. In this algorithm every iteration consists of two passes:
Forward Pass: Every perceptron calculates the weighted linear combination of all its inputs and applies a function, called the activation function, to the result of this summing junction. The output value of the perceptron is calculated on the basis of the result of the activation function.
Backward Pass: The error is calculated with respect to the desired output value for certain patterns at the output layer. This error is propagated backwards through the network, enforcing a correction on the weights of all connections in the network. This technique is based on the observation that all perceptrons in the network share responsibility for the error that has been calculated at the output layer.
The K-Nearest Neighbor (KNN) algorithm was introduced by Aha, Kibler & Albert in 1991. It is an instance-based technique that classifies a new sample according to the classes of the training samples closest to it. It is a conventional non-parametric classifier. There are three main factors which influence the performance of this algorithm:
The distance metric used to locate the nearest neighbors.
The decision rule used to derive a classification from the k nearest neighbors.
The number of neighbors used to categorize the new sample.
Sepsis identification with K-Nearest Neighbor requires a measure of the distance between two data instances. The distances between the test data and each of the training data are measured to decide the final classification output. In KNN, any incoming entry of patient data is compared with the previously obtained points and assigned to the nearest ones; depending on whether the nearest points are septic or non-septic, the new entry is classified accordingly. Three distance functions are commonly used: Euclidean, Manhattan and Minkowski. Euclidean and Manhattan are mostly used for continuous variables, and the Minkowski distance generalizes both (q = 1 gives the Manhattan distance and q = 2 gives the Euclidean distance).
Distance Formulas, where the index i runs over the k coordinates of the points:
Euclidean = $\sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$
Manhattan = $\sum_{i=1}^{k} |x_i - y_i|$
Minkowski = $\left( \sum_{i=1}^{k} |x_i - y_i|^q \right)^{1/q}$
The number of neighbors K is chosen small and odd to break ties (typically 1, 3 or 5). Larger K values can help to reduce the effect of noisy datasets. In the distance formulas, x and y are the coordinates of the two points being compared.
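The three distance functions and the majority vote among the K nearest neighbors can be sketched as below. This is an illustrative plain-Python version rather than the implementation used in the study; the function names and the (HR, Resp) training points are assumptions made for the example.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def minkowski(x, y, q):
    """Generalized distance: q = 1 is Manhattan, q = 2 is Euclidean."""
    return sum(abs(a - b) ** q for a, b in zip(x, y)) ** (1.0 / q)

def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Label the query by majority vote among its k nearest training points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: distance(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (HR, Resp) pairs: label 0 = non-septic, label 1 = septic.
train_points = [(70, 14), (75, 16), (72, 15), (120, 28), (115, 30), (125, 26)]
train_labels = [0, 0, 0, 1, 1, 1]
prediction = knn_predict(train_points, train_labels, (118, 27), k=3)
```

With an odd K and two classes the vote cannot tie, which is the reason for choosing K from values such as 1, 3 or 5.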
Gaussian Naïve Bayes
Naive Bayes classifiers are a collection of classification algorithms based on Bayes Theorem. Naive Bayes is a supervised machine learning algorithm: it uses a training data set with known target classes to predict the class of new instances.
The Naive Bayes technique presupposes that the occurrence or absence of one attribute does not depend on the occurrence or absence of the other attributes. Bayes Theorem is used to estimate membership probabilities for every class, i.e. the probability that a given record or data point belongs to a particular class. The class having the maximum probability is considered the most likely class. This is also known as Maximum A Posteriori (MAP) estimation. The Naive Bayes classifier here is based on conditional probabilities and on the binary classes (septic=1 and non-septic=0).
P(ci | fk) = P(fk | ci) * P(ci) / P(fk)    (1)
P(fk | ci) = P(fk | c1) * P(fk | c2)    (2)
In equation (2), the variable k = 1, 2, ..., n, where n represents the maximum number of features. The variable i = 1, 2, where 1 is for the non-sepsis class and 2 is for the sepsis class.
In equation (1), P(ci | fk) is the probability of class ci given the feature value fk.
In equation (1), P(fk | ci) is the probability of generating feature value fk given class ci; how to calculate it is given in equation (2).
In equation (1), P(ci) and P(fk) are the probability of occurrence of class ci and the probability of feature value fk occurring, respectively.
The binary classification is performed based on Bayesian classification rule.
If P(c1|fk) > P(c2|fk) then the classification is C1. If P(c1|fk) < P(c2|fk) then the classification is C2.
Ci is the target class for classification, in which C1 is the negative class (non sepsis cases) and C2 is the positive class (sepsis cases).
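A Gaussian Naive Bayes classifier following this decision rule can be sketched as below. The sketch assumes the standard formulation, which may differ in detail from the authors' implementation: each feature is modelled by a per-class Gaussian, the per-feature likelihoods are multiplied under the independence assumption, and log probabilities are used for numerical stability. The function names and the (HR, Temp) records are hypothetical; labels follow the text's coding (septic=1, non-septic=0).

```python
import math

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature (mean, variance) for each class."""
    model = {}
    n = len(y)
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        stats = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            # A small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[c] = (len(rows) / n, stats)
    return model

def log_posterior(model, x, c):
    """Unnormalized log P(c | x): log prior plus summed Gaussian log-likelihoods."""
    prior, stats = model[c]
    total = math.log(prior)
    for v, (mean, var) in zip(x, stats):
        total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
    return total

def predict(model, x):
    """Pick the class with the maximum posterior (the MAP class)."""
    return max(model, key=lambda c: log_posterior(model, x, c))

# Hypothetical (HR, Temp) records.
X = [(70, 36.6), (75, 36.8), (72, 37.0), (118, 38.9), (122, 39.2), (115, 38.5)]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
septic_guess = predict(model, (120, 39.0))
healthy_guess = predict(model, (73, 36.7))
```

The denominator P(fk) of equation (1) is the same for both classes, so it can be dropped when only the comparison between the two posteriors matters.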
Extra Tree Classifier
Extra Tree Classifier is a decision tree based learning technique. Similar to Random Forest, it randomizes some decisions and subsets of data to minimize over-learning from the data and overfitting.
Like Random Forest, Extra Tree Classifier builds multiple trees and splits nodes using a random subgroup of features. Unlike Random Forest, it samples without replacement and nodes are split randomly. Every Decision Tree in the Extra Tree Forest is built from the original training sample. At each test node in each tree, a random sample of k features from the feature set is provided. Each decision tree needs to select the best feature to divide the data based on some mathematical criterion (most often the Gini Index). This random sample of features leads to the creation of many de-correlated decision trees.
To perform feature selection using the forest structure, the normalized total reduction in the mathematical criterion is computed for each feature during the construction of the forest. This value is known as the Gini Importance of the feature. The features are then ordered in descending order of Gini Importance, and the user selects the top k features according to choice.
FIG (1) EXTRA TREE CLASSIFIER
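The node-splitting idea behind the Extra Tree Classifier can be sketched for a single node as follows. This is a simplified illustration under stated assumptions (binary 0/1 labels, one uniformly random threshold per sampled feature, Gini reduction as the criterion); it is not a full forest and not the implementation used in the study. All function names and the toy data are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def gini_gain(column, labels, threshold):
    """Reduction in Gini impurity obtained by splitting at `threshold`."""
    left = [l for v, l in zip(column, labels) if v <= threshold]
    right = [l for v, l in zip(column, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

def random_split(X, y, k, rng):
    """Sample k features, draw one random threshold for each, keep the best."""
    best = None
    for j in rng.sample(range(len(X[0])), k):
        col = [row[j] for row in X]
        threshold = rng.uniform(min(col), max(col))
        gain = gini_gain(col, y, threshold)
        if best is None or gain > best[2]:
            best = (j, threshold, gain)
    return best  # (feature index, threshold, Gini reduction)

# Hypothetical node data with three features per sample.
X = [(1.0, 5.0, 0.2), (2.0, 4.0, 0.9), (8.0, 5.5, 0.1), (9.0, 4.5, 0.8)]
y = [0, 0, 1, 1]
feature, threshold, gain = random_split(X, y, k=2, rng=random.Random(0))
```

Summing these Gini reductions per feature over all nodes and trees, and normalizing, gives the Gini Importance used for feature ranking.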
Adaptive Boosting, also called Adaboost, is a Machine Learning algorithm that comes under Ensemble learning. Boosting is an ensemble learning technique, and Adaboost is an iterative one. It was proposed by Yoav Freund and Robert Schapire in 1996. It combines a set of weak learners into a strong learner to increase the accuracy of the model. For Adaboost, the decision tree algorithm is mostly used as the weak learner. These weak learners take into account a single input feature and draw out a single-split decision tree called a decision stump. Each observation is weighted equally while drawing out the first decision stump. The results from the first decision stump are analyzed, after which any observations that are wrongly classified are assigned higher weights. Subsequently, a new decision stump is drawn by considering the observations with higher weights as more significant. If any observations are again misclassified, they are assigned a higher weight, and this process continues until all the observations fall into the right class. Adaboost can be used for classification as well as regression-based problems; however, it is more commonly used for classification.
FIG (2) ADABOOST CLASSIFIER
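One reweighting round of Adaboost can be sketched as below. The sketch assumes the standard discrete Adaboost update, in which the learner's weight is alpha = 0.5 * ln((1 - err) / err); the text does not state which variant was used, and the function name and example labels are hypothetical.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """Return the learner weight alpha and the renormalized sample weights."""
    # Weighted error: total weight of the misclassified observations.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0) and division by zero
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified observations are scaled up, correct ones scaled down.
    updated = [w * math.exp(alpha if t != p else -alpha)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Four equally weighted observations; the first stump gets the third one wrong.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After this round the single misclassified observation carries half of the total weight, so the next decision stump is pushed to classify it correctly.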
Linear Discriminant Analysis
Logistic regression is a linear classification algorithm with its applications in several scenarios. However, it has a few limitations that give rise to the need for alternate linear classification algorithms.
Logistic regression is usually used for two-class or binary classification problems. Though possible, it is rarely extended to multiclass classification.
Logistic regression can become unstable when the classes are well separated.
Logistic regression can become unstable when there are few examples from which to estimate the parameters.
Linear Discriminant Analysis aims to address each of these three points. It is very well suited for multi-class classification problems.
FIG (3) RESULTS AFTER USING LDA ALGORITHM
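This model makes use of Bayes Theorem to estimate the probabilities. The probability of the output class k given the input x can be estimated using the prior probability of each class and the probability of the data belonging to each class:
$P(Y = k \mid X = x) = \frac{\pi_k f_k(x)}{\sum_{l} \pi_l f_l(x)}$
Here $\pi_k$ is the prior probability of class k and $f_k(x)$ is the estimated probability of x belonging to class k. In LDA, a Gaussian distribution function is used for $f_k(x)$. Substituting the Gaussian and simplifying gives what is called the discriminant function:
$D_k(x) = x \cdot \frac{\mu_k}{\sigma^2} - \frac{\mu_k^2}{2\sigma^2} + \ln(\pi_k)$
$D_k(x)$ is evaluated for the input x, and $\mu_k$, $\sigma^2$ and $\pi_k$ are all estimated from the dataset. The class with the largest discriminant value is the predicted class.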
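For a single input feature, the discriminant function D_k(x) = x * (mu_k / sigma^2) - mu_k^2 / (2 * sigma^2) + ln(pi_k) can be sketched as below. The sketch assumes the usual LDA estimates (per-class means, one pooled variance shared by all classes, class frequencies as priors); the function names and the toy values are hypothetical.

```python
import math

def fit_lda_1d(values, labels):
    """Estimate class means, the pooled variance and class priors for one feature."""
    classes = sorted(set(labels))
    n = len(values)
    means, priors = {}, {}
    squared_dev = 0.0
    for k in classes:
        xs = [v for v, l in zip(values, labels) if l == k]
        means[k] = sum(xs) / len(xs)
        priors[k] = len(xs) / n
        squared_dev += sum((v - means[k]) ** 2 for v in xs)
    # One variance shared by all classes, with n - K degrees of freedom.
    variance = squared_dev / (n - len(classes))
    return means, variance, priors

def discriminant(x, mu_k, variance, pi_k):
    """D_k(x) = x * mu_k / sigma^2 - mu_k^2 / (2 * sigma^2) + ln(pi_k)."""
    return x * (mu_k / variance) - (mu_k ** 2 / (2 * variance)) + math.log(pi_k)

def predict_lda(x, means, variance, priors):
    """Return the class whose discriminant value is largest."""
    return max(means, key=lambda k: discriminant(x, means[k], variance, priors[k]))

# Hypothetical one-feature data: class 0 centred at 2, class 1 centred at 8.
values = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
means, variance, priors = fit_lda_1d(values, labels)
low = predict_lda(4.0, means, variance, priors)
high = predict_lda(6.0, means, variance, priors)
```

With equal priors and a shared variance, the decision boundary falls midway between the two class means, here at x = 5.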
In this study, different classifiers were studied to determine which is the most feasible for detecting septic patients. Accuracy, recall score (also called model sensitivity), precision score and F1-score are considered for evaluation. These are calculated from the confusion matrix using the following quantities obtained from it: True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN).
The confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known. The general structure of the confusion matrix is shown below.
FIG (4) STRUCTURE OF CONFUSION MATRIX
Here, the observations that are correctly predicted by the model are the True Positive (TP) and True Negative (TN) values, and hence are highlighted in green.
True Positive(TP): The number of observations that are correctly predicted positive values. In short, the value of the actual class is yes and the value of the predicted class is also yes.
True Negative(TN): This term gives the number of observations that are correctly predicted negative values. In short, the value of actual class and predicted class is no.
False Positive(FP): This term gives the number of observations where the actual class is no and the predicted class is yes.
False Negative(FN): This term gives the number of observations where the actual class is yes but the predicted class is no.
Thus based on the above term parameters like Accuracy, Recall score, Precision score and F1 score are calculated.
Accuracy: The closeness of measurement results to the true value is called accuracy. It is the most intuitive performance measure and is simply the ratio of the correctly predicted observations to the total observations. It is given by the formula:
Accuracy = (TP + TN) / (TP + FP + FN + TN)
Precision: The ratio of correctly predicted positive observations to the total predicted positive observations is known as Precision. High precision corresponds to a low false positive rate, which is good. It is given by the formula: Precision = TP / (TP + FP)
Recall (Sensitivity): It is the ratio of correctly predicted positive observations to all the observations in the actual class, i.e. yes. High recall corresponds to a low false negative rate, which is good. It is given by the formula: Recall = TP / (TP + FN)
F1-Score: It is also called the F-measure and is used to express the performance of a machine learning algorithm (or classifier). It combines the recall score and the precision score of the model as their harmonic mean: F1 = 2 * (Precision * Recall) / (Precision + Recall). A high F-measure indicates high values for both the recall and the precision score. Generally, the F-measure is considered when two or more machine learning algorithms need to be compared, and the algorithm or classifier with the highest F1 score is preferred.
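The four metrics can be computed directly from the confusion matrix counts, as in the sketch below. The function name and the counts are hypothetical and chosen only to illustrate the formulas; they are not results from the study.

```python
def classification_metrics(tn, fp, fn, tp):
    """Accuracy, precision, recall and F1 from the four confusion matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts for 100 test observations.
metrics = classification_metrics(tn=50, fp=10, fn=5, tp=35)
```

On an imbalanced dataset a classifier can reach high accuracy while missing most positive cases, which is why recall and F1 are reported alongside accuracy.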
Given below are the respective models with results obtained:
For Multi Layer Perceptron:
recall score: 0.18036529680365296
precision score: 0.42473118279569894
f1 score: 0.2532051282051282
test-set confusion matrix:[[9158 107] [ 359 79]]
For Naive Bayes:
recall score: 0.01914648212226067
precision score: 0.8924731182795699
f1 score: 0.037488708220415536
test-set confusion matrix:[[1013 20] [8504 166]]
For K-Nearest Neighbour:
recall score: 0.16133333333333333
precision score: 0.6505376344086021
f1 score: 0.2585470085470085
test-set confusion matrix:[[8888 65] [ 629 121]]
For Extra Tree Classifier:
recall score: 0.6777777777777778
precision score: 0.3279569892473118
f1 score: 0.4420289855072464
test-set confusion matrix:[[9488 125] [ 29 61]]
For Adaboost Classifier:
recall score: 0.11450381679389313
precision score: 0.4032258064516129
f1 score: 0.17835909631391203
test-set confusion matrix:[[8937 111] [ 580 75]]
For Linear Discriminant Analysis:
recall score: 0.04627151051625239
precision score: 0.6505376344086021
f1 score: 0.08639771510174937
test-set confusion matrix:[[7023 65] [2494 121]]
Thus, all the above parameters for each and every mentioned algorithm will contribute in evaluating the better performance of the algorithms on the given dataset.
In this study, six machine learning algorithms, namely Multi Layer Perceptron, K-Nearest Neighbor, Naive Bayes, Extra Tree Classifier, Adaboost Classifier and Linear Discriminant Analysis, are created and tested on the balanced dataset obtained by K-fold cross validation and SMOTE sampling. The performance of all these classifiers is measured with the following performance metrics: accuracy, recall score (sensitivity), precision score and F-measure.
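The SMOTE step used for balancing can be sketched as below. This is a simplified plain-Python illustration of the basic idea (a synthetic minority sample is interpolated between a minority point and one of its nearest minority neighbours), not the implementation used in the study; the function name, parameters and minority points are hypothetical.

```python
import random

def smote(minority, n_new, k, rng):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority points to `base` (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class (septic) points with two features each.
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (2.5, 2.5)]
new_points = smote(minority, n_new=6, k=2, rng=random.Random(42))
```

In a K-fold setting, such oversampling is applied to the training folds only, so that the held-out fold still reflects the original class distribution.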
Accuracy Evaluation of all the classifiers is shown below:
FIG (5) COMPARISON BASED ON ACCURACY
Here, all the classifiers are compared with each other based on the accuracy parameter, as shown in the bar graph in Figure (5). From the graph it is concluded that Extra Tree Classifier shows the highest accuracy, followed by Multilayer Perceptron and then Adaboost Classifier. Linear Discriminant Analysis shows average accuracy, while Gaussian Naive Bayes shows the lowest accuracy.
Recall (Model Sensitivity) evaluation :
FIG (6) COMPARISON BASED ON RECALL
Here, all the classifiers are compared with each other based on the recall (model sensitivity) parameter, as shown in the bar graph. From the graph it is concluded that Extra Tree Classifier shows the highest sensitivity of all classifiers. MLP also shows good sensitivity, but less than Extra Tree Classifier. Naive Bayes and Linear Discriminant Analysis show very poor results for sensitivity.
Precision score Evaluation for all the classifiers is shown below:
FIG (7) COMPARISON BASED ON PRECISION SCORE
Here, all the classifiers are compared with each other based on the precision score parameter, as shown in Figure (7). From the graph it is observed that Naive Bayes, which performed lowest in the accuracy and recall score evaluations, shows the highest precision score of all the classifiers. All the remaining classifiers show good performance here.
F1 score evaluation:
FIG (8) COMPARISON BASED ON F1 SCORE
Here, all the classifiers are compared with each other on the basis of F1-scores, as shown in the bar graph in Figure (8). From the graph it is concluded that Extra Tree Classifier shows the highest performance, which also indicates high recall and precision scores. After Extra Tree Classifier, Multi Layer Perceptron shows good performance. Naive Bayes shows the lowest performance here.
The overall performance evaluation for all the Machine Learning algorithms on the same metrics is shown below in Figure (9).
FIG (9) OVERALL PERFORMANCE EVALUATION FOR ALL CLASSIFIERS
CONCLUSION AND FUTURE SCOPE
Sepsis occurs when the body responds to an infection in a very different manner than it usually does. The immune system, which defends the body from germs, releases a lot of chemicals into the blood. This triggers widespread inflammation that can lead to organ damage. In severe cases, sepsis causes a dangerous drop in blood pressure called septic shock, which can quickly lead to failure of organs such as the lungs, kidneys and liver. This can be deadly. The main aim of this study is to identify the model that best identifies septic patients. To do so, a comparative study of various Machine Learning algorithms was conducted to find the model that detects the maximum number of septic patients in the dataset used. The best model was selected using different metrics such as recall, accuracy, precision and F1-score.
Of the six classifiers, Extra Tree Classifier, Multilayer Perceptron and Adaboost Classifier have the highest accuracy scores. The results must therefore be interpreted in consideration of other metrics as well. The other parameter to be considered is the F1-score, which combines the recall score and the precision score. The classifier with the highest F1-score, taken along with its accuracy, is considered the best classifier for sepsis prediction.
As the figure shows, the F1-score for Extra Tree Classifier is the highest among the Machine Learning algorithms compared. The recall score, i.e. model sensitivity, is also highest for Extra Tree Classifier. Its precision score is not the highest, but it is still good.
Therefore, Extra Tree Classifier shows good results on the balanced dataset. The dataset was divided into two separate data sets: a training data set and a testing data set. The training dataset was further subjected to K-Fold cross validation, and the data in every fold was balanced using the SMOTE sampling technique.
The training dataset was used to train the model for normal changes in the parameters. The testing dataset was used to verify the accuracy and effectiveness of the trained model.
Apart from this, Multilayer Perceptron and Adaboost Classifier can also be used for sepsis prediction on large datasets, with execution time varying according to the data variables. Further research should focus on using various algorithms for feature selection to get better prediction results, and on finding methods by which patients can be categorized into different stages of sepsis according to the Sepsis-3 definition.
E. Gultepe, H. Nguyen, T. Albertson, I. Tagkopoulos, A Bayesian network for early diagnosis of sepsis patients: a basis for a clinical decision support system, 2nd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), 2012, pp. 1-5
Umut Kaya, Atınç Yılmaz, Yalım Dikmen, Prediction of Sepsis disease by Artificial Neural Networks, Journal of Selcuk-Technic Special Issue 2018 (ICENTE'18)
Jyoti Thakur, Sharvan Kumar Pahuja, Roop Pahuja, Neonatal Sepsis prediction model for resource-poor developing countries,
Brnić, Mateo & Čondrić, Emanuel & Blažević, Sepsis prediction using Artificial Intelligence algorithms, 2018
Manmay Nakhashi, Anoop Toffy, Achuth P V, Lingaselvan Palanichamy, Vikas C M, Early prediction of Sepsis: Using state-of-the-art Machine Learning techniques on vital sign inputs, Computing in Cardiology Conference, 2019
Aruna Deogire, A low dimensional algorithm for detection of Sepsis from electronic medical record data, 2019
Lakshman Narayanaswamy, Devendra Garg, Bhargavi Narra, Ramkumar Narayanswamy, Machine Learning algorithmic and system level considerations for early prediction of Sepsis, 2019
Mengsha Fu, Jiabin Yuan, Chen Bei, Early Sepsis prediction in ICU trauma patients with using an improved cascade deep forest model, IEEE, 2019
Hsu, Po-Ya & Holtz, Chester, A Comparison of Machine Learning Tools for Early Prediction of Sepsis from ICU Data, 2019
Mehanas Shahul, Pushpalatha K, Machine Learning based analysis of Sepsis : Review, International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), 2020
Varsha Sharma, Chirayata Bhattacharyya, Tanuka Bhattacharjee, Sundeep Khandelwal, Murali Poduval, Anirban Dutta Choudhury, Sepsis prediction using continuous and categorical features on sporadic data, IEEE, 2020
Fleuren, L.M., Klausch, T.L.T., Zwager, C.L. Machine learning for the prediction of sepsis: A systematic review and meta-analysis of diagnostic test accuracy, 2020
Naman Bhandari, (2018 22 October), How does ExtraTreeClassifier reduce the risk of overfitting?, Medium, https://medium.com/@namanbhandari/extratreesclassifier-8e7fc0502c7.
Belagere C., (2020 19 January), India has 2nd highest sepsis death rate in South Asia, The New India Express, https://www.newindianexpress.com/states/karnataka/2020/jan/19/india-has-2nd-highest-sepsis-death-rate-in-south-asia-2091487
Mario R. Camana, Saeed Ahmed, Carla E. Garcia, Insoo Koo, Extremely Randomized Trees-Based Scheme for Stealthy Cyber-Attack Detection in Smart Grid Networks, 2020.