Fetal Health Prediction using Classification Techniques

DOI : 10.17577/IJERTV10IS110182


Naveen Reddy Navuluri

Department of Electronics and Communication, Maulana Azad National Institute of Technology, Bhopal, India

Abstract: A fetus is an unborn offspring, from the embryonic stage until birth. Pregnancy is divided into three-month periods called trimesters, during which the fetus grows and develops and regular checkups are essential. A pregnancy lasts about nine months, and over this period various complications can cause disability or mortality in the newborn, outcomes that should be prevented wherever possible. One of the main tools for assessing the health of the fetus in the womb is cardiotocography (CTG), which records the fetal heart rate and the uterine contractions; the doctor interprets the resulting data to assess fetal health and give an opinion. Manual interpretation leaves room for error, so machine learning and deep learning algorithms have been developed that analyse the CTG data and predict fetal health from it. The aim of this paper is to evaluate the prediction accuracy of different classification models and compare which model performs best.

Keywords: Comparative analysis, machine learning, heart disease prediction, random forest, classification


    Before delivery, direct information about the fetus in the womb is hard to obtain. Because this information is not readily available, the obstetricians who check the condition of the fetus rely on indirect information. The most relied-upon signal is the fetal heart rate (FHR). The cardiotocogram, which combines distinct signals, is the main recording of the fetal heart rate on which obstetricians depend. A limitation of electronic fetal heart monitoring is that variation in how the traces are read hinders accurate communication and timely management: interpretation of FHR patterns shows high intra- and inter-observer variability. There is also a risk that falsely diagnosed fetal distress leads to unnecessary interventions. The aim of this research is therefore to apply machine learning algorithms to classify fetal health from CTG data. A doctor's reading leaves room for error, whereas a prediction algorithm may perform well on this task, help in monitoring the results, and give a more consistent analysis than the doctor could obtain from individual observation alone.


    Several related studies address this topic. In [1], LightGBM and Bayesian models were reported to be successful and to give a small performance gain in prediction. An adaptive neuro-fuzzy inference system (ANFIS) [2] predicted normal and diseased states from CTG data with 97.2 percent and 96.6 percent accuracy, respectively. In [3], a comparison of classifiers found that random forest and XGBoost perform well, but the dataset used is imbalanced, so a modified version is needed to obtain better accuracy. In [4], CTG data is used and found to be useful for identifying abnormalities. That work pairs visual analysis with a decision support system built on machine learning models. Because the individual models did not reach sufficient accuracy, an ensemble model was used, which achieved 99.02% accuracy under 10-fold cross-validation. It can therefore be used to separate normal from pathological cases in CTG data. In [5], an artificial neural network (ANN) and a simple logistic model were evaluated with 10-fold cross-validation. The best accuracies obtained were 98.47 with the ANN and 98.74 with the logistic model, so the logistic model outperformed the ANN by a small margin.



      The data came from the Fetal Health dataset, which has been used in several studies. Because such data is scarce, the only practical way to run the models and produce predictions was to take data from a trusted source, so this dataset was adopted for the study. It is used to predict fetal health from characteristics such as baseline value, accelerations, fetal movement, uterine contractions, light decelerations, and so on. For the machine learning and data visualization steps, a filtering approach is used to choose a subset of the original training data. A glimpse of the features is shown in Figure 1.

      Figure 1: Dataset Overview


      A dataset is made up of patterns or entities. Each data item is described by a set of characteristics that capture its basic features, such as the mass of a physical object or the time at which an event happened. To improve the quality of the dataset, the first step is to remove null values, which pose a problem when evaluating model accuracy. Next, a statistical description of the dataset is generated to give an overview of the data and a head start for the feature engineering process; it is shown in Figure 2.

      Figure 2: Dataset Description
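The cleaning and description steps can be sketched with pandas. This is a minimal illustration rather than the code used in the paper: the small in-memory table stands in for the real CTG data, and the column names (`baseline_value`, `accelerations`, `fetal_health`) and values are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the CTG table; names and values are illustrative assumptions.
df = pd.DataFrame({
    "baseline_value": [120.0, 132.0, np.nan, 134.0, 128.0],
    "accelerations": [0.000, 0.006, 0.003, np.nan, 0.002],
    "fetal_health": [1, 1, 2, 3, 1],
})

# Step 1: count missing values per column, then drop incomplete rows.
missing_per_column = df.isna().sum()
clean = df.dropna().reset_index(drop=True)

# Step 2: summary statistics (count, mean, std, quartiles) for each column.
summary = clean.describe()

print(missing_per_column)
print(summary)
```

Dropping rows is the simplest way to handle nulls; on a larger table with many gaps, imputing values may preserve more data.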


      Feature engineering plays an important role, because everything from the data to the model output depends on it. First, the pie chart in Figure 3 visualises the distribution of the fetal health classes, to judge whether the attribute is significant. Figure 4 then shows the correlation matrix of all features. The correlation matrix gives the relation of each feature or attribute with itself and with every other feature in the dataset.

      Figure 3: Fetal Health pie chart

      Figure 4: Correlation matrix
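The quantities behind a class pie chart and a correlation matrix can be computed as follows. This is a hedged sketch on toy data: the column names and values are assumptions, not taken from the paper's dataset.

```python
import pandas as pd

# Toy data; column names and values are assumptions for illustration.
df = pd.DataFrame({
    "baseline_value": [120, 132, 133, 134, 128, 140],
    "accelerations": [0.000, 0.006, 0.003, 0.003, 0.007, 0.000],
    "fetal_health": [2, 1, 1, 1, 1, 3],
})

# Share of each health class: the numbers a pie chart would display.
class_share = df["fetal_health"].value_counts(normalize=True).sort_index()

# Pearson correlation of every column with every other column.
corr = df.corr()

# Absolute correlation of each feature with the target, strongest first.
target_corr = (
    corr["fetal_health"].drop("fetal_health").abs().sort_values(ascending=False)
)

print(class_share)
print(corr.round(2))
print(target_corr)
```

The diagonal of the matrix is always 1, since every feature is perfectly correlated with itself; the informative entries are the off-diagonal ones.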


      1. RANDOM FOREST CLASSIFIER Random forest is a supervised learning algorithm that trains many decision trees on random samples of the training data and combines their predictions, by majority vote for classification. It is usually more accurate than a single decision tree, but slower at prediction and more complex to construct.
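The idea behind a random forest, many trees trained on resampled data whose votes are combined, can be sketched in plain Python. This is a deliberately simplified illustration: each "tree" is a one-split stump on a single feature, the data is a toy set, and the function names are hypothetical. A real random forest also samples a random subset of features at each split.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def train_stump(rows):
    """Fit a one-split tree: pick the threshold and direction with fewest errors."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        for above in (0, 1):  # label predicted when x > threshold
            errors = sum(
                (above if x > threshold else 1 - above) != y for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x > threshold else 1 - above

def train_forest(rows, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]

def predict(forest, x):
    """Majority vote over all trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy one-feature data: (feature value, class label).
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
        (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
forest = train_forest(data)
print(predict(forest, 0.5), predict(forest, 9.5))
```

Using an odd number of trees avoids tied votes in the two-class case.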


      2. SUPPORT VECTOR MACHINE The support vector machine separates the classes with a hyperplane. The margin is the distance between the hyperplane and the nearest data points, called support vectors, and the algorithm chooses the hyperplane that maximises this margin. It is a widely used algorithm with many direct applications.
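The terms hyperplane, margin and support vector can be made concrete with a short calculation. The sketch below does not train an SVM: the hyperplane is hand-picked rather than learned, and the points are toy values chosen for illustration. It only measures the margin of that hyperplane and finds the points that lie on it.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Toy two-feature points with labels -1 and +1.
points = [((1.0, 1.0), -1), ((0.0, 1.0), -1), ((4.0, 4.0), 1), ((5.0, 5.0), 1)]

# Hand-picked hyperplane x1 + x2 = 5 (an assumption, not a fitted model).
w, b = (1.0, 1.0), -5.0

# label * distance is positive when a point lies on its own side.
distances = [label * signed_distance(w, b, x) for x, label in points]

# The margin is the distance to the closest point; those points are the
# support vectors.
margin = min(distances)
support_vectors = [
    x for (x, _), d in zip(points, distances) if math.isclose(d, margin)
]

print(round(margin, 3), support_vectors)
```

A trained SVM searches over `w` and `b` for the hyperplane that makes this margin as large as possible.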


      3. NAIVE BAYES The Naïve Bayes algorithm is based on Bayes' theorem and assumes that the features are conditionally independent given the class. It is a probabilistic classifier used to predict the class of unseen data, and it is one of the fastest and simplest machine learning algorithms.
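A small categorical Naive Bayes classifier shows how Bayes' theorem turns counts into a class prediction. This is a sketch under stated assumptions: the two features, their values and the labels are invented toy data, the function names are hypothetical, and Laplace (+1) smoothing is added so that unseen values do not produce zero probabilities.

```python
from collections import Counter, defaultdict

def train(rows):
    """Count classes and, per (feature index, class), the feature values."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts

def posterior(features, class_counts, value_counts, n_values=2):
    """P(class | features), treating features as independent given the class."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(features):
            # Smoothed likelihood P(value | class); n_values is the number
            # of distinct values each feature can take (2 in this toy data).
            score *= (value_counts[(i, label)][value] + 1) / (n + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

# Toy rows: ((variability, decelerations), class). Illustrative only.
rows = [
    (("high", "no"), "normal"),
    (("high", "no"), "normal"),
    (("high", "yes"), "normal"),
    (("low", "no"), "normal"),
    (("low", "yes"), "pathological"),
    (("low", "yes"), "pathological"),
    (("high", "yes"), "pathological"),
]
class_counts, value_counts = train(rows)
result = posterior(("low", "yes"), class_counts, value_counts)
print(result)
```

Training only requires counting, which is why the method is fast even on large tables.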

      4. LOGISTIC REGRESSION In its basic form, logistic regression models only two outcomes, for example true or false, 1 or 0. It uses one or more independent variables to estimate the probability of an outcome, and this simplicity makes the algorithm fast to train and apply.
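The way logistic regression maps independent variables to one of two outcomes can be sketched as follows. The weights and bias here are hand-picked, hypothetical values; in practice they are learned from the training data.

```python
import math

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of outcome 1 given the feature values."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict(weights, bias, features, threshold=0.5):
    """Turn the probability into a 0/1 decision."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical coefficients for two features.
weights, bias = [2.0, -1.0], 0.0

print(predict_proba(weights, bias, [1.0, 0.5]))  # z = 1.5
print(predict(weights, bias, [1.0, 0.5]))
print(predict(weights, bias, [0.0, 3.0]))        # z = -3.0
```

For targets with more than two classes, the same idea is extended with one-vs-rest or multinomial formulations.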


The feature engineering results indicate that the health feature is the most important attribute. With that in mind, four classification algorithms were applied: logistic regression, random forest, SVM and Naive Bayes. Logistic regression gives about 99.5 percent accuracy, random forest 98 percent, SVM 96.5 percent and Naive Bayes 97 percent. Logistic regression therefore performs best among the models compared.

Figure 5: Accuracy of all models
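A comparison of this kind can be set up with scikit-learn along the following lines. This is a sketch, not the experiment reported here: the data is synthetic (generated with `make_classification`), and the three-class target, the 75/25 split, the feature scaling and all hyperparameters are assumptions, so the accuracies it prints will not match the figures above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the CTG features and health labels (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Logistic regression and SVM are sensitive to feature scale, so both are
# wrapped in a pipeline that standardises the inputs first.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "naive bayes": GaussianNB(),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Scoring on a held-out split, or with cross-validation, keeps the reported accuracy from being inflated by data the model has already seen.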


    After carrying out all the steps, from data preparation and preprocessing through feature engineering to training the models (SVM, random forest, logistic regression and Naive Bayes), the study concludes that the best-performing model is logistic regression, with 99.5 percent accuracy.


    There is considerable scope for improvement based on the data: modern real-time data could be collected and used to test all of the models and obtain updated accuracy figures. The models built here could also be tested and compared on such new data. Because this data collection would take a long time, data should in the meantime be gathered repeatedly from different sources.


[1] medRxiv 2021.06.03.21255808; doi:


  1. Ocak, Hasan, and Huseyin Metin Ertunc. Prediction of fetal state from the cardiotocogram recordings using adaptive neuro-fuzzy inference systems. Neural Computing and Applications 23.6 (2013): 1583-1589.

  2. Piri, Jayashree & Mohapatra, Puspanjali. (2019). Exploring Fetal Health Status Using an Association Based Classification Approach. 166-171. 10.1109/ICIT48102.2019.00036.

  3. Şahin, Hakan & Subasi, Abdulhamit. (2012). Classification of Fetal State from the Cardiotocogram Recordings using ANN and Simple Logistic.

  4. M. Ramla, S. Sangeetha and S. Nickolas, "Fetal Health State Monitoring Using Decision Tree Classifier from Cardiotocography Measurements," 2018 Second International Conference on Intelligent Computing and Control Systems (ICICCS), 2018, pp. 1799-1803, doi: 10.1109/ICCONS.2018.8663047.

  5. A. H. Khandoker, M. Wahbah, R. Al Sakaji, K. Funamoto, A. Krishnan and Y. Kimura, "Estimating Fetal Age by Fetal Maternal Heart Rate Coupling Parameters," 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2020, pp. 604-607, doi: 10.1109/EMBC44109.2020.9176049.

  6. J. Kolarik, L. Soustek and R. Martinek, "Examination and Optimization of the Fetal Heart Rate Monitor : Evaluation of the effect influencing the measuring system of the Fetal Heart Rate Monitor," 2018 IEEE 20th International Conference on e-Health Networking, Applications and Services (Healthcom), 2018, pp. 1-4, doi: 10.1109/HealthCom.2018.8531168.

  7. M. Wahbah, R. Al Sakaji, K. Funamoto, A. Krishnan, Y. Kimura and A. H. Khandoker, "Estimating Gestational Age From Maternal-Fetal Heart Rate Coupling Parameters," in IEEE Access, vol. 9, pp. 65369-65379, 2021, doi: 10.1109/ACCESS.2021.3074550.

  8. S. Modi and M. H. Bohara, "Facial Emotion Recognition using Convolution Neural Network," 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS), 2021, pp. 1339-1344, doi: 10.1109/ICICCS51141.2021.9432156.

  9. S. Mazumdar, R. Choudhary and A. Swetapadma, "An innovative method for fetal health monitoring based on artificial neural network using cardiotocography measurements," 2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), 2017, pp. 265-268, doi: 10.1109/ICRCICN.2017.8234518.

  10. D. D. Fong et al., "Design and In Vivo Evaluation of a Non-Invasive Transabdominal Fetal Pulse Oximeter," in IEEE Transactions on Biomedical Engineering, vol. 68, no. 1, pp. 256-266, Jan. 2021, doi: 10.1109/TBME.2020.3000977.

  11. V. Shah and S. Modi, "Comparative Analysis of Psychometric Prediction System," 2021 Smart Technologies, Communication and Robotics (STCR), 2021, pp. 1-5, doi: 10.1109/STCR51658.2021.9588950.

  12. M. B. I. Reaz and Lee Sze Wei, "An approach of neural network based fetal ECG extraction," Proceedings. 6th International Workshop on Enterprise Networking and Computing in Healthcare Industry – Healthcom 2004 (IEEE Cat. No.04EX842), 2004, pp. 57-60, doi: 10.1109/HEALTH.2004.1324471.

  13. L. H. Lee and J. A. Noble, "Automatic Determination of the Fetal Cardiac Cycle in Ultrasound Using Spatio-Temporal Neural Networks," 2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI), 2020, pp. 1937-1940, doi: 10.1109/ISBI45749.2020.9098613.

  14. P. Dwivedi, A. A. Khan, S. Mugde and G. Sharma, "Diagnosing the major contributing factors in the classification of the fetal health status using cardiotocography measurements: An AutoML and XAI approach," 2021 13th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), 2021, pp. 1-6, doi: 10.1109/ECAI52376.2021.9515033.

  15. J. Piri and P. Mohapatra, "Exploring Fetal Health Status Using an Association Based Classification Approach," 2019 International Conference on Information Technology (ICIT), 2019, pp. 166-171, doi: 10.1109/ICIT48102.2019.00036
