Comparative Analysis of Classifiers for Polycystic Ovary Syndrome Detection using Various Statistical Measures

Polycystic Ovary Syndrome (PCOS) is a condition that affects girls and women during their child-bearing years and disturbs hormone levels. This disturbance causes problems in many body systems. Women with PCOS have skipped or irregular menstrual periods as well as cyst formation in one or both ovaries. Symptoms of PCOS include irregular periods, excess androgen, polycystic ovaries, abnormal BMI, disturbed hormone levels (LH, FSH, DHEAS) and insulin resistance. According to research studies, however, these symptoms alone are not sufficient for accurate detection on diverse data. This article presents an approach in which PCOS is classified using both physical symptoms and sonograms; only the results for physical symptoms are presented here. Sonogram analysis together with the physical symptoms is needed for accurate detection and for reducing the number of outliers during analysis. Such detection also supports proper treatment and reduces harm to health. The performance of several machine learning algorithms (Multilayer Perceptron, K-star, IB1 instance-based, Locally weighted learning, Decision Table, M5 rules, Zero R, Random Forest and Random Tree) in classifying PCOS is analyzed. Among these algorithms, K-star outperforms the others on all performance measures.
Keywords: Classification, Machine Learning, Polycystic Ovary Syndrome, performance measures, sonography


INTRODUCTION
Many disorders of the female reproductive system can lead to serious health issues later in life. These disorders affect the ovaries, uterus, cervix, vagina and other organs. They are caused by hormonal changes and imbalance, irregular living patterns, stress and similar factors. Polycystic ovary syndrome (PCOS) falls in the category of hyperandrogenism, i.e. excess androgen production by the ovaries. It is commonly found in the reproductive age group (15-40 years). This range is chosen because menstruation begins before age 15, when periods are very likely to be irregular, and menopause begins after age 40. In PCOS the woman's hormone levels are affected [1]. This hormonal imbalance leads to cyst formation on the outer edge of the ovaries. The cysts resemble follicles or small balls of tissue and are found in one or both ovaries of women with PCOS. They are small and harmless. Neither their size nor their number is fixed; they can vary from 2 mm to 9 mm in size [2]. PCOS can further lead to infertility, which is caused by infrequent ovulation, i.e. the ovary failing to release a mature egg. This lowers a woman's chance of conceiving. According to a recent study, 18% of women in East India suffer from PCOS [3]. Women with PCOS produce higher-than-normal amounts of androgen. This imbalance causes skipped or irregular menstrual periods, as well as hirsutism and excessive acne [4]. PCOS can lead to long-term health problems such as diabetes and heart disease. Fig. 1 [5] shows cyst formation in the ovary of a woman with PCOS.
Commonly found symptoms of PCOS are: irregular or missed menstrual periods; excessive hair growth on the face and other unwanted body parts, known as hirsutism; acne and oilier skin due to high androgen levels; and abnormal Body Mass Index leading to obesity. In addition to these physical symptoms, a blood test is needed to check the hormone levels in the body.
Hormone tests check for increased levels of Luteinizing hormone (LH), Follicle-stimulating hormone (FSH), Dehydroepiandrosterone sulfate (DHEAS), fasting blood sugar and fasting insulin. A decreased ratio of fasting blood sugar to fasting insulin indicates insulin resistance [6].
The exact cause of PCOS is not known. Possible causes include genetics, insulin resistance that leads to high testosterone, and hormone imbalance that has negative effects on the whole body [7].
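The ratio of fasting blood sugar to fasting insulin mentioned above can be computed as in the following minimal sketch. The function names, the units (mg/dL for blood sugar, µIU/mL for insulin) and the cutoff value are assumptions introduced for illustration; the paper does not specify them.

```python
def glucose_insulin_ratio(fasting_glucose_mg_dl, fasting_insulin_uiu_ml):
    """Return fasting blood sugar divided by fasting insulin."""
    if fasting_insulin_uiu_ml <= 0:
        raise ValueError("fasting insulin must be positive")
    return fasting_glucose_mg_dl / fasting_insulin_uiu_ml


def is_ratio_decreased(ratio, cutoff=4.5):
    """Flag a ratio below the cutoff.

    The default cutoff is an illustrative assumption, not a value from this paper.
    """
    return ratio < cutoff


# Hypothetical readings, not taken from the dataset.
example_ratio = glucose_insulin_ratio(90.0, 30.0)
print(example_ratio, is_ratio_decreased(example_ratio))
```

Keeping the cutoff as a parameter lets a clinician's preferred threshold be supplied instead of the placeholder used here.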
Various tests need to be conducted to diagnose PCOS. To diagnose PCOS and find other causes of the symptoms, a doctor may ask about the patient's previous medical history and perform a physical exam, a pelvic ultrasound test and some blood tests.
The objective was to characterize early endocrine and metabolic changes in mid-aged women with PCOS and to determine whether the differences between non-obese and obese women are detected early. Obese and non-obese PCOS groups were compared using an F test.
In "Ultrasound features of polycystic ovaries relate to degree of reproductive and metabolic disturbance in polycystic ovary syndrome" [13], Jacob P. Christ, Heidi Vanden Brink, Eric D. Brooks, Roger Pierson, Donna R. Chizen and Marla E. Lujan studied 49 women (aged 19 to 36) diagnosed with PCOS. The menstrual cycle was evaluated, and a physical exam assessed various parameters (height, weight, BMI, blood pressure, etc.). Antral follicle count (AFC), number of follicles per follicle size, ovarian volume (OV), stromal area (SA), ovarian area (OA), stromal to ovarian area ratio (S/A) and stromal index (SI) were studied. Spearman rank correlation was used to relate the different parameters.
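As an illustration of the Spearman rank correlation mentioned above, the following sketch ranks two series (averaging tied ranks) and computes the Pearson correlation of the ranks. It is a generic pure-Python version, not the procedure of the cited study; the function names and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up follicle counts and ovarian volumes, for illustration only.
follicle_count = [12, 18, 25, 30]
ovarian_volume = [6.1, 7.4, 9.8, 11.2]
print(spearman(follicle_count, ovarian_volume))
```

Because the measure depends only on ranks, it captures any monotonic relationship between two parameters, not only linear ones.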

A. Work Flow of Proposed Approach
Patients with metabolic (physical) symptoms such as acne, facial hair growth and irregular periods can be examined in routine practice. However, these metabolic symptoms alone are not sufficient to diagnose PCOS. Therefore, hormone tests such as LH (Luteinizing hormone), FSH (Follicle-stimulating hormone), androgen level, DHEAS, fasting insulin and fasting blood sugar should also be examined. The physical and hormonal symptoms are considered together as the feature set for the proposed system. These features are analyzed statistically with machine learning algorithms, with the aim of selecting an efficient algorithm for the feature dataset. The workflow for PCOS detection is shown in Fig. 2. The mathematical formulation of the proposed system also requires parameters such as age, height, weight, fasting insulin, fasting blood sugar and sonography, in addition to those mentioned above.

B. Dataset used
A dataset for the proposed system is not readily available in public repositories. Therefore, a dataset was created in discussion with a medical practitioner with expertise in PCOS detection. The dataset has 13 attributes and 2 classes, with 40 instances in total. The attributes are symptoms related to PCOS: physical symptoms (age, height, weight, irregular periods, hirsutism, acne), blood test results (LH (Luteinizing hormone), FSH (Follicle-stimulating hormone), androgen level, DHEAS, fasting insulin, fasting blood sugar) and a clinical test (sonography). The class attribute (type) specifies whether the woman has PCOS or not. The most important symptoms (the influencing factors), as highlighted, are weight, irregular periods, acne, LH and sonography. The dataset was validated against various cases of PCOS patients and the opinion of an expert from the medical domain.

IJERTV9IS030404
(This work is licensed under a Creative Commons Attribution 4.0 International License.) www.ijert.org
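One possible way to represent a single instance of the dataset described in this section is sketched below. The 13 attributes and the class follow the list given in the text, but the field names, types, units and the example values are assumptions made for illustration and do not come from the actual dataset.

```python
from dataclasses import dataclass, astuple, fields
from typing import List


@dataclass
class PcosRecord:
    # Physical symptoms
    age_years: float
    height_cm: float
    weight_kg: float
    irregular_periods: bool
    hirsutism: bool
    acne: bool
    # Blood test results
    lh: float
    fsh: float
    androgen: float
    dheas: float
    fasting_insulin: float
    fasting_blood_sugar: float
    # Clinical test
    sonography_positive: bool
    # Class label (type): PCOS present or not
    has_pcos: bool

    def features(self) -> List[float]:
        """Return the 13 attributes as numbers, leaving out the class label."""
        return [float(v) for v in astuple(self)[:-1]]


# Hypothetical instance, not a row of the real dataset.
example = PcosRecord(24, 160.0, 72.0, True, True, True,
                     12.5, 5.0, 80.0, 350.0, 28.0, 95.0, True, True)
print(len(fields(PcosRecord)) - 1, example.features())
```

Encoding the yes/no symptoms as 1.0 and 0.0 gives every instance a fixed-length numeric vector that distance-based classifiers can consume directly.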

C. Results and discussion
The experiments were carried out on the created dataset using well-known machine learning algorithms. The objective of using several algorithms is to identify the most suitable ones for classifying this dataset. The machine learning algorithms Multilayer Perceptron, K star, IB1 instance-based, Locally weighted learning, Decision Table, M5 rules, Zero R, Random Forest and Random Tree were evaluated. From the results, it is observed that on all statistical measures the K star algorithm surpasses the other algorithms, giving good classification accuracy. Other algorithms may outperform it as the dataset size increases. This analysis will be carried out in further work on this research.
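The kind of comparison described above can be sketched for two of the simpler algorithms, Zero R (always predict the majority class) and IB1 (label of the single nearest instance). This is a simplified pure-Python illustration, not the authors' experimental setup: the toy data are made up, leave-one-out evaluation is an assumption, distances are unnormalized, and the RMSE is computed on hard 0/1 predictions.

```python
import math
from collections import Counter


def zero_r_predict(train_X, train_y, x):
    """Zero R: ignore the features and predict the majority training class."""
    return Counter(train_y).most_common(1)[0][0]


def ib1_predict(train_X, train_y, x):
    """IB1: predict the label of the nearest training instance (Euclidean)."""
    distances = [math.dist(row, x) for row in train_X]
    return train_y[distances.index(min(distances))]


def leave_one_out(X, y, predict):
    """Hold out each instance once; return (accuracy, RMSE) of predictions."""
    correct = 0
    squared_error = 0.0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        pred = predict(train_X, train_y, X[i])
        correct += int(pred == y[i])
        squared_error += (pred - y[i]) ** 2
    n = len(X)
    return correct / n, math.sqrt(squared_error / n)


# Made-up two-feature toy data: class 0 = no PCOS, class 1 = PCOS.
TOY_X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [3.0, 3.0], [3.1, 2.9]]
TOY_Y = [0, 0, 0, 0, 1, 1]

for name, model in [("Zero R", zero_r_predict), ("IB1", ib1_predict)]:
    accuracy, rmse = leave_one_out(TOY_X, TOY_Y, model)
    print(name, round(accuracy, 3), round(rmse, 3))
```

Zero R serves as a baseline: any useful classifier should reach a higher accuracy and a lower RMSE than it on the same data.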

IV. CONCLUSION
In this paper, the classification algorithms Multilayer Perceptron, K star, IB1 instance-based, Locally weighted learning, Decision Table, M5 rules, Zero R, Random Forest and Random Tree are used to detect whether a patient has PCOS or not. Classification techniques are considered in this study because they make it possible to predict whether the patient has Polycystic Ovary Syndrome based on the symptoms provided by the doctor or medical centre. The machine learning model is developed using real-time data. The Root Mean Squared Error of the K star algorithm is the lowest among the compared algorithms. These models can help doctors recognize the disease much faster, so that early treatment can be given to the patient.