Future Perception In Public Health Care Using Data Mining
Dr. Surendra Kumar Yadav
Associate professor, JECRC University, Jaipur
Nitesh Dugar
Research scholar in Software Engineering, JECRC, Jaipur
Aditi Jain
Research scholar in Software Engineering, JECRC, Jaipur
Healthcare information is diverse in scope and so vast in volume that traditional, routine analytical methods reveal very little of what could be concluded from it. Modern data mining techniques can be applied to this data to extract otherwise hidden or unknown facets of knowledge that may be of vital importance to the therapeutic, commercial and preventive aspects of healthcare. This paper surveys mining concepts in health care, the necessity of data mining in the medical field, the algorithms used and their applications in various health-care domains.
Data mining is the exploration of large datasets to extract hidden and previously unknown patterns, relationships and knowledge that are difficult to detect with traditional statistics [1]. It thus extracts helpful knowledge from huge datasets and presents it in a human-understandable form. Data mining divides its work into two kinds of task: predictive and descriptive. Predictive tasks predict the value of a specific attribute; classification, regression and deviation detection fall under predictive tasks. Descriptive tasks derive patterns that summarize the relationships in the data [2]. Data mining expertise provides a user-oriented approach to new and unknown patterns in the data [3].
By applying data mining tools, such as spreadsheet-style programs that analyze data to identify patterns and relations, user profiling and the development of business strategies can begin [4]. Health-care data mining involves a few steps, from collecting patients' raw data to discovering new knowledge from it.
Discovering knowledge from data is an iterative process consisting of data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation and knowledge representation. Fig. 1 shows how data mining extracts knowledge from a given dataset.
Figure 1. Knowledge discovery process
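The steps of the knowledge discovery process can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields, the invented values, the glucose threshold and the helper names (clean, integrate, select, transform, mine) do not come from the paper, and the "mining" step is deliberately trivial. Printing the result stands in for pattern evaluation and knowledge representation.

```python
from statistics import mean

# Hypothetical raw records from two source systems (all values invented).
lab_records = [
    {"patient_id": 1, "glucose": 92.0},
    {"patient_id": 2, "glucose": None},   # missing value, removed by cleaning
    {"patient_id": 3, "glucose": 181.0},
]
ward_records = [
    {"patient_id": 1, "age": 34, "ward": "A"},
    {"patient_id": 2, "age": 61, "ward": "B"},
    {"patient_id": 3, "age": 58, "ward": "B"},
]

def clean(records):
    """Data cleaning: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(left, right, key="patient_id"):
    """Data integration: join two sources on a shared key."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

def select(records, fields):
    """Data selection: keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records, threshold=126.0):
    """Data transformation: discretise glucose into a flag (threshold is illustrative)."""
    return [{**r, "high_glucose": r["glucose"] >= threshold} for r in records]

def mine(records):
    """Data mining: a trivial pattern, the mean age per glucose group."""
    groups = {}
    for r in records:
        groups.setdefault(r["high_glucose"], []).append(r["age"])
    return {flag: mean(ages) for flag, ages in groups.items()}

def run_pipeline():
    data = clean(lab_records)
    data = integrate(data, ward_records)
    data = select(data, ["patient_id", "age", "glucose"])
    data = transform(data)
    return mine(data)

patterns = run_pipeline()
print(patterns)  # presentation of the discovered pattern
```

Because the process is iterative, in practice the output of the last step would feed back into earlier ones, for example by changing which attributes are selected.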
In recent years, medical departments have been frustrated by problems of overloading, long processing times, delayed patient treatment and high cost. These problems are caused by several internal and external factors, including patient characteristics, staffing patterns of the medical department, access to health care providers, patient arrival time, management practices, and the testing and treatment strategies selected by the medical department [5]. The first step towards improving patient care in hospital departments is to understand these issues. Today most hospitals have Hospital Information Systems (HIS) to manage patient data. These systems generate huge amounts of data in the form of images, charts, text and numbers. Researchers in the medical field identify and predict diseases with the aid of data mining techniques [6].
Basic Mining Concepts in Health-Care
In medical research, data mining starts with a hypothesis and then the results are adjusted to fit the hypothesis [7]. Hence, data mining describes patterns but does not explain the trends in those patterns.
Necessity of Data Mining in Health-Care
With the help of data mining we can improve public health and the health care of system users while reducing costs and saving time and money [4]. It also supports preparing Health Care Information System (HMIS) reports on the usage of hospital capacity, such as the number of occupied and vacant beds and the number of patients versus doctors and nurses. By combining data mining with a geographical information system (GIS) we can discover disease clusters around specific locations, which leads to better policy making for detecting and managing disease outbreaks.
Knowledge discovery in databases (KDD) and data mining are combined to discover fraud in credit card transactions and insurance claims in huge databases. Data mining therefore allows organizations and institutions to get more out of existing data at minimal extra cost [7]. In the case of adverse drug events, some drugs and chemicals that have been approved as non-harmful to humans are later discovered to have harmful effects after long-term public use, but data mining algorithms such as the Multi-item Gamma Poisson Shrinker (MGPS) can discover drug side effects [8]. The basic objectives of data mining in healthcare are to decrease the mortality rate, detect diseases early, help in diagnosis and detect fraudulent health claims.
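To make the idea of mining drug side effects from reports more concrete, the following Python sketch computes a plain observed-to-expected ratio for each drug and event pair. This is not MGPS itself: MGPS applies empirical Bayes shrinkage to such ratios so that pairs with very few reports are not overstated, and that step is omitted here. The report data and the function name relative_reporting_ratios are invented for illustration.

```python
from collections import Counter

# Hypothetical spontaneous reports as (drug, event) pairs; all data invented.
reports = [
    ("drug_a", "rash"), ("drug_a", "rash"), ("drug_a", "nausea"),
    ("drug_b", "nausea"), ("drug_b", "nausea"), ("drug_b", "headache"),
    ("drug_c", "headache"), ("drug_c", "nausea"),
]

def relative_reporting_ratios(pairs):
    """Observed count of each (drug, event) pair divided by the count
    expected if drugs and events were reported independently."""
    pair_counts = Counter(pairs)
    drug_counts = Counter(drug for drug, _ in pairs)
    event_counts = Counter(event for _, event in pairs)
    total = len(pairs)
    ratios = {}
    for (drug, event), observed in pair_counts.items():
        expected = drug_counts[drug] * event_counts[event] / total
        ratios[(drug, event)] = observed / expected
    return ratios

ratios = relative_reporting_ratios(reports)
# List the pairs from most to least over-represented.
for pair, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(ratio, 2))
```

A ratio well above 1 marks a pair that is reported more often than independence would predict. With small counts such ratios are unstable, which is the problem that shrinkage methods are designed to address.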
Basic Health-care Data Mining Tasks
Initially, data is extracted, transformed and loaded from the transaction database into a data warehouse, as shown in Fig. 2. Once the data is available in electronic form, the following tasks are performed:
Figure 2. Basic concept of data mining
Health-Care Business Understanding:
This phase concentrates on understanding the health-care project's objectives and its various requirements.
Medical-Data Understanding:
This phase starts with the collection of health-care data from laboratories, the operation theater, the blood bank, the drug store, therapy modules, etc. It also focuses on understanding patients' data in order to discover knowledge from it and generate HMIS reports.
Medical-Data Preparation:
This phase constructs the final data set to feed into the modelling tools, and it is an iterative process. Here various database artifacts such as attributes, tables and records are selected, transformed and cleaned for the modelling tools.
Modelling:
This phase applies various modelling techniques, such as Naive Bayes, artificial neural networks, decision trees, time series algorithms, clustering algorithms and sequence clustering algorithms, to generate optimal values.
Evaluation:
In this stage the model is thoroughly evaluated and reviewed to check whether the applied algorithms discover the proper hidden patterns. This stage also checks that the mined data can be accessed quickly.
Deployment:
In this phase the customer usually carries out deployment through a deployment wizard, with the help of an analyst, to generate HMIS reports. To generate useful knowledge from the data, i.e. reports, the data mining process can be repeated.
Figure 3. Health care business cycle
Common Data Mining Algorithms in the Health-care Domain
Decision Tree Algorithm
A decision tree is a flowchart-like tree structure [9] that represents rules: each internal node denotes a test on an attribute value, each branch denotes an outcome of the test, and each leaf represents a class [10]. It recursively partitions a data set of records using a depth-first greedy approach until all the data items in each partition belong to a single class [11]. The decision tree algorithm has two phases: tree building and tree pruning. Tree building partitions the data set top-down, while tree pruning works bottom-up to improve the prediction and classification accuracy of the algorithm by minimizing overfitting [12]. Decision trees make it easy to extract and display rules, require relatively little computation, reveal the important decision attributes and achieve high classification precision [13]. They can be used in predictive modeling for both discrete and continuous attributes.
Adverse Drug Reactions [14]
Breast Feeding [15]
Clinical Practice in Psychiatry [16]
Diagnosis of Myocardial Infarction [17]
Literature Review of Meta-analysis [18]
Measles Detection [19]
Orthopaedic (Fracture Data) [20]
Physiologic Data Signals [21]
To Examine Mental Health Care [22]
Detect Patterns
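The tree-building phase of the decision tree algorithm can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any cited work: it uses Gini impurity as the split criterion (the paper does not name one), handles only numeric attributes, and omits the pruning phase. The patient data (age, systolic blood pressure) and the risk labels are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Greedy step: return the (feature, threshold) that most reduces
    weighted Gini impurity, or None if no split improves on the parent."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build(rows, labels):
    """Top-down, depth-first recursive partitioning."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build(*zip(*left)),
        "right": build(*zip(*right)),
    }

def predict(tree, row):
    """Follow the tests from the root down to a leaf."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical training data: (age, systolic blood pressure) -> risk class.
rows = [(25, 118), (32, 121), (47, 150), (51, 160), (62, 135), (38, 155)]
labels = ["low", "low", "high", "high", "low", "high"]

tree = build(rows, labels)
print(tree)
print(predict(tree, (40, 170)))
```

Each dictionary node corresponds to a test on an attribute, its two entries to the branches, and each string to a leaf class, which mirrors the structure described in the text.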
Artificial Neural Network
A neural network is a parallel processing network modelled on biological neural networks and the features of neurons (computing elements); it is built by simulating the intuitive, image-based thinking of humans [23]. Each neuron receives a number of input signals and performs a simple operation on this set of inputs. The output of each neuron is fanned out to the inputs of other neurons [24]. A neural network can be used to model complex relationships between inputs and outputs or to find patterns in data [25]. It uses non-linear mapping, parallel processing and the network structure itself to express the knowledge that associates inputs with outputs [26]. Two approaches to using neural networks in data mining are the self-organizing neural network, which learns without a teacher, and the fuzzy neural network, which is used to increase the capacity of the output system and make the system more stable. Neural networks can be used for both supervised and unsupervised learning applications. They have a high tolerance for noisy data and high accuracy, which makes them preferable in data mining [27].
Analysis of side drug effects [29]
Breast cancer [30]
Cervical cancer [31]
Classification of Blood Cells [32]
Health-Care image processing [33]
Statistical Models for Breast Cancer [34]
To analyse ECG waveforms [35]
To Classify Retina damage [36]
Tumours [37]
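The way a neuron combines its inputs and passes its output on to other neurons can be shown with a small forward pass in Python. The sketch assumes that the "simple operation" is a weighted sum followed by a sigmoid activation, which is a common choice but is not specified in the paper. The layer sizes, weights, biases and input values are arbitrary illustrative numbers, and no training is performed.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs followed by a sigmoid (non-linear) activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """Every neuron in a layer receives the same input signals."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, network):
    """Fan the outputs of each layer out to the inputs of the next one."""
    activations = inputs
    for weight_rows, biases in network:
        activations = layer(activations, weight_rows, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.4, -0.6], [0.3, 0.9]], [0.1, -0.2]),  # hidden layer weights and biases
    ([[1.2, -0.7]], [0.05]),                   # output layer weights and biases
]

output = forward([0.5, 0.8], network)
print(output)
```

In a supervised setting the weights would be adjusted from labelled examples rather than fixed by hand as they are here.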
Genetic Algorithm
A genetic algorithm performs a global search that favours the fittest candidates in each generation and runs with relatively low time complexity. It is an adaptive heuristic search algorithm premised on the evolutionary ideas of natural selection and genetics [38]. It uses a number of artificial individuals that explore a complex search space by means of selection, crossover and mutation [39]. The algorithm creates a number of random solutions, of which only a few are close to optimal. The poor solutions are discarded. The good solutions are then hybridized, and the same process is repeated to retain only the best solutions. All the best solutions are then combined to obtain a universal solution.
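The loop of selection, crossover and mutation can be sketched in Python on a toy problem. The problem (maximising the number of 1 bits in a bit string), the parameter values and the use of elitism are assumptions chosen for illustration and are not taken from the paper. The sketch also simply returns the single fittest individual instead of combining several best solutions.

```python
import random

def fitness(bits):
    """Toy objective: the number of 1 bits in the individual."""
    return sum(bits)

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Start from a population of random solutions.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # selection: discard the poorer half
        children = [population[0][:]]           # elitism: keep the current best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]           # crossover: hybridise two parents
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]          # mutation: occasional random bit flips
            children.append(child)
        population = children
    return max(population, key=fitness), best_history

best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the best fitness recorded in the history never decreases from one generation to the next.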
Nearest Neighbor method
The k-nearest-neighbor algorithm is used for classification when a training data set is given. For a given input, the algorithm searches the training data for its k nearest neighbors within a certain distance and then assigns the input the best-fitting class among them. It can be a costly algorithm when the training data is extremely large, because the whole data set is scanned to find the nearest neighbors. There is no learning process that creates a model for this algorithm; the data used by the algorithm is itself the model [4].
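A minimal Python sketch of k-nearest-neighbor classification shows why the training data itself serves as the model. It assumes Euclidean distance and a simple majority vote, neither of which is specified in the paper, and it does not apply a maximum-distance cut-off. The two-feature measurements and the class labels are invented. It relies on math.dist, which requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Scan the whole training set, keep the k closest records and
    return the majority class among them."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data: two normalised measurements per patient.
training = [
    ((1.0, 1.0), "healthy"), ((1.2, 0.8), "healthy"), ((0.9, 1.1), "healthy"),
    ((3.0, 3.2), "at_risk"), ((3.1, 2.9), "at_risk"), ((2.8, 3.0), "at_risk"),
]

print(knn_classify(training, (1.1, 1.0)))
print(knn_classify(training, (3.0, 3.0)))
```

Every query sorts the full training set, which illustrates the cost noted in the text for very large data sets.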
