 Open Access
 Total Downloads : 1061
 Authors : Ravi K Jade, L. K. Verma, Kesari Verma
 Paper ID : IJERTV2IS2467
 Volume & Issue : Volume 02, Issue 02 (February 2013)
 Published (First Online): 28022013
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Intelligent Data Mining Techniques For Coal Mining Data
Ravi K Jade#1 , L. K. Verma*2, Kesari Verma#3
#1Department of Mining Engineering National Institute of Technology Raipur
*2Department of Computer Applications Raipur Institute of Technology Raipur
#3 Department of Computer Applications National Institute of Technology Raipur
Abstract
Classification is an important problem in data mining. Given a database of records, each with a class label, a classifier generates a concise and meaningful description for each class that can be used to classify future records whose classes are unknown. A number of popular classifier exists like naÃ¯ve bays classifier, Neural Network, SVM classifier, CARD etc. In this paper we applied classification algorithm for coal mines dataset. We also discussed decision tree, nearest neighbor classifier, NN classifier for prediction of class label.
Keywords: Data Mining algorithm, C4.5 algorithm, NaÃ¯ve Bayes Algorithm, Intelligent Data mining for coal data
Category and Subject Descriptors:
H.2.8 [Information System] : Database Management – database application, data Mining.

Introduction
Classification is an important problem in data mining. Under the guise of supervised learning, classification has been studied extensively by the machine learning community as a possible solution to the knowledge acquisition or knowledge extraction problem. The input to the classifier construction is a training set of records, each of which is tagged with a class label. A set of attribute values defines each record. Attributes with discrete domains are referred to as categorical, while those with ordered domains are referred to as numeric. The goal is to induce a concise model or description for each class in terms of the attributes. The model is then used by the classifier to
classify (i.e., assign class labels to) future records whose classes are unknown [10]. It thus, reduces outlier problem. The classification model of dataset is shown in figure 1.
Fig . Data Mining process
Coal mine dataset is shown in Table I that comprises of Spacing(M), Burden(M), Depth (M), Stemming(M). Second part Types of Explosive that consist of Booster(Kg), Primer(Kg), Boulders (Nos.).
TABLE I.
COAL MINE DATASET
Spac
Burd.
Depth
Stemm
Boost.
Prim.
SME
Bould ers (Nos.)
5
3.25
6.25
3
10.5
0
4600
1340
5.75
3.5
7
3.5
15.3
294
7500
420
5.75
3.5
7
3.5
10.5
0
5700
1200
6.25
4
8.25
3.5
35.2
0
15000
625
5
3.25
6.25
4.0
13.2
0
9000
1075

Data Acquisition process
The data is acquired from coal mines where the blasting process for mines are performed. The spacing(M) denotes the space between holes. The class label Boulder denotes after the blasting the size of boulder produced in blasting.

Data Preprocessing
Data preprocessing is an often neglected but important step in the data mining process. The phrase Garbage in, Garbage Out is applicable in data mining and machine learning. Data gathering methods are loosely controlled, resulting in out of range values, impossible data combination, missing values. Analyzing data that has not been carefully screen for such problems can produce misleading results. The representation and quality of data is first and foremost before running an analysis. In our dataset we filled the missing value by mean of whole data set of specified attributes. Some data values the range is given, we computed the mean value and it is replaced by mean of the data e.g. for stemming(M ) the range is given 3.5 to3.75 that data is replace by 3.625.

Data Transformation
In data transformation, the data are transformed or consolidated into forms appropriate for mining. Data transformation process are follows.
Normalization where the attribute data are scaled in the range such as 1.0 to +1.0.

Feature Selection

The main idea of feature selection is to choose a subset of input variables by eliminating features with little or no predictive information. Feature selection [2] can significantly improve the comprehensibility of the resulting classifier models and often build a model that generalizes better to unseen points. In machine learning and statistics, feature selection [2] also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features for use in model construction. The central assumption when using a feature selection technique is that the data contains many redundant or irrelevant features. Redundant features are those which provide no more information than the currently selected features, and irrelevant features provide no useful information in any context.

Classification
There are various classifier to classify the data like NN, SVM, Decision tree and others.

Nearest Neighbour
Nearest neighbour (NN) [11] also known as Closest Point Search is a mechanism that is used to identify the unknown data point based on the nearest neighbour whose value is already known. It has got a wide variety of applications in various fields such as Pattern recognition, mage databases, Internet marketing, Cluster analysis etc.
The NN algorithm can also be adapted for use in estimating continuous variables. One such implementation uses an inverse distance weighted average of the knearest multivariate neighbors. This algorithm functions as follows:

Compute Euclidean or Mahalanobis distance from target plot to those that were sampled.

Order samples taking for account calculated distances.

Choose heuristically optimal k nearest neighbor based on RMSE done by cross validation technique.


Decision Tree
Decision Tree Classifier [4]A decision tree is a class discriminator that recursively partitions the training set until each partition consist entirely or dominantly of examples from one class. Each leaf node of the tree contains one or more attributes and determines how the data is partitioned.
A decision tree classifier is built in two phases [4] growing phase followed by pruning phase. In growth phase the tree is built by recursively partitioning the
data until each partition is either pure or sufficiently small.
The algorithm for building tree [5]. Procedure buildTree (S)

Initialize root node using dataset S

Initialize queue Q to contain root node

While Q is not empty do {

dequeue the first node N in Q

if N is not pure {

for each attribute A

Evaluate splts on attributes A

Use best split to split node N into N1 and N2

Append N1 and N2 to Q
10) }
11) }


Bayesian Network
Bayesian network (BN) is also called belief networks. A BN is a graphical representation of probability distribution. It belongs to the family of probabilistic graphical models .This BN consist of two components. First component is mainly a directed acyclic graph (DAG) in which the nodes in the graph are called the random variables and the edges between the nodes or random variables represents the probabilistic dependencies among the corresponding random variables. Second component is a set of parameters that describe the conditional probability of each variable given its parents. The conditional dependencies in the graph are estimated by statistical and computational methods [6], [7].
This prior expertise about the structure of Bayesian network algorithm work as follows.

Declare that a node is root node.

Declare that a node is leaf node.

Declaring that a node has direct effect of another noe.

Declaring that a node is not directly connected to another node.

Declaring that two nodes are independent, giving a condition set.

Providing partial ordering among the nodes.


Artificial Neural Network
Artificial Neural Network (ANN) is a computational model based on biological neural network. ANN also called Neural Network [ANN]. It contains interconnected group of artificial neurons and processes the information by a connectionist approach. ANN is an adaptive system because it changes its structure based on information flow during the learning phase.
Fig. 2. Architecture of ANN
Actual algorithm for a 3layer network (only one hidden layer) [8]:
Initialize the weights in the network (often randomly) Do
For each example e in the training set
O = neuralnetoutput(network, e) ; forward pass t = teacher output for e

Calculate output ,

Calculate error (T – O) at the output units

Compute delta_wh for all weights from hidden layer to output layer ; backward pass

Compute delta_wi for all weights from input layer to hidden layer ; backward pass continued.

Update the weights in the network
The different classification techniques and their comparative charts is shown in Table 1.
TABLE I.
DIFFERENT CLASSIFICATION TECHNIQUES
Method
Generative
/Discriminat ive
Loss Functions
Parameter estimation Algorithm
KNearest Neighbour
Discriminative
log P(X,Y) or Zero one
All data are unsuperised
Decision tree
Discriminative
Zero One loss
C4.5
Bayesian Network
Generative
log O(X,Y)
Variable Elimination
Neural Network
Discriminative
SumSquared Error
Forward Propagation



Experimental Result
The experiments were perform in weka machine learning software [12] in pentium IV machine with 1 GB RAM. No other application are running while performance computation. The dataset contain 47 instances, 6 attributes i.e. Number_of_holes, Burden, Depth, Primer, SME and Class. In test mode:10fold crossvalidation are perform. All the numeric data is converted to categorical data all <50 values denotes Effective Boulders and remaining are converted as NotEffective Boulders. The data is taken from Jan December 2009. The experimental results their accuracy shown in Table II,III,IV and V.
Table II.
Result with Decision Tree Model
In this paper we applied data mining process in coal mining data. The mining process start with data preparation is an important issue for data mining, as real world data tends to be incomplete, noisy and inconsistent. Data preparation includes data cleaning, data integration, data transformation and data reduction. In order to mine the effective size of Boulders (Nos.) the attribute effective are analysed. Decision tree model gives 87% of accuracy to correctly predict the class label where as Neural Network model predict 84% correct class label. In future we try to implement support vector machine classifier in order to improve the accuracy of classifier.

References
Correctly 
41 
87.234 % 
[1] YongSeog Kim, W. Nick Street, and Filippo Menczer. 
Classified Instances 
Feature Selection in Data Mining. 

Incorrectly 
6 
12.766 % 
[2]. http://en.wikipedia.org/wiki/Feature_selection 
Classified Instances 
[3]. Ms. Aparna Raj, Mrs. Bincy G,, Mrs. T.Mathu. Surveyon  
Mean absolute error Root mean squared 
0.2365 Common Data Mining Classification Techniques. 0.337 April 2012 

error 
Morgan Kaufman, 1993. [4] J. Ross Quinlan. C4.5: Programs 
Correctly 
41 
87.234 % 
[1] YongSeog Kim, W. Nick Street, and Filippo Menczer. 
Classified Instances 
Feature Selection in Data Mining. 

Incorrectly 
6 
12.766 % 
[2]. http://en.wikipedia.org/wiki/Feature_selection 
Classified Instances 
[3]. Ms. Aparna Raj, Mrs. Bincy G,, Mrs. T.Mathu. Surveyon  
Mean absolute error Root mean squared 
0.2365 Common Data Mining Classification Techniques. 0.337 April 2012 

error 
Morgan Kaufman, 1993. [4] J. Ross Quinlan. C4.5: Programs 
International Journal of Wisdom Based Computing, Vol. 2(1),

J. Ross Quinlan. C4.5: Programs for Machine Learning.
Table III
Confusion Matrix for Decision tree
Effective
Effective 0
Not Effective 6
Not
0
41
Effective
Table IV
Result with Perceptron NN Model
for Machine Learning. Morgan Kaufman, 1993.

Keshri Verma, O.P. Vyas. An Approach for Decision tree classifier for Temporal data, Science Journal of Pt. Ravishankar Shukla University Raipur.,2004 ,ISSN NO 0170 5910 Vol. 18, No B (Science) 2005, pp 0722.

Charniak, E. 1991, .Bayesian Networks without tears. AI Magazine, Winter 1991.
Correctly 
40 
85.1064 
[9] http://en.wikipedia.org/wiki/Backpropagation 
Classified Instances 
[10] Rajeev and Kyuseok Shim. PUBLIC: A Decision Tree  
Incorrectly 
7 
14.8936 
Classifier that Integrates Building and Pruning". In 
Correctly 
40 
85.1064 
[9] http://en.wikipedia.org/wiki/Backpropagation 
lassified Instances 
[10] Rajeev and Kyuseok Shim. PUBLIC: A Decision Tree  
Incorrectly 
7 
14.8936 
Classifier that Integrates Building and Pruning". In 
[8] Articial Neural Networks Ajith Abraham Oklahoma State University, Stillwater, OK, USA.
Classified Instances Mean absolute error
Root mean squared error
Table V
0.1568
0.345
Proceedings of the 24th International Conference on Very Large Data Bases, pages 404{415,New York, USA, August 1998.

A. Djouadi and E. Bouktached, A Fast Algorithm for the Nearest Neighbor Classifier, IEEE Transaction Pattern Analysis and Machine Inteeligence, Vol. 19 no. 3 pp. 277 282,1997.
Confusion Matrix
Effective Not Effective
Effective 0 4

http://www.cs.waikato.ac.nz/ml/weka/
Not Effective
5. Conclusion
3 38