Open Access
Total Downloads: 333
Authors: Yasobanta Karali, D. Kodamasingh, R. B. Lyngdoh, H. S. Behera
Paper ID: IJERTV2IS100105
Volume & Issue: Volume 02, Issue 10 (October 2013)
Published (First Online): 05-10-2013
ISSN (Online): 2278-0181
Publisher Name: IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License
Hard and Fuzzy Clustering Algorithms Using Normal Distribution of Data Points: A Comparative Performance Analysis
Yasobanta Karali, D. Kodamasingh, R. B. Lyngdoh, H. S. Behera, Department of Computer Science and Engineering,
Veer Surendra Sai University of Technology (VSSUT), Burla, Sambalpur, Orissa, India
Abstract: With its huge range of applications, data clustering, one of the main tasks in data mining, is growing in popularity day by day. In simple words, it can be defined as the process of organizing similar data objects into groups. It is considered an unsupervised learning method, since it yields results without any prior external supervision. A clustering technique organizes a given data set into several groups such that the similarity within a group is greater than the similarity between groups. In this paper, a review of different hard and fuzzy clustering techniques is presented: K-means, K-medoid, fuzzy C-means, Gustafson-Kessel, and Gath-Geva. The techniques are implemented and tested against a normally distributed motorcycle dataset, and their performance in terms of speed and accuracy, along with their validation parameters, is compared.
Index Terms: data clustering, K-means, K-medoid, fuzzy C-means, Gustafson-Kessel, Gath-Geva.

INTRODUCTION
It is remarkable how this particular field has found its way into numerous applications such as image processing, biological research, web documentation and many more. Data clustering is well known for its interesting approach of finding similarities between data and organizing them into groups. However, it is not only about organizing data into groups; clustering is also used for data compression, model induction and the extraction of meaningful information. As mentioned earlier, clustering is nothing but the partition of a given data set into clusters such that there is larger similarity between data in the same group than between data in different groups. Clustering in its approach reflects very much our daily lives: for example, we often group books of similar sizes together on one side of our shelves, and household utensils are kept sorted by kind in our cupboards. If that were all, clustering would look very simple and predictable. But in a digital world things do not stay that easy; data tend to accumulate and multiply rapidly over time, which leads to the challenge of how to organize them. This is where soft computing methods become useful, helping us cluster such huge volumes of data. In this paper, a brief review is done on the performance of various hard as well as fuzzy clustering algorithms: hard algorithms such as K-means and K-medoid, and fuzzy algorithms such as fuzzy C-means, GK (Gustafson-Kessel) and GG (Gath-Geva). These five techniques are implemented and their performance is comprehensively compared.

RELATED WORKS
There has been a tremendous growth of interest in the field of soft computing, starting from the development of K-means up to the development of the fuzzy C-means algorithms. J. C. Bezdek [1], in his book Pattern Recognition with Fuzzy Objective Function Algorithms, gives a comprehensive treatment of the fuzzy clustering algorithms. F. Hoppner, F. Klawonn, R. Kruse, and T. Runkler [2] give a brief overview of fuzzy cluster analysis in another book. R. Babuska [3] formulates all the clustering algorithms. D. E. Gustafson and W. C. Kessel [4] proposed fuzzy clustering with a fuzzy covariance matrix. T. Velmurugan and T. Santhanam [5] describe the effect of a normal distribution of data points on the hard clustering algorithms. R. Babuska, P. J. van der Veen, and U. Kaymak [6] proposed improved covariance estimation for Gustafson-Kessel clustering. D. Napoleon et al. [7] proposed an efficient clustering technique to reduce the time complexity of K-means using a normal distribution of data points. J. C. Bezdek and J. C. Dunn [8], I. Gath and A. B. Geva [9], and A. M. Bensaid et al. [10] are some of the other works that explore the validation parameters of the various clustering algorithms. The implementation and analysis is done with the use of the fuzzy clustering toolbox by Balazs Balasko [11].

DATA CLUSTERING OVERVIEW
So as to get a taste of the coming sections, let us first take an overview of data clustering, starting with the term cluster itself. Many definitions of cluster have been formulated, all depending on the objective of clustering. The most widely accepted one is that a cluster is a group of objects that are more similar to one another than to members of other clusters. The term "similarity" here should be understood as mathematical similarity, defined by a distance norm. In general, clusters are subsets of a data set. The major issue is whether the subsets formed are hard (crisp) or fuzzy. Hard clustering methods are based on classical set theory, which requires that an object either does or does not belong to a cluster; i.e., the subsets are mutually exclusive. Fuzzy clustering methods, on the other hand, allow objects to belong to several clusters simultaneously, with different degrees of membership. In many real situations fuzzy clustering is more natural than hard clustering, as objects on the boundaries between several classes are not forced to fully belong to one class, but are instead assigned membership degrees between 0 and 1 indicating their partial memberships.

Since we are dealing with clustering, the objective is to partition the data set X into c clusters.

Hard partition
A hard partition can be defined as a family of classical subsets {A_i | 1 ≤ i ≤ c} ⊂ P(X). Its properties are as follows:

∪_{i=1}^{c} A_i = X,
A_i ∩ A_j = ∅, 1 ≤ i ≠ j ≤ c,
∅ ⊂ A_i ⊂ X, 1 ≤ i ≤ c.

The first condition states that the subsets A_i together contain all the data in X, the second that they must be disjoint, and the last that none of them may be empty nor contain all the data in X. In terms of membership (characteristic) functions μ_{A_i}, which can take values either zero or one, the same conditions read:

∨_{i=1}^{c} μ_{A_i} = 1,
μ_{A_i} ∧ μ_{A_j} = 0, 1 ≤ i ≠ j ≤ c,
0 < Σ_{k=1}^{N} μ_{A_i}(x_k) < N, 1 ≤ i ≤ c.

In matrix notation, an N×c matrix U = [μ_ik] represents a hard partition if and only if its elements satisfy:

μ_ik ∈ {0, 1}, 1 ≤ i ≤ N, 1 ≤ k ≤ c,
Σ_{k=1}^{c} μ_ik = 1, 1 ≤ i ≤ N,
0 < Σ_{i=1}^{N} μ_ik < N, 1 ≤ k ≤ c.

The structure of the partition matrix U = [μ_ik] is:

U = | u_{1,1} u_{1,2} u_{1,3} … u_{1,c} |
    | u_{2,1} u_{2,2} u_{2,3} … u_{2,c} |
    | …                                |
    | u_{N,1} u_{N,2} u_{N,3} … u_{N,c} |

Fuzzy partition
A fuzzy partition can be viewed as a generalization of the hard partition; the only difference is that μ_ik is allowed to attain real values in the range [0, 1]. An N×c matrix U = [μ_ik] represents a fuzzy partition under the following conditions:

μ_ik ∈ [0, 1], 1 ≤ i ≤ N, 1 ≤ k ≤ c,
Σ_{k=1}^{c} μ_ik = 1, 1 ≤ i ≤ N,
0 < Σ_{i=1}^{N} μ_ik < N, 1 ≤ k ≤ c.

One thing to be clear about is that the distribution of memberships among the c fuzzy subsets is not otherwise constrained in the case of a fuzzy partition; hence data points at the borders are not forced to belong to only one cluster.
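The partition-matrix conditions above can be checked mechanically. The sketch below does so in plain Python on a hypothetical 4-point, 2-cluster example; the matrices are illustrative toy values, not taken from the paper's dataset.

```python
def is_hard_partition(U):
    """Check the three hard-partition conditions for an N x c matrix U."""
    N, c = len(U), len(U[0])
    binary = all(u in (0, 1) for row in U for u in row)           # u_ik in {0, 1}
    rows_sum_1 = all(sum(row) == 1 for row in U)                  # each point in exactly one cluster
    cols_ok = all(0 < sum(U[i][k] for i in range(N)) < N          # no empty or all-containing cluster
                  for k in range(c))
    return binary and rows_sum_1 and cols_ok

def is_fuzzy_partition(U, tol=1e-9):
    """Same conditions, with u_ik allowed anywhere in [0, 1]."""
    N, c = len(U), len(U[0])
    in_range = all(0.0 <= u <= 1.0 for row in U for u in row)
    rows_sum_1 = all(abs(sum(row) - 1.0) < tol for row in U)
    cols_ok = all(0 < sum(U[i][k] for i in range(N)) < N for k in range(c))
    return in_range and rows_sum_1 and cols_ok

hard = [[1, 0], [1, 0], [0, 1], [0, 1]]
fuzzy = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
print(is_hard_partition(hard), is_fuzzy_partition(fuzzy))   # True True
```

Note that every hard partition also satisfies the fuzzy conditions, which is the sense in which the fuzzy partition generalizes the hard one.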


DATA CLUSTERING TECHNIQUES

A. K-means and K-medoid algorithms
The popular classes of hard partitioning algorithms are K-means and K-medoid. These partitioning methods are simple, popular and very widely used. Both, however, have a serious drawback: they do not give reliable results every time, and they suffer from some numerical problems as well. For an N×n-dimensional data set, the K-means and K-medoid algorithms find clusters by minimizing the within-cluster sum of squares, which is basically a distance norm:

J(X; V) = Σ_{i=1}^{c} Σ_{x_k ∈ A_i} ‖x_k − v_i‖²,

where A_i is a set of data points (objects) and v_i is the mean of those data points over cluster i. In K-means clustering v_i is also called the cluster prototype; its general form is

v_i = ( Σ_{x_k ∈ A_i} x_k ) / N_i,

where N_i is the number of objects in A_i. In K-medoid clustering the cluster centers are the objects nearest to the mean of the data in one cluster, i.e. V = {v_i ∈ X | 1 ≤ i ≤ c}.

B. Fuzzy C-means algorithm
Fuzzy C-means clustering (FCM) relies on the basic idea of hard clustering (HC), with the difference that in FCM each data point belongs to a cluster to a degree given by a membership grade, while in HC every data point either does or does not belong to a certain cluster. FCM thus employs fuzzy partitioning such that a given data point can belong to several groups, with the degree of belongingness specified by membership grades between 0 and 1. Like HC, FCM uses a cost function that is minimized while partitioning the data set. The membership matrix U is allowed to have elements with values between 0 and 1; however, the summation of the degrees of belongingness of a data point to all clusters is always equal to unity. The cost function for FCM is a generalization of the equation given for the hard clustering algorithms:

J(X; U, V) = Σ_{i=1}^{c} Σ_{k=1}^{N} (μ_ik)^m ‖x_k − v_i‖²,

where μ_ik lies between 0 and 1, v_i is the center of fuzzy cluster i, d_ik = ‖x_k − v_i‖ is the Euclidean distance between the i-th cluster center and the k-th data point, and m ∈ [1, ∞) is a weighting exponent. The necessary conditions for the above cost function to reach its minimum are:

v_i = ( Σ_{k=1}^{N} (μ_ik)^m x_k ) / ( Σ_{k=1}^{N} (μ_ik)^m ),

and

μ_ik = 1 / Σ_{j=1}^{c} ( d_ik / d_jk )^{2/(m−1)}.

The algorithm works iteratively through the preceding two conditions until no more improvement is noticed. The FCM algorithm computes with the standard Euclidean distance norm, which induces hyperspherical clusters; hence it can only detect clusters with the same shape and orientation.
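The two alternating FCM update equations above can be sketched in a few lines of plain Python. The toy 1-D data, c = 2 and m = 2 below are illustrative assumptions, not the paper's setup (the authors used a MATLAB fuzzy clustering toolbox); on squared distances the membership exponent 2/(m−1) becomes 1/(m−1).

```python
import random

def fcm(xs, c=2, m=2.0, eps=1e-5, seed=0):
    """Minimal fuzzy C-means on 1-D data: alternate prototype and membership updates."""
    random.seed(seed)
    N = len(xs)
    # Random initial partition matrix, rows normalized so each sums to 1.
    U = [[random.random() for _ in range(c)] for _ in range(N)]
    U = [[u / sum(row) for u in row] for row in U]
    while True:
        # Step 1: cluster prototypes (membership-weighted means).
        v = [sum(U[k][i] ** m * xs[k] for k in range(N)) /
             sum(U[k][i] ** m for k in range(N)) for i in range(c)]
        # Step 2: squared Euclidean distances (epsilon guards division by zero
        # when a point coincides with a prototype).
        d2 = [[(xs[k] - v[i]) ** 2 + 1e-12 for i in range(c)] for k in range(N)]
        # Step 3: membership update; on d^2 the exponent 2/(m-1) halves to 1/(m-1).
        U_new = [[1.0 / sum((d2[k][i] / d2[k][j]) ** (1.0 / (m - 1))
                            for j in range(c)) for i in range(c)] for k in range(N)]
        diff = max(abs(U_new[k][i] - U[k][i]) for k in range(N) for i in range(c))
        U = U_new
        if diff < eps:
            return v, U

centers, U = fcm([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
print(sorted(round(v, 1) for v in centers))   # one center near each group
```

With two well-separated groups the prototypes settle near the group means, and every row of U still sums to one, as the fuzzy-partition conditions require.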

C. The Gustafson-Kessel (GK) algorithm
The basic problem of the FCM algorithm is that it cannot detect clusters of different shapes. To solve this, Gustafson and Kessel extended the standard fuzzy C-means algorithm by employing an adaptive distance norm, in order to detect clusters of different geometrical shapes. Each cluster has its own norm-inducing matrix A_i, which yields the following inner-product norm:

D²_ikA_i = (x_k − v_i)^T A_i (x_k − v_i), 1 ≤ i ≤ c, 1 ≤ k ≤ N.

The matrices A_i are used as optimization variables in the C-means functional, thus allowing each cluster to adapt the distance norm to the local topological structure of the data. Let A denote a c-tuple of the norm-inducing matrices, A = (A_1, A_2, …, A_c). The objective functional of the GK algorithm is defined by:

J(X; U, V, A) = Σ_{i=1}^{c} Σ_{k=1}^{N} (μ_ik)^m D²_ikA_i.

For a fixed A, the fuzzy partition conditions can be applied directly. However, the objective function above is linear in A_i, so it cannot be directly minimized with respect to A_i: J can be made as small as desired by simply making A_i less positive definite. A_i must be constrained in some way to obtain a feasible solution; the usual way to accomplish this is to constrain the determinant of A_i. Allowing the matrix A_i to vary with its determinant fixed corresponds to optimizing the cluster's shape while its volume remains constant:

|A_i| = ρ_i, ρ_i > 0,

where ρ_i is fixed for each cluster. Using the Lagrange multiplier method, the following expression for A_i is obtained:

A_i = [ρ_i det(F_i)]^{1/n} F_i^{-1},

where F_i is the fuzzy covariance matrix of the i-th cluster, defined by

F_i = ( Σ_{k=1}^{N} (μ_ik)^m (x_k − v_i)(x_k − v_i)^T ) / ( Σ_{k=1}^{N} (μ_ik)^m ).

Note that the substitution of F_i and A_i gives a generalized squared Mahalanobis distance norm between x_k and the cluster mean v_i, where the covariance is weighted by the membership degrees in U.

D. The Gath-Geva (GG) algorithm
We have seen how the GK algorithm considerably extends FCM to overcome its limitation. The GG algorithm is yet another extension, built on the GK algorithm, that takes the size and density of the clusters into account. Here the clustering algorithm employs a distance norm based on the fuzzy maximum likelihood estimates (FMLE) proposed by Bezdek and Dunn:

D_ik(x_k, v_i) = ( √det(F_wi) / α_i ) exp( (1/2) (x_k − v_i)^T F_wi^{-1} (x_k − v_i) ).

In contrast to the GK algorithm, this distance norm involves an exponential term and thus decreases faster than the inner-product norm. F_wi denotes the fuzzy covariance matrix of the i-th cluster, given by:

F_wi = ( Σ_{k=1}^{N} (μ_ik)^w (x_k − v_i)(x_k − v_i)^T ) / ( Σ_{k=1}^{N} (μ_ik)^w ), 1 ≤ i ≤ c,

where w = 1 in the original FMLE algorithm; here we use the weighting exponent w = 2, so as to compensate for the exponential term of the distance norm and make the partition more fuzzy. The difference between the matrix F_i in the GK algorithm and the F_wi defined above is that the latter does not involve the weighting exponent m but instead uses w; the two weighted covariance matrices arise as generalizations of the classical covariance from two different concepts. α_i is the prior probability of selecting cluster i, given by

α_i = (1/N) Σ_{k=1}^{N} μ_ik.

The membership degrees μ_ik are interpreted as the posterior probabilities of selecting the i-th cluster.

Algorithms:

A. K-means algorithm
Given the data set X, choose the number of clusters 1 < c < N. Initialize with random cluster centers chosen from the data set.
Repeat for l = 1, 2, …
 Step 1: Compute the distances D²_ik = (x_k − v_i)^T (x_k − v_i), 1 ≤ i ≤ c, 1 ≤ k ≤ N.
 Step 2: Select the points for a cluster with the minimal distances; they belong to that cluster.
 Step 3: Calculate the cluster centers: v_i^(l) = ( Σ_{x_k ∈ A_i} x_k ) / N_i.
Until ‖v^(l) − v^(l−1)‖ = 0.
Ending: Calculate the partition matrix.

B. K-medoid algorithm
Given the data set X, choose the number of clusters 1 < c < N. Initialize with random cluster centers chosen from the data set X.
Repeat for l = 1, 2, …
 Step 1: Compute the distances D²_ik = (x_k − v_i)^T (x_k − v_i).
 Step 2: Select the points for a cluster with the minimal distances; they belong to that cluster.
 Step 3: Calculate the fake cluster centers (as in the original K-means): v_i* = ( Σ_{x_k ∈ A_i} x_k ) / N_i.
 Step 4: Choose the nearest data point to be the cluster center: D²_i = (x − v_i*)^T (x − v_i*), v_i^(l) = x with x = argmin D²_i.
Until ‖v^(l) − v^(l−1)‖ = 0.
Ending: Calculate the partition matrix.

C. FCM algorithm
Given the data set X, choose the number of clusters 1 < c < N, the weighting exponent m > 1, the termination tolerance ε > 0 and the norm-inducing matrix A. Initialize the partition matrix U^(0) randomly.
Repeat for l = 1, 2, …
 Step 1: Compute the cluster prototypes (means): v_i^(l) = ( Σ_{k=1}^{N} (μ_ik^(l−1))^m x_k ) / ( Σ_{k=1}^{N} (μ_ik^(l−1))^m ), 1 ≤ i ≤ c.
 Step 2: Compute the distances: D²_ikA = (x_k − v_i^(l))^T A (x_k − v_i^(l)), 1 ≤ i ≤ c, 1 ≤ k ≤ N.
 Step 3: Update the partition matrix: μ_ik^(l) = 1 / Σ_{j=1}^{c} ( D_ikA / D_jkA )^{2/(m−1)}.
Until ‖U^(l) − U^(l−1)‖ < ε.

D. GK algorithm
Given the data set X, choose the number of clusters 1 < c < N, the weighting exponent m > 1, the termination tolerance ε > 0 and the cluster volumes ρ_i. Initialize the partition matrix U^(0) randomly.
Repeat for l = 1, 2, …
 Step 1: Calculate the cluster centers: v_i^(l) = ( Σ_{k=1}^{N} (μ_ik^(l−1))^m x_k ) / ( Σ_{k=1}^{N} (μ_ik^(l−1))^m ), 1 ≤ i ≤ c.
 Step 2: Compute the cluster covariance matrices:
  F_i = ( Σ_{k=1}^{N} (μ_ik^(l−1))^m (x_k − v_i^(l))(x_k − v_i^(l))^T ) / ( Σ_{k=1}^{N} (μ_ik^(l−1))^m ), 1 ≤ i ≤ c.
  Add a scaled identity matrix: F_i := (1 − γ) F_i + γ det(F_0)^{1/n} I.
  Extract the eigenvalues λ_{i,j} and eigenvectors φ_{i,j}; find λ_{i,max} = max_j λ_{i,j}, and set λ_{i,j} := λ_{i,max}/β for all j for which λ_{i,max}/λ_{i,j} > β.
  Reconstruct F_i by: F_i = [φ_{i,1} … φ_{i,n}] diag(λ_{i,1}, …, λ_{i,n}) [φ_{i,1} … φ_{i,n}]^{-1}.
 Step 3: Compute the distances: D²_ikA_i = (x_k − v_i^(l))^T [ρ_i det(F_i)]^{1/n} F_i^{-1} (x_k − v_i^(l)).
 Step 4: Update the partition matrix: μ_ik^(l) = 1 / Σ_{j=1}^{c} ( D_ikA_i / D_jkA_j )^{2/(m−1)}.
Until ‖U^(l) − U^(l−1)‖ < ε.

E. GG algorithm
Given a set of data X, specify c, choose a weighting exponent m > 1 and a termination tolerance ε > 0. Initialize the partition matrix with a more robust method.
Repeat for l = 1, 2, …
 Step 1: Calculate the cluster centers: v_i^(l) = ( Σ_{k=1}^{N} (μ_ik^(l−1))^m x_k ) / ( Σ_{k=1}^{N} (μ_ik^(l−1))^m ), 1 ≤ i ≤ c.
 Step 2: Compute the distance measure D²_ik. The distance to the prototype is calculated from the fuzzy covariance matrices of the clusters:
  F_i^(l) = ( Σ_{k=1}^{N} μ_ik^(l−1) (x_k − v_i^(l))(x_k − v_i^(l))^T ) / ( Σ_{k=1}^{N} μ_ik^(l−1) ), 1 ≤ i ≤ c,
  with the a priori probability α_i = (1/N) Σ_{k=1}^{N} μ_ik, and
  D²_ik(x_k, v_i) = ( (2π)^{n/2} √det(F_i) / α_i ) exp( (1/2) (x_k − v_i^(l))^T F_i^{-1} (x_k − v_i^(l)) ).
 Step 3: Update the partition matrix: μ_ik^(l) = 1 / Σ_{j=1}^{c} ( D_ik(x_k, v_i) / D_jk(x_k, v_j) )^{2/(m−1)}, 1 ≤ i ≤ c, 1 ≤ k ≤ N.
Until ‖U^(l) − U^(l−1)‖ < ε.

Validation Parameters:
Cluster validity refers to the problem of finding out whether a given clustering partition fits the data well or not. A clustering algorithm always tries to find the best fit for a fixed number of clusters and the parameterized cluster shapes; however, this does not mean that even the best fit is meaningful. Either the number of clusters might be wrong or the cluster shapes might not correspond to the groups in the data, if the data can be grouped in a meaningful way at all. Different scalar validity measures have been proposed in the literature, and none of them is perfect by itself; therefore we use several of these validity measures, known as indexes, in our validation.

1. Partition Coefficient (PC): measures the amount of "overlapping" between clusters. It is defined by Bezdek as follows:
PC(c) = (1/N) Σ_{i=1}^{c} Σ_{j=1}^{N} (μ_ij)²,
where μ_ij is the membership of data point j in cluster i. The disadvantage of PC is its lack of direct connection to any property of the data themselves. The optimal number of clusters is at the maximum value.

2. Classification Entropy (CE): measures the fuzziness of the cluster partition only, which makes it similar to the Partition Coefficient:
CE(c) = −(1/N) Σ_{i=1}^{c} Σ_{j=1}^{N} μ_ij log(μ_ij).

3. Partition Index (SC): the ratio of the sum of compactness and separation of the clusters. It is a sum of individual cluster validity measures, normalized through division by the fuzzy cardinality of each cluster:
SC(c) = Σ_{i=1}^{c} [ Σ_{j=1}^{N} (μ_ij)^m ‖x_j − v_i‖² / ( N_i Σ_{k=1}^{c} ‖v_k − v_i‖² ) ].
SC is useful when comparing different partitions having an equal number of clusters; a lower value of SC indicates a better partition.

4. Separation Index (S): in contrast to the partition index (SC), the separation index uses a minimum-distance separation for partition validity:
S(c) = Σ_{i=1}^{c} Σ_{j=1}^{N} (μ_ij)² ‖x_j − v_i‖² / ( N min_{i,k} ‖v_k − v_i‖² ).

5. Dunn's Index (DI): this index was originally proposed for the identification of "compact and well-separated clusters", so the result of the clustering has to be recomputed as if it were a hard partition:
DI(c) = min_{i∈c} { min_{j∈c, j≠i} { min_{x∈C_i, y∈C_j} d(x, y) / max_{k∈c} { max_{x,y∈C_k} d(x, y) } } }.
The main drawback of Dunn's index is computational, since the calculation becomes very expensive as c and N increase.
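To make the first two indexes concrete, here is a small plain-Python sketch of PC and CE as defined above, evaluated on a hypothetical membership matrix (illustrative toy values, not the motorcycle-dataset results reported later):

```python
import math

def partition_coefficient(U):
    """PC = (1/N) * sum of squared memberships; equals 1 for a crisp partition."""
    return sum(u * u for row in U for u in row) / len(U)

def classification_entropy(U):
    """CE = -(1/N) * sum of u*log(u); equals 0 for a crisp partition."""
    return sum(-u * math.log(u) for row in U for u in row if u > 0) / len(U)

# Rows are data points, columns are clusters (the summation order does not matter).
crisp_U = [[1, 0], [0, 1], [1, 0]]
fuzzy_U = [[0.6, 0.4], [0.5, 0.5], [0.7, 0.3]]

print(partition_coefficient(crisp_U), classification_entropy(crisp_U))  # 1.0 0.0
print(round(partition_coefficient(fuzzy_U), 4))                         # 0.5333
```

This also illustrates why PC and CE carry no information for the hard algorithms: any crisp partition gives PC = 1 and CE = 0 regardless of how good the clustering is, which is exactly the behavior seen for K-means and K-medoid in the results table.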


Implementation and Results:
Now that we have a general idea of all the hard as well as fuzzy clustering techniques, including their basic mathematical foundations, we turn our discussion to a practical study. This study comprises the practical implementation of all five clustering techniques and the testing of each of them on a set of data called the motorcycle dataset.
Fig. 1: Clusters using K-means algorithm
Fig. 2: Clusters using K-medoid algorithm
Fig. 3: Clusters using FCM algorithm
Fig. 4: Clusters using GK algorithm
Fig. 5: Clusters using GG algorithm

Table 1: Comparison of index parameters for different algorithms

Algorithm | PC     | CE     | SC     | S      | DI
K-means   | 1      | NaN    | 0.085  | 0.0001 | 0.0129
K-medoid  | 1      | NaN    | 0.2384 | 0.0002 | 0.0031
FCM       | 0.8152 | 0.3480 | 0.9157 | 0.0007 | 0.0185
GK        | 0.8255 | 0.3278 | 0.8890 | 0.0007 | 0.0083
GG        | 0.9837 | 0.0287 | 2.2349 | 0.0020 | 0.0153
Table 2: Comparison of speed and accuracy for different algorithms

Clustering algorithm | Speed (number of iterations) | RMSE  | Accuracy
K-means              | 3                            | 0.443 | 81%
K-medoid             | 2                            | 0.441 | 78%
FCM                  | 42                           | 0.437 | 69%
GK                   | 70                           | 0.437 | 77%
GG                   | 27                           | 0.439 | 79%
Fig. 6: Comparison of speed and accuracy
Fig. 7: Comparison of RMSE
As mentioned earlier, the similarity norm used is a purely mathematical distance norm. Because most similarity metrics are very sensitive to large ranges of values in the input vectors, each input variable is first normalized to the unit interval [0, 1], i.e., to the unit hypercube. After normalization, each algorithm is tested on the normalized data set, and the resulting clusters are evaluated using the parameters mentioned above. The analysis of each algorithm is given below with some remarks:
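The column-wise unit-interval normalization described here can be sketched as follows; the toy 2-D values are illustrative (the actual preprocessing was applied to the motorcycle dataset):

```python
def normalize_unit_hypercube(data):
    """Min-max scale each column of a list-of-rows dataset into [0, 1]."""
    cols = list(zip(*data))                 # transpose to iterate per variable
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # Constant columns (hi == lo) are mapped to 0.0 to avoid division by zero.
    return [[(x - l) / (h - l) if h > l else 0.0
             for x, l, h in zip(row, lo, hi)] for row in data]

raw = [[2.4, 120.0], [9.6, 30.0], [6.0, 75.0]]
print(normalize_unit_hypercube(raw))
# each column now spans [0, 1]; the column minima map to 0 and the maxima to 1
```

Without this step, a variable measured on a large scale would dominate every distance computation, distorting all five algorithms equally.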
K-means runs considerably faster than the other algorithms, judging by the number of iterations it takes in Table 2, but one limitation is that it is very sensitive to the initial centroids; it is therefore recommended to run the algorithm repeatedly to obtain the best results. Moreover, the value of K still needs to be supplied by the user. Regarding accuracy, however, it excels compared with the other algorithms.
K-medoid performs almost the same as the K-means algorithm. Its speed and accuracy are fairly encouraging, and needless to say it is very fast.
Coming to FCM, it starts like the K-means algorithm by assigning random values to the membership matrix U. Looking at Fig. 3 above, we can see that FCM can detect only clusters of circular shape; the clusters are, however, slightly elongated because of the direct effect of one cluster on the other. Table 2 shows that it is comparatively slower than the hard algorithms, and accuracy appears to be a problem too. Here also, the number of clusters is not known beforehand.
On running Gustafson-Kessel, Fig. 4 shows that it detects clusters not only of spherical shape but of different geometrical shapes as well. The Mahalanobis distance used here adapts to the topological structure, because each cluster is forced to have its own norm-inducing matrix A_i. Since it needs to be initialized (e.g., by a prior FCM run), it takes more time and is hence slower; its accuracy, however, appears better than that of FCM at the same RMSE.
The Gath-Geva algorithm likewise requires a robust initialization, like GK. Because of the exponential term in the distance norm, the distance decreases faster with increasing ‖x_k − v_i‖, which is fairly visible from the number of iterations it takes in Table 2. In Fig. 5, GG divides the data points into several disjoint clusters, but the effects of the other cluster borders distort this disjointness. Its accuracy is better than that of all the fuzzy algorithms mentioned above.

Comparative Analysis based on Index Parameters:
From Fig. 1 to Fig. 5 shown above, the various index values mentioned in the Validation Parameters section can easily be demarcated and compared for each clustering method. All the validity measures are collected in Table 1. It must be noted that, since all the algorithms use random initialization, different runs issue different partition results, i.e., different values of the validation measures. On the other hand, the results highly depend on the structure of the data, and no validity index is perfect by itself for a clustering problem. The figures show that hard clustering methods can find a good solution for the clustering problem when compared with the figures of the fuzzy clustering algorithms. On the contrary, in Fig. 1 and Fig. 2 one can see a typical example of the initialization problem of hard clustering; this caused the differences between the validity index values in Table 1. PC and CE are useless for K-means and K-medoid, because they are hard clustering methods, but SC, S and DI are useful to validate crisp and well-separated clusters. The only difference between Fig. 3 and Fig. 4 lies in the shape of the clusters: the Gustafson-Kessel algorithm can find the elongated clusters better, as is obvious from the parameters in Table 1. Fig. 5 shows that the Gath-Geva algorithm returned a result of three subspaces.
Fig. 8: Comparison of PC, CE and SC for different algorithms

Fig. 9: Comparison of DI and S for different algorithms

Conclusion:
In this paper a brief review has been done of the various hard and fuzzy clustering algorithms. It is worth noting that each of them has its own share of pros and cons. Even though at times the hard clustering algorithms may seem to have the upper hand regarding speed and other issues such as accuracy, it is important to keep in mind the importance of fuzziness when it comes to real-life applications. We have also made efforts to analyze each clustering algorithm in terms of the index parameters and how they help in describing and explaining cluster behavior.

REFERENCES:
[1] J. C. Bezdek. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, 1981.
[2] F. Hoppner, F. Klawonn, R. Kruse, and T. Runkler. Fuzzy Cluster Analysis. Wiley, Chichester, 1999.
[3] R. Babuska. Fuzzy Modeling for Control. Kluwer Academic Publishers, Boston, 1998.
[4] D. E. Gustafson and W. C. Kessel. Fuzzy clustering with a fuzzy covariance matrix. In Proc. IEEE Conference on Decision and Control, pp. 761-766, 1979.
[5] T. Velmurugan and T. Santhanam. Computational complexity between K-means and K-medoids clustering algorithms for normal and uniform distributions of data points. Journal of Computer Science, 6(3), pp. 363-368, 2010.
[6] R. Babuska, P. J. van der Veen, and U. Kaymak. Improved covariance estimation for Gustafson-Kessel clustering. IEEE International Conference on Fuzzy Systems, pp. 1081-1085, 2002.
[7] D. Napoleon et al. An efficient clustering technique to reduce the time complexity of K-means using normal distribution of data points. International Journal of Advanced Research in Computer Science, vol. 2, no. 1, pp. 491-494, 2011.
[8] J. C. Bezdek and J. C. Dunn. Optimal fuzzy partitions: A heuristic for estimating the parameters in a mixture of normal distributions. IEEE Transactions on Computers, pp. 835-838, 1975.
[9] I. Gath and A. B. Geva. Unsupervised optimal fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 773-781, 1989.
[10] A. M. Bensaid, L. O. Hall, J. C. Bezdek, L. P. Clarke, M. L. Silbiger, J. A. Arrington, and R. F. Murtagh. Validity-guided (re)clustering with applications to image segmentation. IEEE Transactions on Fuzzy Systems, vol. 4, pp. 112-123, 1996.
[11] B. Balasko, J. Abonyi, and B. Feil. Fuzzy Clustering and Data Analysis Toolbox for Use with MATLAB.