A New Approach for Image Feature Vector Classification using Unsupervised Clustering Method

DOI : 10.17577/IJERTCONV2IS13132


Mr.Sreedhar Kumar.S

Assoc Professor, Department of CSE,

Don Bosco Institute of Technology, Bangalore, India E-mail: sree_me_261177@yahoo.co.in

Mrs.Shilpa.S

MTech Student, Department of CSE,

Don Bosco Institute of Technology, Bangalore, India E-mail: shilly.shankar@gmail.com

Abstract- In this paper, we introduce a new approach for classifying image feature vectors using an unsupervised clustering technique. The proposed approach aims to partition the trained image feature vector set into k highly related clusters. The proposed system consists of two stages, viz. (1) image preprocessing and (2) classification. The image preprocessing stage involves feature extraction and feature selection. In the feature extraction stage, spatial statistical operators are applied and features such as mean, standard deviation and variance are extracted. In the feature selection stage, the size of the image feature vector is limited. In the classification stage, the trained image feature vector set is partitioned into k highly related clusters through the k-means technique. The experimental results show that the proposed approach is suitable for partitioning and classifying the image feature vector set with good accuracy.

Keywords – Statistical operators, Unsupervised Clustering.

  1. INTRODUCTION

    Images can be classified in a supervised or an unsupervised way. Supervised classification starts by specifying information classes on the image. An algorithm is then used to summarize multispectral information from the specified areas of the image to form class signatures. This process is called supervised training. In unsupervised classification, however, an algorithm is first applied to the image to form spectral classes (also called clusters). The image analyst then tries to assign each spectral class to the desired information class.

    An efficient method to identify and classify exudates as hard or soft was introduced in [1]; candidate exudates were detected using the k-means clustering technique. A hybrid clustering method combining k-means and PSO clustering was introduced in [2] for the classification of multispectral images; k-means clustering was performed first and its result was used to seed the initial swarm. The author of [3] proposed a new unsupervised classification approach for the automatic analysis of polarimetric synthetic aperture radar (SAR) images, in which the classification of the multidimensional SAR data space by dynamic clustering was addressed as an optimization problem. A different approach was proposed in [4]: the authors used supervised features in the context of image classification and retrieval, obtained excellent results, and demonstrated how these supervised features can be effectively used for unsupervised image categorization, that is, for grouping semantically similar images. The authors of [5] proposed a new multistage method for unsupervised image classification using hierarchical clustering. In the first phase, the method performs segmentation using a hierarchical clustering procedure that confines merging to spatially adjacent clusters and generates an image partition. In the second phase, the segments resulting from the first phase are classified into a small number of distinct states by a sequential merging operation.

    The authors of [6] proposed a New Fuzzy Cluster Centroid (NFCC) unsupervised classification algorithm to improve the traditional FCM and fuzzy weighted c-means (FWCM) algorithms. They reported that including the fuzzy centroid for each cluster increased the stability of the algorithm and that the new term reduced the number of iterations needed for image classification. Frank Y. Shih et al. [7] proposed a new two-pass unsupervised clustering algorithm incorporating fuzzy theory for the classification of Landsat remote sensing images. In the first pass, they derived the mean vectors of different land cover types representing their geographic attributes; in the second pass, the membership grade of a pixel belonging to each land cover type was computed from the distance between its gray-value vector and the mean vector of that type. Beaulieu J.M. et al. [8] proposed a method that includes both segmentation and classification of the image. They applied clustering to segment mean values, considered only large segments, and reported that the method was very efficient in simplifying the image. The authors of [9] proposed unsupervised image classification algorithms using a hierarchical model. In this approach, the only parameter assumed to be known is the number of regions; all other parameters are estimated. The algorithms were implemented on a Connection Machine CM200, and comparative tests were reported on noisy synthetic and real (remote sensing) images. Another approach was proposed in [10], where the authors derived an EM algorithm on a hybrid structure that mixes an exact EM algorithm on each subtree with a low-cost Gibbs EM algorithm on the coarse spatial grid. Experiments on a synthetic image and on multispectral satellite images were reported.

    In this paper, we focus on a new approach for image feature vector classification using unsupervised k-means clustering. The proposed approach partitions the trained image feature vector set into k highly related clusters. The major goal of the proposed technique is to partition and classify the image feature vector set with good accuracy.

  2. THE PROPOSED SYSTEM

    In this section, the details of the proposed system are presented. The proposed system consists of two stages, viz.

    1. Image preprocessing stage and

    2. Classification.

      Fig 1 shows the steps involved in the proposed system. The two stages are described in the following subsections.

      2.1 Image preprocessing stage

      The image preprocessing stage involves two steps, viz. i. Feature Extraction and ii. Feature Selection.

    Fig 1. Functional diagram of the proposed system. Step 1, image preprocessing stage (feature extraction and selection): input image (raw grayscale image set); preprocess the image in digital form; split the image into 8*8 blocks; extract the features through statistical operators; limit the feature vector size by the feature selection method; trained image feature vector set. Step 2, classification: classify the image feature vector set into k clusters by k-means; output clusters k1, k2, ..., kn.

    2.1.1 In the Feature Extraction step, the input image is divided into smaller blocks. Spatial statistical operators are applied to each block to extract features such as mean, standard deviation and variance. For example, each of the 20 input images of size 200*200 shown in Fig 2 is divided into 625 blocks of size 8*8 (25 blocks along each dimension), and the mean, standard deviation and variance of each block are extracted as shown in Fig 4. The block index k runs over the N blocks of the image. The mean is defined as

    $$\bar{X}_k = \frac{1}{R \times C} \sum_{i=0}^{R-1} \sum_{j=0}^{C-1} I_{kij}, \qquad I_{kij} \in I_k \text{ and } I_k \in I, \qquad k = 0, 1, \ldots, N-1 \qquad (1)$$

    The standard deviation is defined as

    $$SD_k = \left( \frac{1}{R \times C} \sum_{i=0}^{R-1} \sum_{j=0}^{C-1} \left( I_{kij} - \bar{X}_k \right)^2 \right)^{1/2} \qquad (2)$$

    The variance is defined as

    $$V_k = \frac{1}{R \times C} \sum_{i=0}^{R-1} \sum_{j=0}^{C-1} \left( I_{kij} - \bar{X}_k \right)^2 = SD_k^2 \qquad (3)$$

    In Eqs. (1)-(3), $\bar{X}_k$ represents the mean of the kth block of image I, $I_{kij}$ denotes the pixel in the ith row and jth column of the kth block of image I, and R and C respectively denote the row and column size of the kth block of image I.

    2.1.2 In the Feature Selection step, the feature vector size is limited. Among all the extracted features, only the mean values of all input images are retained, as shown in Fig 5.

    2.2 Classification

    In this stage, the trained image feature vector set is given as input to the k-means algorithm, which partitions the vector set into k highly related clusters. The k-means clustering algorithm, also called the generalized Lloyd algorithm (GLA), is a special case of the generalized hard clustering scheme in which point representatives are adopted and the squared Euclidean distance is used to measure the distortion (dissimilarity) between a data point X and its cluster representative (cluster center) C. The k-means algorithm iteratively performs the partition step and the new cluster center generation step until convergence. An iterative process with extensive computations is usually required to generate a set of cluster representatives. As shown in Fig 6, the trained image feature vector set is fed as input to the k-means algorithm, which in turn clusters the input images.
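    As a brief aside on why the new cluster center generation step takes an average: for a fixed partition, the center that minimizes the squared Euclidean distortion of a cluster is the mean of its members. The symbols $S_j$ (the set of data points currently assigned to cluster j) and $D_j$ (the distortion of that cluster) are introduced here only for illustration and do not appear elsewhere in the paper.

```latex
D_j(C_j) = \sum_{X \in S_j} \lVert X - C_j \rVert^2 ,
\qquad
\frac{\partial D_j}{\partial C_j} = -2 \sum_{X \in S_j} \left( X - C_j \right) = 0
\;\Longrightarrow\;
C_j = \frac{1}{\lvert S_j \rvert} \sum_{X \in S_j} X .
```

    Neither the partition step nor the center generation step can increase the total distortion, so the iteration settles, although possibly only at a local minimum.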

  3. K-MEANS TECHNIQUE

    In this section, the details of the k-means technique are presented. The idea is to partition the image feature vector set into k clusters, where k is an input parameter specified in advance, through an iterative relocation technique that converges to a local minimum. It consists of two phases.

    1. First, randomly select K = {K1, K2, ..., Kn} as candidate representatives for the initial cluster centers.

    2. Next, determine the distance between each feature vector in the image feature vector set and the cluster centers, and assign each feature vector to its nearest cluster. The Euclidean distance is used, defined as

    $$d(X_i, c_j) = \left( \lVert X_i - c_j \rVert^2 \right)^{1/2}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, K \qquad (4)$$

    where $d(X_i, c_j)$ denotes the distance between the ith feature vector in the trained image feature vector set X and the jth of the K cluster centers.

    Once all the feature vectors have been assigned to clusters, an initial grouping is complete. New cluster centers are then calculated by taking the average of the feature vectors in each cluster, because the inclusion of new feature vectors may change the cluster centers. This center update is iterated until the centers no longer change or the criterion function reaches its minimum.

    The square error criterion is used, defined by

    $$E = \sum_{i=1}^{k} \sum_{p \in C_i} \lVert p - m_i \rVert^2 \qquad (5)$$

    where p is a feature vector and $m_i$ is the center of cluster $C_i$. E is the sum of squared errors over all data in the image feature vector set.

    Algorithm

    Input: Trained image feature vector set of n images, X_i (i = 1 to n); number of clusters K = {K1, K2, ..., Kn}.

    Output: n images clustered into K clusters.

    Begin

      1. Randomly select K = {K1, K2, ..., Kn} from the image feature vector set X as candidate representatives for the initial clusters.

      2. Repeat

      3. Calculate the distance between X_i (i = 1 to n) and all K cluster centers C_j (j = 1 to K) using the Euclidean distance of Eq. (4), and assign X_i to the nearest cluster j.

      4. For each cluster j, recalculate the cluster center as the mean of its members.

      5. Until the cluster centers no longer change or the criterion function of Eq. (5) becomes minimum.

    End

  4. EXPERIMENTS AND RESULTS

In this section, we test the proposed system on the 20 sample images shown in Fig 2, obtained from [11]. These sample images are grayscale images of size 200*200. Each pixel in an image corresponds to some characteristic at that point in space, for example protein density in MRI or X-ray absorption in CT scans. These images stack together to form a solid representation of the head.

Fig 2. Input images for the proposed system (Image 1 to Image 20).

As Fig 3 shows, the input image is divided into 625 blocks of size 8*8, and the pixel values of all blocks are obtained.

Fig 3. Input image of size 200*200 divided into 625 blocks of size 8*8 [Feature Extraction].

In the feature extraction stage, statistical operators are applied and features such as mean, standard deviation and variance are obtained from all 625 blocks, as shown in Fig 4.

Fig 4. Features such as mean, standard deviation and variance are extracted from each image block [Feature Extraction].
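The block-wise feature extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `block_features`, the use of NumPy, and the random stand-in image are all assumptions made for the example.

```python
import numpy as np

def block_features(image, block=8):
    """Return an (n_blocks, 3) array of [mean, std, variance] per block."""
    rows, cols = image.shape
    if rows % block or cols % block:
        raise ValueError("image size must be a multiple of the block size")
    # View the image as (block_rows, block, block_cols, block), bring the two
    # within-block axes together, then flatten each block to one row.
    blocks = (image.reshape(rows // block, block, cols // block, block)
                   .swapaxes(1, 2)
                   .reshape(-1, block * block)
                   .astype(np.float64))
    mean = blocks.mean(axis=1)
    var = blocks.var(axis=1)   # population variance (divides by R*C)
    std = np.sqrt(var)
    return np.column_stack([mean, std, var])

# Hypothetical stand-in for one 200*200 grayscale input image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(200, 200))
features = block_features(image)
print(features.shape)  # (625, 3)
```

Blocks are ordered row by row, so row 0 of `features` describes the top-left 8*8 block and row 1 the block to its right.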

During feature selection, the size of the image feature vector set is limited and only the mean feature of each block is retained, as shown in Fig 5.

Fig 5. Mean values of the input images.
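A minimal sketch of this selection step is given below. It assumes the extracted features are stored as an array of shape (images, blocks, 3) with the mean in column 0; that layout, the name `select_mean_features`, and the random stand-in data are illustrative assumptions. The text does not state explicitly whether each image keeps all of its block means or a single averaged value, so the sketch keeps all block means and notes the averaged alternative in a comment.

```python
import numpy as np

MEAN, STD, VAR = 0, 1, 2  # assumed column order of the per-block features

def select_mean_features(all_features):
    """Keep only the block means from an (n_images, n_blocks, 3) array."""
    all_features = np.asarray(all_features, dtype=np.float64)
    return all_features[:, :, MEAN]

# Hypothetical stand-in for 20 images x 625 blocks x 3 features.
rng = np.random.default_rng(0)
all_features = rng.random((20, 625, 3))
trained = select_mean_features(all_features)
print(trained.shape)  # (20, 625)

# Alternative reading: one averaged mean value per image.
# trained_scalar = trained.mean(axis=1, keepdims=True)  # shape (20, 1)
```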

Finally, the trained image feature vector set, consisting of the average mean values of all the input images and shown in Fig 6, is given as input to the k-means technique, and three highly related clusters K1, K2 and K3 are obtained, as shown in Fig 7 and Fig 8.

Fig 6. Trained image feature vector set, which is fed as input to k-means.

Fig 7. Clusters of mean values as the final output.
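The clustering step can be sketched as below, following the k-means procedure of Section 3 (random initial centers, nearest-center assignment, mean update, stop when the centers no longer change). This is an illustrative sketch under stated assumptions rather than the authors' code: the optional `init` argument, the empty-cluster handling, and the toy scalar features are additions made here for reproducibility and are not the data of the paper.

```python
import numpy as np

def kmeans(X, k, init=None, max_iter=100, seed=0):
    """Lloyd-style k-means; returns (labels, centers, sse)."""
    X = np.asarray(X, dtype=np.float64)
    if init is None:
        # Random selection of k distinct feature vectors as initial centers.
        init = np.random.default_rng(seed).choice(len(X), size=k, replace=False)
    centers = X[np.asarray(init)]

    def sq_dist(c):
        # Squared Euclidean distance of every vector to every center.
        return ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)

    for _ in range(max_iter):
        labels = sq_dist(centers).argmin(axis=1)
        # Recompute each center as the mean of its members; an empty
        # cluster keeps its previous center (an assumption of this sketch).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    d2 = sq_dist(centers)
    labels = d2.argmin(axis=1)
    sse = d2[np.arange(len(X)), labels].sum()  # square error criterion
    return labels, centers, sse

# Toy scalar features (hypothetical), with fixed initial centers.
X = np.array([[10.], [11.], [12.], [50.], [51.], [52.], [90.], [91.], [92.]])
labels, centers, sse = kmeans(X, k=3, init=[0, 3, 6])
print(labels.tolist(), centers.ravel().tolist(), sse)
```

Because the result depends on the initial centers, a random start can end in a poorer local minimum; rerunning with several seeds and keeping the run with the smallest `sse` is a common safeguard.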

Fig 8. Clusters of images. (a) Cluster K1: Images 3 and 5. (b) Cluster K2: Images 1, 2 and 4. (c) Cluster K3: Images 6 to 20.

  5. CONCLUSION AND FUTURE WORK

In this paper, we introduced a new approach for classifying an image feature vector set using an unsupervised clustering technique. The proposed approach aims to partition the trained image feature vector set into highly related clusters. It consists of two stages, viz. (1) the image preprocessing stage and (2) the classification stage. The image preprocessing stage builds a size-limited trained image feature vector set from the set of grayscale images through feature extraction and feature selection. In the feature extraction stage, three spatial statistical operators (mean, standard deviation and variance) are applied to each individual block of the digital image, yielding three features per block. In the feature selection stage, the size of the image feature vector set is limited.

In the classification stage, the trained image feature vector set is partitioned into k highly related clusters through the k-means technique. We tested the proposed system on 20 sample medical images, which were partitioned into three highly related clusters (classes) with high accuracy. According to the experimental results, the proposed system is well suited to partitioning images into highly related classes for image classification. In future work, we will extend the system to determine which of the existing classes or clusters a new sample image belongs to.

REFERENCES

  1. Rajput G.G., Patil and Preethi, "Detection and classification of exudates using k-means clustering in color retinal images", Signal and Image Processing (ICSIP), 2014 Fifth International Conference, pp. 126-130, 2014.

  2. Venkatalakshmi K., Dept. of I.T., "Classification of multispectral images using vector machines based on PSO and K-means clustering", Intelligent Sensing and Information Processing, Proceedings of 2005 International Conference, 4-7 Jan 2005, pp. 127-133, 2005.

  3. Turker Ince, "Unsupervised classification of polarimetric SAR image with dynamic clustering: An image processing approach", Advances in Engineering Software, Science Direct, Volume 41, Issue 4, pp. 636-646, 2009.

  4. Gianluigi Ciocca, Claudio Cusano, Simone Santini, Raimondo Schettini, "On the use of supervised features for unsupervised image categorization: An evaluation", Computer Vision and Image Understanding, Volume 122, pp. 155-171 (science digest), 2013.

  5. Sanghoon Lee, Kyunggi-Do, Crawford M.M., "Unsupervised multistage image classification using hierarchical clustering with a bayesian similarity measure", Image Processing, IEEE Transactions, Volume 14, Issue 3, pp. 312-320, 2005.

  6. Genitha C.H., Vani K., "Classification of satellite images using new Fuzzy cluster centroid for unsupervised classification algorithm", Information & Communication Technologies (ICT), 2013 IEEE Conference on, 11-12 April 2013, pp. 203-207, 2013.

  7. Frank Y. Shih, Gwotsong P. Chen, "Classification of landsat remote sensing images by a fuzzy unsupervised clustering algorithm", Information Sciences - Applications, Volume 1, Issue 2, pp. 97-116, 2013.

  8. Beaulieu J.M., Touzi R., "Mean-shift and hierarchical clustering for textured polarimetric SAR image segmentation/classification", Geoscience and Remote Sensing Symposium (IGARSS), 2010 IEEE International, pp. 2519-2522, 2010.

  9. Kato Z., Zerubia J., Berthod M., "Unsupervised parallel image classification using a hierarchical Markovian model", Computer Vision, Proceedings, Fifth International Conference on, 20-23 Jun 1995, pp. 169-174, 1995.

  10. Chardin A., Perez P., "Unsupervised image classification with a hierarchical EM algorithm", Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on image processing, Volume 2, pp. 969-974, 1999.

  11. http://www.google.co.in/imgres?imgrefurl=http%3A%2F%2Fpaulbourke.net%2Fmiscellaneous%2Fcortex%2F&tbnid=PTPNHg1rb9OTlM:&docid=iN4bz-VS7JKb0M&h=388&w=600.
