A Robust Image Representation and Learning Method using SVM for Image Retrieval

DOI : 10.17577/IJERTV6IS030204


Shoaib Masroor .TS, Prof. Arvind Kumar Sharma

Department of Computer Science, OPJS University, Rajasthan, India.

Abstract: Multimedia applications are growing rapidly, and large numbers of digital images are stored in databases. Effective retrieval of desired images from a huge image database has made content-based image retrieval (CBIR) an important research issue. In this work, image retrieval is performed through color and texture feature extraction. The color features are the color auto-correlogram, the HSV (Hue Saturation Value) histogram and color moments. The texture features are the mean square energy, the mean amplitude of the 2D wavelet components and the standard deviation of the wavelet coefficients. Features of the query image and the database images are classified using a support vector machine and compared using similarity measures. The comparison is based on the pairwise distance between the query image and each database image, computed with measures such as Euclidean, L1, cityblock, minkowski, chebychev, cosine, correlation and spearman.

Keywords: CBIR; Active Learning; HSV; Similarity Matching; Feature Extraction

Fig.1 Image retrieval approaches: text-based method and content-based method

  1. INTRODUCTION

    An image retrieval system allows a user to search, browse and retrieve images. When images are retrieved from a large database based on their contents, the approach is called content-based image retrieval (CBIR). Nowadays, a large number of digital images are generated and transmitted over the Internet. A keyword-driven web search, such as Google image search, often fails to return the image the user actually wants, because it matches images by keywords and surrounding text rather than by visual content [3].

    For image retrieval, we can use local and global features such as a point, an edge, a small image patch, color, texture or shape [2]. CBIR systems work with all types of image formats, and the search is based on an intensive comparison of features with the query image. Color is one of the most widely used visual features in image retrieval [10]. CBIR has applications in the military, education, biomedicine, and web image classification and searching. The three methods of image retrieval are the text-based method, the content-based method and the hybrid method.

  1. PROPOSED WORK

    1. Block diagram of proposed work

      In this paper, a content-based image retrieval system is implemented. The block diagram of the proposed CBIR system is shown in Fig. 2.

        Fig.2 Block diagram of proposed scheme

        Database Images: A large number of digital images are stored, from which the desired image or images are retrieved on the basis of a given query image. These images can be stored on the hard disk or in a database.

        Query Image: The image supplied by the user, for which matching images are to be retrieved from the database.

        CBIR consists of two phases: feature extraction and feature matching.

        1. Feature Extraction

          This phase involves extracting the features of the query image and of the database images. In the proposed work, color-based and texture-based features are extracted. Color-based feature extraction uses three algorithms: the color auto-correlogram, the HSV (Hue Saturation Value) histogram and color moments. The texture-based features are Gabor features such as the 'mean square energy' and the mean amplitude of the 2D wavelet components, together with the mean and standard deviation of the wavelet coefficients. This phase yields two feature vectors, one for the query image and one for each database image.

        2. Feature Matching

        In this phase, the features of each image stored in the database are compared with the features of the query image. An SVM (Support Vector Machine) is employed to classify the images based on their features. In addition, the similarity between two images is measured by a pairwise distance, using measures such as Euclidean, cityblock, minkowski, chebychev, cosine, correlation and spearman. If the distance between the feature vector of a database image and that of the query image is small, that database image is considered a match for the query image. The matched images are then ranked according to a similarity index. Finally, the images with the highest similarity are retrieved.
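        The matching and ranking step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names, the image names and the two-element feature vectors are hypothetical, and the L1 distance is used only because it is the method for which results are reported.

```python
def l1_distance(u, v):
    """City block (L1) distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def rank_images(query_features, database, k, distance=l1_distance):
    """Return the k database entries closest to the query, nearest first."""
    scored = [(distance(query_features, feats), name)
              for name, feats in database.items()]
    scored.sort()  # a smaller distance means a higher similarity
    return [(name, dist) for dist, name in scored[:k]]


# Hypothetical 2-element feature vectors; real vectors would hold the
# color and texture features produced by the feature extraction phase.
database = {
    "bus_1": [1.0, 0.0],
    "dino_1": [0.0, 5.0],
    "bus_2": [1.5, 0.5],
}
top_matches = rank_images([1.0, 0.2], database, k=2)
print(top_matches)
```

        Passing a different function as `distance` swaps the comparison method without changing the ranking logic.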

    2. Support Vector Machine

      In machine learning, support vector machines (SVMs) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. SVM classification relies on features that are related to the known classes; obtaining these features is known as feature extraction. The classifier helps the image retrieval system retrieve more relevant images from the database [8].
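      To make the idea of a binary linear classifier concrete, the sketch below trains a linear SVM on a toy two-class problem. It is an illustration only: it uses batch subgradient descent on the regularized hinge loss as a simple stand-in for whatever SVM solver the MATLAB implementation relies on, and the function names, the hyperparameters (`lam`, `lr`, `epochs`) and the sample points are all assumptions.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=300):
    """Fit w and b by batch subgradient descent on the regularized hinge loss.

    Labels must be +1 or -1. lam, lr and epochs are illustrative choices.
    """
    n, dim = len(samples), len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of (lam / 2) * ||w||^2
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # the sample violates the margin, so its hinge loss is active
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Assign x to class +1 or -1 from the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, linearly separable data standing in for image feature vectors.
samples = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3), (3, 4)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(samples, labels)
print([predict(w, b, x) for x in samples])
```

      The prediction is a hard class label rather than a probability, which is what "non-probabilistic" refers to.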

      Confusion Matrix: A confusion matrix contains information about the actual and predicted classifications made by a classifier. The performance of such a system is evaluated from the data in the matrix. Fig. 3 shows the confusion matrix for a two-class classifier [8].

      Fig.3 Confusion Matrix

      • a is the number of correct predictions that an instance is negative,

      • b is the number of incorrect predictions that an instance is positive,

      • c is the number of incorrect predictions that an instance is negative, and

      • d is the number of correct predictions that an instance is positive.
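      The four counts and the accuracy derived from them can be computed as in the sketch below. It assumes the usual convention for a two-class confusion matrix, namely that a and d are the correct predictions and that accuracy is (a + d) / (a + b + c + d); the function names and the 0/1 label encoding are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count a, b, c, d for binary labels, where 1 is positive and 0 is negative."""
    a = b = c = d = 0
    for truth, guess in zip(actual, predicted):
        if truth == 0 and guess == 0:
            a += 1  # negative instance correctly predicted negative
        elif truth == 0 and guess == 1:
            b += 1  # negative instance incorrectly predicted positive
        elif truth == 1 and guess == 0:
            c += 1  # positive instance incorrectly predicted negative
        else:
            d += 1  # positive instance correctly predicted positive
    return a, b, c, d


def accuracy(a, b, c, d):
    """Fraction of all predictions that were correct."""
    return (a + d) / (a + b + c + d)


actual = [0, 0, 1, 1, 1]
predicted = [0, 1, 1, 0, 1]
counts = confusion_counts(actual, predicted)
print(counts, accuracy(*counts))
```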

    3. Algorithm

    1. Select the folder containing the database images used for training.

    2. Extract the features of all images stored in the database.

    3. Load these database features into the workspace.

    4. Select a query image.

    5. Resize the image to m-by-n resolution.

    6. For color feature extraction, convert the image from RGB to HSV. Quantize the hue, saturation and value planes into 8, 2 and 2 discrete levels respectively to get a feature vector of size 1-by-32.

    7. Compute the color auto-correlogram to capture the relation between color and texture in the image.

    8. Calculate the mean and standard deviation of the R, G and B planes to obtain the color moments, and form their feature vector.

    9. Calculate the Gabor features in the form of mean-squared energy and mean amplitude for each image scale and orientation.

    10. Form the feature vector from all of the features extracted above.

    11. Compare the features of each image stored in the feature database with those of the query image, based on pairwise distance computed with measures such as Euclidean, L1, city block, minkowski, chebychev, cosine, correlation and spearman.

    12. Display in the output window the images that most closely match the query image, up to the count selected by the user.

    13. Apply the SVM classifier to classify the features, so that the closest matches to the query image are obtained with higher accuracy.

    14. Go to step 4.
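    Steps 6 and 8 of the algorithm can be sketched as follows. This is an illustration under stated assumptions rather than the authors' MATLAB code: an image is represented as a non-empty list of (R, G, B) tuples in the range 0 to 255, the histogram is normalized to sum to 1, the standard deviation is the population form, and the function names are hypothetical.

```python
import colorsys
import math


def hsv_histogram(pixels, bins=(8, 2, 2)):
    """Quantize H, S, V into 8, 2 and 2 levels and return a 32-bin histogram."""
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        # colorsys expects and returns values in the range [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = len(pixels)
    return [count / total for count in hist]  # normalization is an assumption


def color_moments(pixels):
    """Return [mean_R, std_R, mean_G, std_G, mean_B, std_B]."""
    n = len(pixels)
    moments = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        moments.extend([mean, std])
    return moments


red_patch = [(255, 0, 0)] * 4
hist = hsv_histogram(red_patch)
moments = color_moments([(0, 0, 0), (255, 255, 255)])
print(len(hist), moments)
```

    Concatenating the histogram and the moments gives part of the combined feature vector of step 10; the auto-correlogram and Gabor features are not covered by this sketch.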

  2. STATISTICAL CONSIDERATIONS

    In the proposed approach, various statistical measures are considered for similarity matching between the query image and the database images. The following pairwise distance measures calculate the distance between two images. Given an mx-by-n data matrix X, which is treated as mx (1-by-n) row vectors x1, x2, …, xmx, and an my-by-n data matrix Y, which is treated as my (1-by-n) row vectors y1, y2, …, ymy, the various distances between the vectors xs and yt are defined as follows:

    1. Euclidean distance

      $$d_{st}^{2} = (x_s - y_t)(x_s - y_t)'$$ (1)

      Euclidean distance is a special case of the Minkowski metric, where p=2.

    2. Standardized Euclidean distance

      $$d_{st}^{2} = (x_s - y_t)\,V^{-1}\,(x_s - y_t)'$$ (2)

      where V is the n-by-n diagonal matrix whose jth diagonal element is $S(j)^{2}$, and S is the vector of standard deviations.

    3. Mahalanobis distance

      $$d_{st}^{2} = (x_s - y_t)\,C^{-1}\,(x_s - y_t)'$$ (3)

      where C is the covariance matrix.

    4. City block metric

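      $$d_{st} = \sum_{j=1}^{n} \left| x_{sj} - y_{tj} \right|$$ (4)

      where $x_{sj}$ and $y_{tj}$ are the jth elements of $x_s$ and $y_t$.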

      City block distance is a special case of the Minkowski metric, where p = 1.

    5. Minkowski metric

      $$d_{st} = \left( \sum_{j=1}^{n} \left| x_{sj} - y_{tj} \right|^{p} \right)^{1/p}$$ (5)

      For the special case of p = 1, the Minkowski metric gives the city block metric; for p = 2, it gives the Euclidean distance; and for p = ∞, it gives the Chebychev distance.

    6. Chebychev distance

      $$d_{st} = \max_{j} \left| x_{sj} - y_{tj} \right|$$ (6)

      Chebychev distance is a special case of the Minkowski metric, where p = ∞.

    7. Cosine distance

      $$d_{st} = 1 - \frac{x_s\,y_t'}{\sqrt{(x_s\,x_s')(y_t\,y_t')}}$$ (7)

    8. Correlation distance

      The correlation distance is one minus the sample correlation between $x_s$ and $y_t$; its definition appears as equation (8).

  3. RESULTS AND ANALYSIS

    The similarity matching and feature extraction algorithms are coded in MATLAB on an Intel(R) Core(TM) i5 CPU with 4 GB RAM running Windows 7.

      The database consists of 500 images, manually selected to form 5 categories of 100 images each. The 5 categories are Africa, Beach, Monuments, Buses and Dinosaurs.

      1. GUI of the Image Retrieval System

        Figure 5 shows the GUI of the image retrieval system. First the database is trained, and then images are retrieved by applying any one of the comparison methods. Here, the GUI shows the images retrieved with the L1 method of comparison. It also shows the expected and the actual retrieved images.

      2. Performance Evaluation

        To evaluate the performance of each feature extraction algorithm, the following steps are carried out:

        1. Select five images from each category as query images.

        2. Retrieve the 10 images with the highest similarity.

        3. Compute the confusion matrix, from which the accuracy is obtained.
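        One simple way to summarize the retrieval part of this evaluation is to count how many of the 10 retrieved images belong to the category of the query, and to average that fraction over the five queries of a category. The sketch below does this; the function names and the example retrieval lists are hypothetical and are not results from the paper.

```python
def precision_at_k(query_category, retrieved_categories, k=10):
    """Fraction of the top-k retrieved images that share the query's category."""
    top = retrieved_categories[:k]
    if not top:
        return 0.0
    return sum(1 for c in top if c == query_category) / len(top)


def mean_precision(query_category, retrievals, k=10):
    """Average precision_at_k over several queries from the same category."""
    scores = [precision_at_k(query_category, r, k) for r in retrievals]
    return sum(scores) / len(scores)


# Made-up retrieval lists for five "Buses" queries, for illustration only.
retrievals = [
    ["Buses"] * 10,
    ["Buses"] * 9 + ["Beach"],
    ["Buses"] * 10,
    ["Buses"] * 8 + ["Africa", "Monuments"],
    ["Buses"] * 10,
]
print(mean_precision("Buses", retrievals))
```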

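    Correlation distance:

    $$d_{st} = 1 - \frac{(x_s - \bar{x}_s)(y_t - \bar{y}_t)'}{\sqrt{(x_s - \bar{x}_s)(x_s - \bar{x}_s)'}\,\sqrt{(y_t - \bar{y}_t)(y_t - \bar{y}_t)'}}$$ (8)

    where $\bar{x}_s = \frac{1}{n}\sum_{j} x_{sj}$ and $\bar{y}_t = \frac{1}{n}\sum_{j} y_{tj}$.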
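    The pairwise distances of the Statistical Considerations section can be sketched in plain Python as below. This is an illustration of the standard definitions, not the MATLAB code used for the experiments; the function names are hypothetical, the vectors are assumed to be equal-length lists of numbers, and the standardized Euclidean, Mahalanobis and Spearman measures are left out for brevity.

```python
import math


def euclidean(x, y):
    """Square root of the summed squared differences (Minkowski with p = 2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cityblock(x, y):
    """Sum of absolute differences (Minkowski with p = 1)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def minkowski(x, y, p):
    """General Minkowski metric of order p."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def chebychev(x, y):
    """Largest absolute coordinate difference (limit of Minkowski as p grows)."""
    return max(abs(a - b) for a, b in zip(x, y))


def cosine(x, y):
    """One minus the cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return 1.0 - dot / norms


def correlation(x, y):
    """One minus the sample correlation between x and y."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


u, v = [0.0, 0.0], [3.0, 4.0]
print(euclidean(u, v), cityblock(u, v), chebychev(u, v))
```

    Writing the correlation distance as the cosine distance of the mean-centered vectors mirrors the relationship between the two definitions.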

    Fig.4 SVM Result for bus image

    Fig.5 Retrieval of monument images based on L1 method

    Fig.6 Retrieval of bus images based on L1 method


    Table 1. Comparative study of different techniques

    Comparison Technique    Retrieved images out of 10
    Spearman                10
    L1                      10
    Cityblock               10
    Minkowski               10
    Chebychev               10

    Fig.7 Accuracy graph based on L1 method (vertical axis: accuracy)

  4. CONCLUSION

In this research, we presented a content-based image retrieval system based on feature extraction. We use the Gabor filter, which is a powerful texture extraction technique for describing either the content of image regions or the global content of an image. Various color-based and texture-based features are used for feature extraction, and an SVM (Support Vector Machine) is employed for classification of the images. From the confusion matrix of the SVM we obtain the accuracy of detecting the query image from the given database. Statistical measures are used for comparison between the images. The results show that the expected retrieved images and the actual retrieved images are the same for the different methods, such as L1, cityblock, minkowski, chebychev, cosine, correlation and spearman. In this paper, we have shown results for the L1 method. The accuracy graph shows the accuracy for different images based on the L1 method of comparison.

REFERENCES

  1. Aly S. Abdelrahim, Mostafa A. Abdelrahman, Ali Mahmoud and Aly A. Farag, Image Retrieval Based on Content and Image Compression, Department of Electrical and Computer Engineering, University of Louisville, Louisville, KY, 40292, USA, 978-1-61284-774-0/11, © 2011 IEEE.

  2. Aman Chadha, Sushmit Mallik & Ravdeep Johar, Comparative Study and Optimization of Feature-Extraction Techniques for Content based Image, International Journal of Computer Applications (0975-8887), Volume 52, No. 20, August 2012.

  3. Hany Fathy Atlam, Gamal Attiya, Nawal El-Fishawy, Comparative Study on CBIR based on Color Feature, International Journal of Computer Applications (0975-8887), Volume 78, No. 16, September 2013.

  4. Ashok Kumar & J. Esther, Comparative Study on CBIR based by Color Histogram, Gabor and Wavelet Transform, International Journal of Computer Applications (0975-8887), Volume 17, No. 3, March 2011.

  5. Chesti Altaff Hussain, Dr. D. Venkata Rao, T. Praveen, Color Histogram Based Image Retrieval, International Journal of Advanced Engineering Technology, E-ISSN 0976-3945, Int J Adv Engg Tech/IV/III/July-Sept., 2013/63.

  6. Chiou-Yann Tsai, Arbee L.P. Chen & Kai Essig, Efficient Image Retrieval Approaches for Different Similarity Requirements, Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 300, R.O.C., 2013.

  7. John Eakins & Margaret Graham, Content-based Image Retrieval, JISC Technology Applications Programme, University of Northumbria at Newcastle, January 1999.

  8. Amit Singla, Meenakshi Garg, CBIR Approach Based On Combined HSV, Auto Correlogram, Color Moments and Gabor Wavelet, International Journal Of Engineering And Computer Science, ISSN: 2319-7242, Volume 3, Issue 10, Page No. 9007-9012, October 2014.

  9. Matei Dobrescu, Manuela Stoian & Cosmin Leoveanu, Multi-Modal CBIR Algorithm based on Latent Semantic Indexing, Fifth International Conference on Internet and Web Applications and Services, 2010.

  10. Gaurav Jaswal, Amit Kaul & Rajan Parmarn, Content based Image Retrieval using Color Space Approaches, International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-2, Issue-1, October 2012.

  11. Arthi K. & Vijayaraghavan, Content Based Image Retrieval Algorithm Using Color Models, International Journal of Advanced Research in Computer and Communication Engineering, Vol. 2, Issue 3, March 2013.
