Sketch4Match – Content-based Image Retrieval System Using Sketches

DOI : 10.17577/IJERTV1IS7524


A. Sravanthi, B. Harishwar Reddy

Department of Computer Science and Engineering, PRRM Engineering College, Hyderabad.



Content-based image retrieval (CBIR) is one of the most popular and fastest-growing research areas in digital image processing.

In conventional image search tools, images are manually annotated with keywords and then retrieved using text-based search methods. The goal of CBIR is instead to extract the visual content of an image, such as color, texture, or shape, automatically. This paper introduces the problems and challenges involved in designing and building CBIR systems that are based on a free-hand sketch (sketch-based image retrieval, SBIR). Comparison with existing methods indicates that the proposed algorithm performs better than the existing algorithms and can handle the informational gap between a sketch and a colored image. Overall, the results show that the sketch-based system gives users intuitive access to search tools.

Index Terms: k-means algorithm, Pruning Top K algorithm, image similarity matching algorithm.


Even before the spread of information technology, huge amounts of data had to be managed, processed and stored, including both textual and visual information. In parallel with the appearance and rapid evolution of computers, an ever-increasing amount of data had to be managed. The growth of data storage and the revolution of the internet have changed the world. The efficiency of searching an information set is therefore very important. For text we can search flexibly using keywords, but for images we cannot apply such dynamic methods. Two questions arise: who supplies the keywords, and can an image be well represented by keywords at all?

In many cases, searching efficiently requires recalling some data. Humans recall visual information more easily, using, for example, the shape of an object or the arrangement of colors and objects. Since humans are visual by nature, we look for images using other images, and follow this approach in categorization as well. In this case we search using certain features of images, and these features act as the keywords. Unfortunately, at present there are no widely used retrieval systems that retrieve images using the non-textual information of a sample image. What can be the reason? One reason may be that text is a human abstraction of the image. Giving unique and identifiable information to a text is not too difficult. With images, the huge amount of data and its management cause the problem. The processing space is enormous.

Our purpose is to develop a content-based image retrieval system that can retrieve images from frequently used databases using sketches. The user has a drawing area in which to draw the sketches that form the basis of the retrieval method.

A sketch-based system can be very important and efficient in many areas of life. In some cases figures or drawings help us recall what we have in mind. The following paragraphs analyze some possible applications.

CBIR systems are of great significance in criminal investigation. The identification of unsubstantial images, tattoos and graffiti can be supported by these systems. Similar applications are implemented in [9], [10], [11].

Another possible application area of sketch-based information retrieval is searching for analog circuit graphs in a large database [7]. The user makes a sketch of the analog circuit, and the system returns similar circuits from the database.

Sketch-based image retrieval (SBIR) was introduced in the QBIC [6] and VisualSEEk [17] systems. In these systems the user draws color sketches and blobs on the drawing area. The images were divided into grids, and the color and texture features were determined in these grids. Grids were also used in other algorithms, for example in the edge histogram descriptor (EHD) method [4]. The disadvantage of these methods is that they are not invariant to rotation, scaling and translation. Lately, the development of complex and robust descriptors has been emphasized. Another research approach is the application of fuzzy logic or neural networks. In these cases the purpose of the investigation is to determine suitable weights for the image features [15].


Earlier approaches to retrieving images from large image databases include the following. Each is discussed briefly below.

  1. Automatic Image Annotation and Retrieval using Cross Media Relevance Models

  2. Concept Based Query Expansion

  3. Query System Bridging The Semantic Gap For Large Image Databases

  4. Ontology-Based Query Expansion Widget for Information Retrieval

  5. Detecting image purpose in World-Wide Web documents

    Automatic Image Annotation and Retrieval using Cross Media Relevance Models

    Libraries have traditionally used manual image annotation for indexing and then later retrieving their image collections. However, manual image annotation is an expensive and labor intensive procedure and hence there has been great interest in coming up with automatic ways to retrieve images based on content [5]. Here, we propose an automatic approach to annotating and retrieving images based on a training set of images.

    We assume that regions in an image can be described using a small vocabulary of blobs. Blobs are generated from image features using clustering. Given a training set of images with annotations, we show that probabilistic models allow us to predict the probability of generating a word given the blobs in an image. This may be used to automatically annotate and retrieve images given a word as a query. We show that relevance models allow us to derive these probabilities in a natural way. Experiments show that the annotation performance of this cross-media relevance model is almost six times as good (in terms of mean precision) as a model based on word-blob co-occurrence and twice as good as a state-of-the-art model derived from machine translation. Our approach shows the usefulness of using formal information retrieval models for the task of image annotation and retrieval.
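As a rough illustration of the probabilistic annotation idea described above, the following Python sketch scores each vocabulary word w for an unannotated image with blobs b1..bm as a sum over training images J of P(J) P(w|J) P(b1|J)...P(bm|J). The data layout (each training image as a pair of word and blob lists), the function name `cmrm_word_scores`, the uniform prior over training images, and the smoothing weights `alpha` and `beta` are assumptions made for illustration and are not taken from the text.

```python
from collections import Counter

def cmrm_word_scores(training, query_blobs, alpha=0.1, beta=0.9):
    """Score every vocabulary word for an image described by query_blobs.

    training: list of (words, blobs) pairs, one per annotated training image.
    alpha, beta: hypothetical smoothing weights (assumed, not from the paper).
    Returns a dict mapping word -> normalised score.
    """
    word_total, blob_total = Counter(), Counter()
    for words, blobs in training:
        word_total.update(words)
        blob_total.update(blobs)
    n_words = sum(word_total.values())
    n_blobs = sum(blob_total.values())
    p_j = 1.0 / len(training)  # assumed uniform prior over training images

    scores = dict.fromkeys(word_total, 0.0)
    for words, blobs in training:
        size = len(words) + len(blobs)
        wc, bc = Counter(words), Counter(blobs)
        # Smoothed probability of the query blobs given this training image.
        p_blobs = 1.0
        for b in query_blobs:
            p_blobs *= (1 - beta) * bc[b] / size + beta * blob_total[b] / n_blobs
        for w in scores:
            # Smoothed probability of the word given this training image.
            p_w = (1 - alpha) * wc[w] / size + alpha * word_total[w] / n_words
            scores[w] += p_j * p_w * p_blobs
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()} if total else scores
```

The highest-scoring words can then serve as the automatic annotation, or the scores can be used to rank images for a one-word query.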

    Concept Based Query Expansion

    Query expansion methods have been studied for a long time, with debatable success in many instances. In this project we present a probabilistic query expansion model based on a similarity thesaurus which was constructed automatically [6]. A similarity thesaurus reflects domain knowledge about the particular collection from which it is constructed. We address the two important issues with query expansion: the selection and the weighting of additional search terms. In contrast to earlier methods, our queries are expanded by adding those terms that are most similar to the concept of the query, rather than selecting terms that are similar to the query terms.

    Our experiments show that this kind of query expansion results in a notable improvement in the retrieval effectiveness when measured using both recall-precision and usefulness.
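The difference between expanding by the query concept and expanding by individual query terms can be sketched as follows. This is a minimal Python illustration, assuming the similarity thesaurus is available as a nested dictionary of term-to-term similarities; the function name `expand_query`, the weighting of added terms, and the toy data are hypothetical.

```python
def expand_query(query_weights, similarity, extra_terms=2):
    """Add the extra_terms terms most similar to the query as a whole.

    query_weights: dict term -> weight in the original query.
    similarity: dict term -> dict term -> similarity (the assumed thesaurus).
    """
    concept_sim = {}
    for q_term, q_weight in query_weights.items():
        for term, sim in similarity.get(q_term, {}).items():
            if term in query_weights:
                continue  # already part of the query
            # Accumulate similarity to the whole query, not to a single term.
            concept_sim[term] = concept_sim.get(term, 0.0) + q_weight * sim
    ranked = sorted(concept_sim.items(), key=lambda kv: (-kv[1], kv[0]))
    norm = sum(query_weights.values()) or 1.0
    expanded = dict(query_weights)
    for term, score in ranked[:extra_terms]:
        expanded[term] = score / norm  # assumed weighting of added terms
    return expanded
```

With a thesaurus in which "light" is strongly related to "speed" alone, while "engine" and "road" are moderately related to both "car" and "speed", the query {"car", "speed"} is expanded with "engine" and "road", because they are closer to the query as a whole.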

    Query System Bridging the Semantic Gap for Large Image Databases

    We propose a novel system called HISA for organizing very large image databases. HISA implements the first known data structure to capture both the ontological knowledge and visual features for effective and efficient retrieval of images by either keywords, image examples, or both. HISA employs an automatic image annotation technique, ontology analysis and statistical analysis of domain knowledge to precompile the data structure [7]. Using these techniques, HISA is able to bridge the gap between the image semantics and the visual features, thereby providing more user-friendly and high-performance queries. We demonstrate the novel data structure employed by HISA, the query algorithms, and the pre-computation process.

    Ontology-Based Query Expansion Widget for Information Retrieval

    In this project we present an ontology-based query expansion widget which utilizes the ontologies published in the ONKI Ontology Service. The widget can be integrated into a web page, e.g. a search system of a museum catalogue, enhancing the page by providing query expansion functionality. We have tested the system with general, domain-specific and spatiotemporal ontologies [8].

    Detecting image purpose in World-Wide Web documents

    The number of World-Wide Web (WWW) documents available to users of the Internet is growing at an incredible rate. Therefore, it is becoming increasingly important to develop systems that aid users in searching, filtering, and retrieving information from the Internet. Currently, only a few prototype systems catalog and index images in Web documents. To greatly improve the cataloging and indexing of images on the Web, we have developed a prototype rule-based system that detects the content images in Web documents [9]. Content images are images that are associated with the main content of Web documents, as opposed to a multitude of other images that exist in Web documents for different purposes, such as decorative, advertisement and logo images. We present a system that uses decision tree learning for automated rule induction for the content image detection system. The system uses visual features, text-related features and the document context of images in concert for fast and effective content image detection in Web documents.

    Content Based Image Retrieval

    Content Based Image Retrieval (CBIR) [10] is an automatic process for finding relevant images based on user input. The input can be parameters, sketches or example images. A typical CBIR process first extracts the image features and stores them efficiently. It then compares the features of the query with those of the images in the database and returns the results.

    Feature extraction and the similarity measure depend heavily on the features used. Each feature can have more than one representation. Among these representations, the histogram is the most commonly used technique to describe features.
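The extract, store and compare steps can be illustrated with a color histogram as the feature and histogram intersection (the measure from the color indexing work cited as [3]) as the similarity. The following Python sketch is an assumption-laden illustration, not the implementation used in this paper: images are given as lists of (r, g, b) tuples, and the names `color_histogram`, `histogram_intersection` and `retrieve` are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Normalised RGB histogram of an iterable of (r, g, b) tuples in 0..255."""
    hist = [0.0] * bins ** 3
    count = 0
    for r, g, b in pixels:
        # Quantise each channel into `bins` levels and combine into one index.
        idx = ((r * bins // 256) * bins + g * bins // 256) * bins + b * bins // 256
        hist[idx] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalised histograms of equal length."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def retrieve(query_pixels, database, top_k=3):
    """Rank database entries (name -> stored histogram) against the query."""
    q = color_histogram(query_pixels)
    scored = [(histogram_intersection(q, hist), name) for name, hist in database.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]
```

The histograms in `database` are computed once at indexing time, so a query only requires one histogram extraction and one comparison per stored image.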

    Fig. 1 Flow of a typical CBIR process

    Fig. 1 describes the flow of a typical CBIR process. Although content-based methods are efficient, they cannot always match the user's expectations. Relevance Feedback (RF) techniques are used to adjust the query according to the user's feedback. RF is an interactive process that improves retrieval accuracy over a few iterations. RF algorithms depend on the feature representation; in this chapter, the RF process and its histogram weighting method will be introduced.
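One common way to realise histogram weighting in relevance feedback, shown here only as an assumed example since the weighting scheme is not specified in the text, is to give more weight to histogram bins that vary little across the images the user marked as relevant. The function names and the `eps` parameter in this Python sketch are hypothetical.

```python
from statistics import pstdev

def feedback_weights(relevant_hists, eps=0.01):
    """Per-bin weights from the histograms of images marked relevant.

    Bins whose values are stable across the relevant images get larger
    weights (assumed inverse-standard-deviation heuristic).
    """
    n_bins = len(relevant_hists[0])
    raw = [1.0 / (pstdev([h[i] for h in relevant_hists]) + eps) for i in range(n_bins)]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_distance(h1, h2, weights):
    """Weighted L1 distance between two histograms; smaller is more similar."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, h1, h2))
```

In each feedback iteration the weights are recomputed from the newly marked images and the database is re-ranked with `weighted_distance`.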

    Algorithm 1: PRUNINGTOPK(s, Qt)
      Srel := Pruning(s, Qt)
      For each relevant leaf node si in Srel Do
        Compute RelDeg(si | Qt)
      Sort the leaf nodes by their RelDeg
      Return the top-K of the sorted nodes


    Fig. 2 The main PruningTopK Algorithm [1]

    If the output of the above Pruning/Pruning Top K algorithm is empty, the subsequent matching algorithm needs to search all the images indexed by the root node. In this case, the query scheme degrades to a sequential scan. Otherwise, we take the leaf nodes that were returned and search their respective ASDs for candidates using the image similarity matching algorithm. After the pruning procedure, the search space is reduced by orders of magnitude. We note that the Pruning Top K algorithm can be used to support progressive processing and display of query results, which is a desirable feature for online CBIR systems.
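The control flow of Algorithm 1 can be sketched in Python as follows. The tree layout, the pruning rule (skip any subtree whose keyword set shares no term with the query) and the relevance degree (Jaccard overlap of keyword sets) are placeholders assumed for illustration; the text does not define `Pruning` or `RelDeg`, so `Node`, `pruning` and `rel_deg` below are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    keywords: frozenset  # assumed: an inner node holds the union of its children's keywords
    children: list = field(default_factory=list)

def pruning(node, query_terms):
    """Collect the leaves under node that share a term with the query."""
    if not node.keywords & query_terms:
        return []  # prune the whole subtree
    if not node.children:
        return [node]
    leaves = []
    for child in node.children:
        leaves.extend(pruning(child, query_terms))
    return leaves

def rel_deg(leaf, query_terms):
    """Hypothetical relevance degree: Jaccard overlap of the keyword sets."""
    return len(leaf.keywords & query_terms) / len(leaf.keywords | query_terms)

def pruning_top_k(root, query_terms, k):
    """Srel := Pruning(s, Qt); rank the leaves by RelDeg; return the top K."""
    s_rel = pruning(root, query_terms)
    ranked = sorted(s_rel, key=lambda leaf: rel_deg(leaf, query_terms), reverse=True)
    return ranked[:k]
```

An empty result corresponds to the fallback case described above, in which the matching step scans every image indexed under the root node.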



    The environment used for the experiments consisted of the Java programming language (JDK 1.6), Eclipse 3.3, and a PC with 2 GB of RAM. The Java Swing API was used to build the graphical user interface.

    Original input image

    Fig. 3 Image Indexing

    Initially the indexing home page is opened. Here the images are indexed from the selected directories and their subdirectories. If no directory is specified, images are taken from Flickr. Then click the Start button.

    Fig. 4 Search for digital images.

    After indexing has completed, go to the Search button. Double-clicking a row within the search results starts a new search for the clicked image. Drag and drop can be used to select images from the file explorer.

    Fig. 5 Sketch images used in the tests.

    Fig. 6 Search results

    Here the system displays the relevant images based on image features and user input.

    Fig. 7. Some sample images of the Microsoft Research Cambridge Object Recognition Image Database.

    The system was tested with more than one sample database to obtain a more extensive description of its positive and negative properties. The Microsoft Research Cambridge Object .

    Fig. 8. Some sample images of Flickr 160 database.

    Fig. 9 shows the performance of the pruning algorithm

    The runtime query performance of the pruning algorithm, and hence the speed at which the output images are produced, depends on the size of the image and of the database.


    The objectives of this paper were to design, implement and test a sketch-based image retrieval system. Two main aspects were taken into account. The retrieval process has to be unconventional and highly interactive. The method must be robust to some degree of noise, which may occur even in simple images. The drawn image cannot be compared with a color image, or with its edge representation, without modification. A distance transform step was therefore introduced. The simple smoothing and edge detection based method was also improved, which was as important as the previous step.

    In the tests, the effectiveness of the EHD and of the dynamically parameterized HOG (histogram of oriented gradients) implementation was compared on several databases. In our experience HOG was much better than EHD-based retrieval in most cases. However, the situation is not so simple. The edge histogram descriptor tends to perform better for information-poor sketches, while in other cases better results can be achieved for more detailed sketches. This is due to the sliding-window approach of HOG. Using the SIFT-based multi-level solution, the search result list is refined. By categorizing the retrieval response, the user is given more room for decision, since he can choose from several groups of results.


    1. Multimedia Content Description Interface Part 3: Visual, ISO/IEC JTC1/SC29/WG11/N4062, 2008.

    2. J. Wang and G. Wiederhold, SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture Libraries, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 8, pp. 1-17, September 2008.

    3. M. Swain and D. Ballard, Color indexing, International Journal of Computer Vision, vol. 7, no. 1, pp. 11-32, 2008.

    4. K. Hirata and T. Kato, "Query by visual example, content based image retrieval," Advances in Database Technology-EDBT'92, vol. 580, pp. 56-71, A. Pirotte, C.Delobel, and G. Gottlob, Eds., 2006, Springer-Verlag.

    5. J. Jeon, V. Lavrenko, and R. Manmatha, Automatic Image Annotation and Retrieval Using Cross-Media Relevance Models, Proc. 26th Ann. Intl ACM SIGIR Conf. Research and Development in Information Retrieval (SIGIR 03), pp. 119-126, 2003.

    6. J. Kalervo, K. Jaana, and N. Timo, ExpansionTool: Concept-Based Query Expansion and Construction, Information Retrieval, vol. 4, no. , pp. 231-255, 2001.

    7. G. Chen, X. Li, L. Shou, J. Dong, and C. Chen, HISA: A Query System Bridging the Semantic Gap for Large Image Databases (Demo), Proc. 32nd Intl Conf. Very Large Data Bases (VLDB 06), pp. 1187-1190, 2006.

    8. X.Y. Li, L.D. Shou, G. Chen, and K.-L. Tan, An Image-Semantic Ontological Framework for Large Image Databases (Poster), Proc. 12th Intl Conf. Database Systems for Advanced Applications (DASFAA 07), pp. 1050-1053, 2007.

    9. J.R. Paek and S. Smith, Detecting Image Purpose in World-Wide Web Documents, Proc. IS&T/SPIE Symp. Electronic Imaging: Science and Technology, Document Recognition, vol. 3305, pp. 151-158, Jan. 1998.

    10. A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, Content-Based Image Retrieval at the End of the Early Years, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, Dec. 2000.


A. Sravanthi received the B.Tech degree in Computer Science and Engineering from Vivekananda Institute of Engineering College (affiliated to JNTU Hyderabad), Karimnagar. She is pursuing an M.Tech in Computer Science and Engineering at PRRM Engineering College (affiliated to JNTU Hyderabad), Hyderabad.

B. Harishwar Reddy is studying for an M.Tech (CSE) at PRRM Engineering College, Hyderabad. He completed his B.Tech (CSE) at Al-Habeeb Engineering College, Hyderabad, in 2009.
