Implementation of Concept Based Image Search Algorithm

DOI : 10.17577/IJERTV2IS90942


Rupali S. Chavan, P.G. Student

S. M. Patil, Assistant Professor

Abstract

Concept-based image search systems are becoming increasingly important with the advancement of broadband networks, high-powered workstations and related infrastructure. Large collections of images are becoming available to the public, from photo collections to web pages and video databases. Concept-based image retrieval searches for a specific image in an image database; it is also called description-based or text-based image retrieval. Images are retrieved through text-based indexing that may employ keywords, subject headings, captions, or natural-language text. This metadata is added either automatically, via automatic indexing, or manually by the user.

In automatic indexing, a manually tagged image is used as the reference for tagging new images: the new image is tagged automatically using image properties such as the colour histogram, edges and shapes, and a search algorithm compares it with the reference image against a threshold value. The user does not need to understand the content of the image in this case. Our concept-based system retrieves an image conceptually: the user needs to understand only the concept of an image, not its content, where content includes shapes, colours, texture and so on.

In our system, manual tagging is required for retrieving images, while automatic indexing is used to index the images in the image database.

Keywords: Metadata, Concept-Based Image Search, Automatic Indexing, Image Database.

  1. Introduction

    Since visual media require large amounts of memory and computing power for processing and storage, there is a need to efficiently index and retrieve visual information from an image database. In our system, an image is retrieved from a large image database with the help of a concept-based approach, where a concept-based [1] system adds metadata to an image, such as captions, keywords or a description.

    In Figure 1.1 we can see that the user enters a text input and expects images of the matching concept. It is not always possible to have an example image similar to the one we want to search for, so it is better for an image retrieval system to take text input from the user.

    Figure 1.1 Expected system output

    Beyond roughly the first 90% of retrieved images, the results that other search engines show mismatch drastically, because they rely on the contents rather than the concept of the images.

    Existing image retrieval and indexing are based on features contained in images, such as colour, texture and shape. Edge and histogram features form an index mechanism that describes the content of images. In our system we use tags in the metadata for image retrieval. The indexing mechanism enables users to retrieve the images related to a query formulated from their concept. Automatic indexing is possible for a large database of images by extracting features from their contents.

    Usually used in reference to a computer application, a text-based application is one whose primary input and output are based on text rather than graphics or sound. This does not mean that text-based applications have no graphics or sound, only that graphics and sound are secondary to the text.

    The user provides text to retrieve an image, not the content of an image; with the help of the concept-based image search algorithm the user obtains higher accuracy and higher speed. We need automatic indexing for images from a large database, and automatic indexing is based entirely on the content of images.

    By contrast, a concept-based search cannot examine image content, so it must rely on metadata such as captions or keywords. Such metadata [4] must be generated by a human and stored alongside each image in the database; for this reason, concept-based image retrieval is rarely implemented on its own. In content-based image retrieval, various techniques are used for the extraction and representation of image features, such as histograms, local (region or sub-image) or global thresholds, colour layouts, gradients, edges, contours, boundaries and regions, textures and shapes. Concept-based image retrieval aims to find a description that best describes the images in one class and distinguishes these images from all other classes. For efficient indexing and retrieval of large numbers of colour images, classification plays an important and challenging role.

  2. Methodology

    1. Semantic Concept

      Concept-based image retrieval, also known as semantic-based image retrieval, narrows the semantic gap by incorporating machine learning techniques to establish a correlation between the extracted features and a set of high-level semantic categories. This process is also known as automatic image annotation (AIA): the system learns a set of semantic categories so that it can annotate new input instances with one or more of the learned categories. After the annotation process, users can query image objects using textual queries or keywords [1].

      Figure 2.1 Semantic Concept

      Figure 2.1 shows that the semantic concept [6] is divided into two groups, grouping and entity, which are in turn divided into different categories. The automatic image annotation system learns this set of semantic categories.

      Concept-based image retrieval, also known as text-based image retrieval, narrows the semantic gap by incorporating machine learning techniques such as automatic indexing to establish a correlation between the automatically indexed images and the user-tagged images.
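      As a concrete illustration of the annotation step described above, the sketch below trains a small supervised classifier that maps already-extracted feature vectors to semantic categories and then labels an unseen image. It is only a sketch under stated assumptions: the toy feature values, the category names and the choice of a k-nearest-neighbour classifier are illustrative and not taken from the paper.

```python
# Minimal sketch of automatic image annotation (AIA) via supervised learning.
# Assumption: feature vectors have already been extracted (e.g. colour histograms)
# and each training image carries a human-assigned semantic category.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def train_annotator(train_features, train_labels, k=3):
    """Learn a mapping from low-level features to semantic categories."""
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(np.asarray(train_features), train_labels)
    return clf

def annotate(clf, new_features):
    """Assign a learned semantic category to each unseen image."""
    return clf.predict(np.asarray(new_features))

# Hypothetical usage: 4-dimensional toy features, two categories.
features = [[0.9, 0.1, 0.0, 0.0], [0.8, 0.2, 0.0, 0.0],
            [0.1, 0.1, 0.7, 0.1], [0.0, 0.2, 0.7, 0.1]]
labels = ["sky", "sky", "grass", "grass"]
model = train_annotator(features, labels, k=1)
print(annotate(model, [[0.85, 0.15, 0.0, 0.0]]))  # -> ['sky']
```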

    2. Image and Metadata

      An image retrieval system is a computer system for browsing, searching and retrieving images from a large digital database. The most traditional and common methods of image retrieval add metadata such as captions, keywords or descriptions to the images, so that retrieval can be performed over the annotation words.

      Figure 2.2 The Image and its Metadata

      A visual image is a data structure characterised by its possession of certain physical attributes or primitive features [2], including size, colour, texture and shape.

      Metadata is required for concept-based image retrieval. Without the ability to examine image content, searches must rely on metadata such as captions or keywords. Such metadata must be generated by a human and stored alongside each image in the database. Tagging is initially done by hand, based on human concepts; further automatic tagging then happens by analysing other images, based on the contents of tagged and untagged images.
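      The sketch below shows one possible way to keep such human-generated metadata alongside an image, here as a JSON sidecar file; the field names and the sidecar convention are assumptions made for illustration, not a format defined by the paper.

```python
# Illustrative only: one way to store human-generated metadata "alongside each
# image", using a JSON sidecar file next to the image file.
import json

record = {
    "file": "img_017.jpg",
    "tags": ["car", "red", "road"],                 # human-assigned concepts
    "caption": "A red car parked beside the road.",
    "source": "manual",                             # becomes "automatic" when propagated
}

with open("img_017.json", "w") as fh:
    json.dump(record, fh, indent=2)
```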

    3. Concept Based Methodology

      Figure 2.3 The concept based image retrieval model

      In the predominant paradigm of visual information retrieval, transactions are conducted with respect to the textual annotations within the metadata of an image collection. The process, usually known as concept-based image retrieval and illustrated in Fig. 2.3, involves a verbal expression of the query, possibly mediated by a thesaurus or classification scheme in order to couch the query in terms of a controlled, or authorised, vocabulary. The (modified) expression is then matched against the textual annotation associated with each image. Any matching expression (or one which matches closely enough to satisfy some similarity threshold) results in the recovery of its associated image, which is then presented to the client for consideration. However, capturing in words the content or meaning of an image is a significant intellectual challenge: semantic analysis of an image typically identifies more than one layer of meaning. While it may be appropriate for the indexing methodology to conceptualise the images as a set within an image database, there is the added need to represent their semantic continuity. Concept-based image retrieval [5] is applied to digital images.
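      A minimal sketch of this matching step is given below, assuming a toy controlled vocabulary, free-text annotations and a similarity threshold computed with Python's difflib; none of these specific choices comes from the cited model.

```python
# Illustrative sketch: a (vocabulary-normalised) text query is matched against
# per-image annotations with a similarity threshold, as described above.
# The synonym table, threshold value and annotations are assumptions.
from difflib import SequenceMatcher

CONTROLLED_VOCABULARY = {"automobile": "car", "doorway": "door"}  # toy thesaurus

def normalise(query):
    """Map query words onto the controlled vocabulary where possible."""
    return " ".join(CONTROLLED_VOCABULARY.get(w, w) for w in query.lower().split())

def retrieve(query, annotations, threshold=0.8):
    """Return image ids whose annotation is sufficiently similar to the query."""
    q = normalise(query)
    hits = []
    for image_id, text in annotations.items():
        # best token-level similarity between the query and the annotation words
        score = max(SequenceMatcher(None, q, w).ratio() for w in text.lower().split())
        if score >= threshold:
            hits.append((image_id, score))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

annotations = {"img_001.jpg": "red car on a road", "img_002.jpg": "wooden door"}
print(retrieve("automobile", annotations))  # -> [('img_001.jpg', 1.0)]
```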

    4. Content Based Methodology

      Figure 2.4 The content based image retrieval model

      The CBIR matching process, which is represented in Fig.2.4, is conducted on those image attributes of colour, texture and shape; the latter elaborated by spatial (or spatio-temporal) distribution, which are amenable to quantification and, thereby, automatic indexing. Since this process is conducted on unstructured arrays of pixel intensities, in contrast to the logically structured data (ASCII character strings) which populate text databases, CBIR at this level is said to have no parallel in text-based information retrieval.

      Automatic indexing is also possible on the basis of the texture attribute. Most people recognise texture when they see it in an image, though the concept is difficult or even impossible to define precisely.

      Texture can be described as an innate property of virtually all surfaces, identified as visual patterns having properties of homogeneity that do not result from the presence of only a single colour or intensity. In the context of visual image retrieval, emphasis has been placed on computational approximations to a number of visually meaningful texture properties, among which coarseness, contrast and directionality have been shown in psychophysical studies to be of particular significance to the human visual system. Typically, these three texture features are computed from a local neighbourhood analysis of each of an image's pixels. One of the most potentially valuable approaches to automatic image retrieval by primitive feature involves shape analysis; shape is generally defined in terms either of boundaries or of regions. Various methods are used for automatic indexing, based on shape, colour or texture. In our system, the colour histogram and shape detection are used for automatic indexing.
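      One simple local-neighbourhood texture measure, local contrast, can be sketched as below, in the spirit of the coarseness/contrast/directionality features mentioned above. The window size and the use of local standard deviation are assumptions made for illustration; they are not the specific measures used by the cited systems.

```python
# Per-pixel local contrast: standard deviation of grey levels in a sliding window.
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast(gray, window=9):
    """Standard deviation of intensities in a window x window neighbourhood."""
    gray = gray.astype(np.float64)
    mean = uniform_filter(gray, size=window)
    mean_sq = uniform_filter(gray ** 2, size=window)
    variance = np.clip(mean_sq - mean ** 2, 0.0, None)  # guard tiny negative values
    return np.sqrt(variance)

# Hypothetical usage on a random grayscale image.
image = np.random.randint(0, 256, size=(64, 64))
contrast_map = local_contrast(image)
print(contrast_map.mean())  # one scalar texture descriptor for the whole image
```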

    5. Automatic indexing based on Colour histogram

      Figure 2.5 Colour Image and its Colour Histogram

      A colour histogram [3] is a representation of the distribution of colours in an image. For digital images, a colour histogram represents the number of pixels that have colours in each of a fixed list of colour ranges that span the image's colour space, the set of all possible colours.

      An explanation for this is that, after quantisation into bins, no information about the colour space is used by the classifier. The number of bins per colour component has been fixed at 16. Some experiments with a smaller number of bins were undertaken, but the best results were reached with 16 bins; we have not tried to increase this number because it would be computationally too intensive. It is preferable to compute the histogram from the highest spatial resolution available: sub-sampling the image too much results in significant losses in performance. This may be explained by the fact that, through sub-sampling, the histogram loses its sharp peaks as pixel colours turn into averages (aliasing).

      Properties of Colour Histogram

      - Invariant to translation and rotation.

      - Changes slowly under change of angle of view, change in scale and occlusion.

      - Depends on lighting conditions.
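      The following sketch computes a colour histogram with 16 bins per channel, as fixed above, from the full-resolution image. The per-channel (rather than joint) form, the use of Pillow for loading and the L1 comparison are assumptions made for illustration.

```python
# Minimal sketch of a 16-bins-per-channel colour histogram and a simple distance.
import numpy as np
from PIL import Image

def colour_histogram(path, bins_per_channel=16):
    """Concatenated per-channel histogram, normalised to sum to 1 per channel."""
    rgb = np.asarray(Image.open(path).convert("RGB"))
    histograms = []
    for channel in range(3):
        counts, _ = np.histogram(rgb[..., channel], bins=bins_per_channel, range=(0, 256))
        histograms.append(counts / counts.sum())
    return np.concatenate(histograms)  # length 3 * bins_per_channel

def histogram_distance(h1, h2):
    """L1 distance between two histograms (smaller means more similar)."""
    return float(np.abs(h1 - h2).sum())
```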

      Concept detectors for patent images have been built by combining visual and textual information using supervised machine learning and image analysis techniques. Visual-based classification can work complementarily to text-classification results, and it can still achieve an acceptable performance in cases where the textual description is not available or is incomplete. For instance, there are many patent documents where the textual descriptions cannot automatically be assigned to the correct figures, or cannot be automatically translated when they are written in certain foreign languages. The image processing approaches require prior segmentation of the images; therefore, either automatic segmentation techniques could be applied, introducing, however, an error of around 20%, or manual segmentation, which is expensive in terms of time and human effort, could be performed. Another requirement of this method is that it needs a training set and a manual selection of concepts, and for each new concept introduced there is a need for images manually annotated by experts. These constraints could be considered significant obstacles to the scalability of the proposed method. To overcome them, the following approaches can be considered: first, the segmentation of drawing pages into figures could be performed automatically, and the segmentation performance could be further improved by considering supervised techniques [7].

  3. Implementation

    Figure 3.1 Automatic Indexing and Concept Based Image Retrieval

    In the concept-based retrieval system we use automatic indexing as well as manual tagging [8]. Feature extraction is based on shape, colour or texture, which is made possible by automatic indexing. An image is retrieved through its text annotation, with text given as the input.

    Figure 3.1 shows automatic indexing as well as concept-based image retrieval. Automatic indexing is performed by the retrieval engine on the basis of feature extraction. Feature extraction is the basis of content-based image retrieval; it involves extracting the image features to a distinguishable extent. For colour-based classification, colour can be represented in numerous ways; the most commonly used colour descriptors are colour moments, colour histograms, the colour coherence vector and the colour correlogram. Image retrieval systems rely on the text annotation of pictures as the basis for indexing and retrieving image data.
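    Of the colour descriptors just listed, colour moments are the simplest to show in code. The sketch below computes the mean, standard deviation and (signed cube-root) skewness of each RGB channel, giving a 9-dimensional descriptor; the exact moment definitions and the use of Pillow are assumptions for illustration, not the paper's specification.

```python
# Sketch of the "colour moments" descriptor: three moments per RGB channel.
import numpy as np
from PIL import Image

def colour_moments(path):
    rgb = np.asarray(Image.open(path).convert("RGB")).astype(np.float64)
    moments = []
    for channel in range(3):
        values = rgb[..., channel].ravel()
        mean = values.mean()
        std = values.std()
        # third central moment, signed cube root so the unit stays comparable
        third = ((values - mean) ** 3).mean()
        skew = np.sign(third) * np.abs(third) ** (1.0 / 3.0)
        moments.extend([mean, std, skew])
    return np.array(moments)  # 9-dimensional descriptor
```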

      1. Flow graph for Automatic Indexing

        Figure 3.2 Algorithm of Automatic Indexing

        Steps:

        1. Load the reference image from the drive.

        2. Retrieve the metadata from the loaded image.

        3. Extract the features of the reference image.

        4. Load a new image from the drive.

        5. Extract the features from the image loaded in step 4.

        6. If the features match, automatically tag the new image with the original image tag extracted in step 2.

        7. Check whether all images have been read; if not, go to step 3 (a code sketch of this loop is given below).
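        A minimal sketch of the indexing loop of Figure 3.2 follows. It assumes tags are held in per-image JSON sidecar files, uses a 16-bin-per-channel colour histogram as the feature, and treats the L1-distance threshold as an illustrative value; none of these choices is prescribed by the paper.

```python
# Sketch of the automatic-indexing loop: propagate the reference tag to new images.
import json
import os
import numpy as np
from PIL import Image

THRESHOLD = 0.5  # maximum L1 histogram distance accepted as a "match" (assumed)

def histogram(path, bins=16):
    rgb = np.asarray(Image.open(path).convert("RGB"))
    counts = [np.histogram(rgb[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(counts).astype(np.float64)
    return h / h.sum()

def read_tags(image_path):
    sidecar = os.path.splitext(image_path)[0] + ".json"
    if os.path.exists(sidecar):
        with open(sidecar) as fh:
            return json.load(fh).get("tags", [])
    return []

def write_tags(image_path, tags):
    sidecar = os.path.splitext(image_path)[0] + ".json"
    with open(sidecar, "w") as fh:
        json.dump({"tags": tags}, fh)

def auto_index(reference_path, new_paths):
    """Steps 1-7: tag every new image whose features match the reference image."""
    reference_tags = read_tags(reference_path)       # steps 1-2
    reference_features = histogram(reference_path)   # step 3
    for path in new_paths:                            # step 7 loop
        features = histogram(path)                    # steps 4-5
        if np.abs(features - reference_features).sum() <= THRESHOLD:
            write_tags(path, read_tags(path) + reference_tags)  # step 6
```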

      2. Software Implementation for Concept Based Image Search

    Figure 3.3 Algorithm of Concept-Based Image Retrieval

    Steps:

    1. Take text input from the user.

    2. Load an image from the database.

    3. Retrieve the metadata of the image read in step 2.

    4. Search for the query in the metadata; if found, display the image at the output.

    5. If all images have been read, stop; otherwise go to step 2 (a code sketch follows below).
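    The retrieval loop of Figure 3.3 can be sketched as below, again assuming JSON sidecar metadata (as in the indexing sketch) and simple case-insensitive substring matching of the query against the stored tags.

```python
# Sketch of concept-based retrieval: match a text query against per-image metadata.
import glob
import json
import os

def load_tags(image_path):
    sidecar = os.path.splitext(image_path)[0] + ".json"
    if not os.path.exists(sidecar):
        return []
    with open(sidecar) as fh:
        return json.load(fh).get("tags", [])

def concept_search(query, database_dir):
    """Steps 1-5: return paths of all images whose metadata contains the query."""
    query = query.lower()                                                 # step 1
    results = []
    for path in sorted(glob.glob(os.path.join(database_dir, "*.jpg"))):  # step 2 loop
        tags = load_tags(path)                                            # step 3
        if any(query in tag.lower() for tag in tags):                     # step 4
            results.append(path)
    return results                                                        # step 5

# Hypothetical usage: print(concept_search("car", "image_database"))
```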

  4. Results

    Number of Letters | Keyword or Letter | No. of Images | Time (s)
    ------------------|-------------------|---------------|-------------
    01                | B                 | 64            | 257.06991
                      | D                 | 62            | 247.844172
                      | Mean              | 63            | 252.457041
    02                | Bo                | 23            | 157.293084
                      | Do                | 35            | 178.19373
                      | Mean              | 29            | 167.743407
    03                | Boo               | 10            | 143.834867
                      | Doo               | 17            | 133.590296
                      | Mean              | 13.5          | 138.7125815
    04                | Book              | 10            | 110.489394
                      | Door              | 17            | 146.07223
                      | Mean              | 13.5          | 128.280812
    03                | Car               | 5             | 137.66742
                      | Red               | 9             | 149.546615
                      | Boo               | 10            | 143.834867
                      | Doo               | 17            | 133.590296
                      | Mean              | 10.25         | 141.1597995
    04                | Book              | 10            | 110.489394
                      | Door              | 17            | 146.07223
                      | Tree              | 38            | 172.577884
                      | Mean              | 21.66666667   | 143.0465027
    Average deviation |                   | 14.24716553   | 30.64040738


    Table 1: Number of letters versus the number of images retrieved and the time consumed (in seconds)

    Figure 4.1 Output of the implemented system for the search word "car"

    When the user enters the word "car" in our implemented algorithm, the figure above shows the output. The top-left image contains only a very small part of a car, while the other three images show the desired result. In all, four images were retrieved from the database of 111 images.

    Graph 1

    Figure 4.2: Relation between the number of letters and the number of images retrieved (x-axis: number of letters, 0-4; y-axis: number of images, 0-70; linear fit y = -26.37x + 86.83, R² = 0.972).

    Graph 2

    Figure 4.3: Relation between the number of letters and the retrieval time in seconds (x-axis: number of letters, 0-4; y-axis: time in sec., 0-300; linear fit y = -55.64x + 298.4, R² = 0.916).
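    For reference, the straight-line fits and R² values shown in Figures 4.2 and 4.3 can be reproduced with an ordinary least-squares fit. The sketch below uses the row means from Table 1 as example input; since the figures may be fitted to the raw per-keyword points rather than these means, the resulting coefficients need not match the reported ones exactly.

```python
# Sketch: least-squares line y = a*x + b and its R^2, as used in Figures 4.2/4.3.
import numpy as np

def linear_fit(x, y):
    """Return slope, intercept and R^2 of a least-squares line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return slope, intercept, 1.0 - ss_res / ss_tot

letters = [1, 2, 3, 4]
mean_time = [252.457041, 167.743407, 138.7125815, 128.280812]  # Table 1 means (s)
print(linear_fit(letters, mean_time))
```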

  5. Conclusion

    Concept-based image retrieval aims to find a description that best describes the images in one class and distinguishes these images from all other classes. For efficient indexing and retrieval of large numbers of colour images, classification plays an important and challenging role.

    From an application point of view, the concept retrieval module could be part of a larger retrieval framework that already includes functionalities such as full-text and semantic search over the large image database.

    When a user retrieves an image from the image database with this system, both the accuracy and the speed of retrieval are improved.

  6. References

  1. Jarrar, R., Belkhatir, M., & Messom, C. H. (2011). On the use of conceptual information in a concept-based image indexing and retrieval framework. 2011 18th IEEE International Conference on Image Processing, 2441-2444.

  2. Singh, S. M., & Hemachandran, K. (2012). Content-based image retrieval using color moment and Gabor texture feature. 9(5), 299-309.

  3. Afifi, A. J., & Ashour, W. M. (2012). Content-based image retrieval using invariant color and texture features. 2012 International Conference on Digital Image Computing Techniques and Applications (DICTA), 1-6. doi:10.1109/DICTA.2012.6411665

  4. Anantharatnasamy, P., Sriskandaraja, K., Nandakumar, V., & Deegalla, S. (2013). Fusion of colour, shape and texture features for content based image retrieval. 2013 8th International Conference on Computer Science & Education (ICCSE), 422-427. doi:10.1109/ICCSE.2013.6553949

  5. Choi, J. H., Park, S. H., & Park, S. J. (2004). Design and implementation of a concept-based image retrieval system with edge description templates. 5307, 571-581.

  6. Jarrar, R., & Belkhatir, M. (2010). Towards automated conceptual shape-based characterization: an application to symbolic image retrieval. Proceedings of the 2010 IEEE 17th International Conference on Image Processing, 2673-2676.

  7. Zhang, L., Wang, L., & Lin, W. (2013). Collaborative image retrieval, 1-14.

  8. Datta, R., Joshi, D., Li, J., & Wang, J. Z. (2008). Image retrieval: Ideas, influences, and trends of the new age. ACM Computing Surveys, 40(2), 1-60. doi:10.1145/1348246.1348248
