A Novel Personalized Image Search Framework

DOI : 10.17577/IJERTV1IS10013




Naeem Naik *, Prof. L. M. R. J. Lobo **, Riyaz Jamadar***

*(Department of Computer Science, WIT, Solapur University, Solapur)

**(Department of Information Technology, WIT, Solapur University, Solapur)

***(Department of Computer Science, AISSMS IOIT, Pune)


The present research aims to reduce image search time for website users by exploiting their personal data. Social media sites such as Flickr and del.icio.us allow users to upload content, annotate it with descriptive labels known as tags, join special-interest groups, and so on. This large-scale user-generated metadata not only helps users share and organize multimedia content but also provides useful information for improving media retrieval and management. Personalized search is one such example: the web search experience is improved by generating the returned list according to the user's modified search intent. User-generated metadata expresses users' tastes and interests and can be used to personalize information for an individual user. Specifically, we describe a machine learning method that analyzes a corpus of tagged content to find hidden topics and then uses these learned topics to select content that matches a user's interests. The approach was empirically validated on the social photo-sharing site Flickr, which allows users to annotate images with freely selected tags and to search for images labeled with a particular tag. The metadata of images tagged with an ambiguous query term is used to identify topics corresponding to different senses of the term, and image search results are then personalized by displaying only those images that are of interest to the user.

  1. Introduction

    The rise of the Social Web underscores a fundamental transformation of the Web. Rather than simply searching for, and passively consuming, information, users of blogs, wikis and social media sites like delicious, Flickr and digg, are creating, evaluating, and distributing information. In the process of using these sites, users are generating not only content that could be of interest to other users, but also a large quantity of metadata in the form of tags and ratings, which can be used to improve Web search and personalization [1].

    Web personalization refers to the process of customizing the Web experience for an individual user. Online stores use personalization to recommend relevant products to a particular user and to customize that user's shopping experience. Advertising firms use it to target ads to a particular user. Search personalization has also been studied as a way to improve the quality of Web search, either by disambiguating query terms based on the user's browsing history or by eliminating irrelevant documents from search results.

    Personalizing image search is an especially challenging problem because, unlike documents, images generally contain little text that can be used for disambiguating terms. Consider, for example, a user searching for photos of jaguars. Should the system return images of luxury cars or of spotted felines? In this context, personalization can help disambiguate the query keywords used in image search or weed out irrelevant images from the search results. If a user is interested in wildlife, the system will show her images of the predatory cat of South America and not of an automobile.

  2. Literature Review

    Traditionally, personalization techniques fall in one of two categories: collaborative-filtering or profile based. The first, collaborative filtering (Breese, 1998; Schafer, 2007) [2], aggregates opinions of many users to recommend new items to like-minded users. In these systems, users are asked to rate items on a universal scale. The system then analyses ratings from many users to identify those sharing similar opinions about items and recommends new items that these users liked. Netflix uses collaborative filtering to recommend movies to its subscribers.

    Amazon uses a similar technology to display other products that interested users who purchased a given product. Since users are asked to rate items on a universal scale, the questions of how to design the rating system and how to elicit high-quality ratings from users are very important. Despite the early concern that users lack incentives for making recommendations and will therefore be reluctant to make the extra effort, there is new evidence (Golder et al., 2006) [3] that this does not appear to be the case. It appears that, at the very least, users find value in a collaborative rating system as an extension of their memory.
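The collaborative-filtering idea described above can be illustrated with a minimal sketch. It assumes user-based filtering with cosine similarity over co-rated items, which is only one of several common variants; the function names and the toy ratings are hypothetical and are not taken from any of the cited systems.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity of two users' ratings, restricted to co-rated items."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else None

# Toy data (illustrative only): ratings on a universal 1-5 scale.
ratings = {
    "alice": {"film_a": 5, "film_b": 4},
    "bob": {"film_a": 5, "film_b": 4, "film_c": 5},
    "carol": {"film_a": 1, "film_b": 2, "film_c": 1},
}
prediction = predict_rating(ratings, "alice", "film_c")
```

Because alice's ratings agree more closely with bob's than with carol's, bob's opinion of film_c receives the larger weight in the prediction.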

    The second class of personalization systems uses a profile of the user's interests to target items for the user's attention. The profile can be created explicitly by the user or mined from data about the user's behavior. Examples of the latter include data about users' Web browsing and purchasing behavior (Agrawal, 1994) [1]. One problem with this approach is that it is time-consuming for users to keep their explicit profiles current. Another problem is that, while data mining methods have proven effective and commercially successful, in most cases they use proprietary data, which is not easily accessible to researchers.

    Machine learning has played an increasingly important role in personalization. Jin et al. (2006) [4] proposed a probabilistic generative model that describes co-occurrences of users and items of interest. In particular, the model assumes a user generates her topics of interest; the topics then generate documents, and words in those documents, if the user prefers those documents. The author-topic model (Carman et al., 2008) [5] is also used to find latent topics in a collection of documents and to group documents according to topic. If a user prefers one document (or topic), this method can be used to recommend other relevant documents. These models, however, do not carry any information about individual users, their tastes and interests. A recent work in this area addresses this with a mixture model for collaborative filtering that takes into account users' intrinsic preferences about items (Jin, 2006) [4]. In this model, an item rating is generated from both the item type and the user's individual preference for that type. Intuitively, like-minded users provide similar ratings on similar types of items (e.g., movie genres). When predicting the rating of an item for a certain user, the user's previous ratings on other items are used to infer a like-minded group of users, and the common rating of that group is then used in the prediction. This type of model can conceivably be adapted to social metadata and used to personalize the results of image search.

  3. Proposed System

    We propose a novel personalized image search framework that simultaneously considers user and query information. A user's preference for an image under a given query is estimated by how likely he or she is to assign the query-related tags to that image.

    • A ranking-based tensor factorization model, named RMTF, is proposed to predict users' annotations of the images.

    • To better represent the query-tag relationship, we propose building user-specific topics and mapping the queries as well as the users' preferences onto the learned topic spaces.

    • A profile is created for each user.

    • The architecture is implemented as a three-tier architecture for greater accuracy and independence of the layers.
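How a tensor factorization model can score user annotations is sketched below. The paper does not give the RMTF formulation, so this sketch assumes a Tucker-style decomposition with a pairwise logistic ranking loss, a common choice for ranking-based tag prediction. The factor values, the names `tucker_score` and `pairwise_ranking_loss`, and the toy tags are all hypothetical, and no training loop is shown.

```python
import math

def tucker_score(core, user_vec, image_vec, tag_vec):
    """Predicted affinity of one (user, image, tag) triple.

    Assumed form: sum over a, b, c of core[a][b][c] * user[a] * image[b] * tag[c].
    """
    return sum(
        core[a][b][c] * user_vec[a] * image_vec[b] * tag_vec[c]
        for a in range(len(user_vec))
        for b in range(len(image_vec))
        for c in range(len(tag_vec))
    )

def pairwise_ranking_loss(pos_score, neg_score):
    """Logistic loss that is small when the observed tag outscores the unobserved one."""
    return math.log(1.0 + math.exp(-(pos_score - neg_score)))

# Hand-set rank-2 factors (illustrative, not learned from data).
core = [[[1.0 if a == b == c else 0.0 for c in range(2)]
         for b in range(2)] for a in range(2)]
user_vec = [1.0, 0.1]    # a user who mostly tags wildlife
image_vec = [0.9, 0.2]   # an image that mostly looks like wildlife
tag_factors = {"bigcat": [1.0, 0.0], "car": [0.0, 1.0]}

scores = {
    tag: tucker_score(core, user_vec, image_vec, vec)
    for tag, vec in tag_factors.items()
}
ranked_tags = sorted(scores, key=scores.get, reverse=True)
loss = pairwise_ranking_loss(scores["bigcat"], scores["car"])
```

In a ranking-based scheme, the factors would be fitted so that tags a user actually assigned to an image score higher than tags the user did not assign, rather than so that each score matches a 0/1 target.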

  4. Objectives & Scope

    The objectives of the present research are

    1. A Ranking based Multi-correlation Tensor Factorization (RMTF) model is proposed to perform annotation prediction; the predicted annotations are treated as the users' potential annotations for the images.

    2. User-specific Topic Modeling is introduced to map the query relevance and the user preference into the same user-specific topic space. For performance evaluation, two resources derived from users' social activities are employed. Experiments on a large-scale Flickr dataset demonstrate the effectiveness of the proposed method.

    3. The proposed methodology reduces image search time, which in turn benefits the user.

    5. Methodology

      The system is implemented on a three-tier architecture, which distributes the large computational cost across separate tiers. The tiers are:

      Client Tier:

      A local computer on which either a Web browser displays a Web page that can display and manipulate data from a remote data source, or (in non-Web-based applications) a stand-alone compiled front-end application runs.

      Middle or Application Tier:

      A server computer that hosts components which encapsulate an organization's business rules. Middle-tier components can either be Active Server Page scripts executed on Internet Information Server, or (in non-Web-based applications) compiled executables.

      Data Tier:

      A computer hosting a database management system (DBMS), such as a Microsoft SQL Server database. (In a two-tier application, the middle-tier and data source tier are combined.)

      Figure 2- Three Tier Architecture

      Modules Description


      Users may have different intentions for the same query; e.g., a search for apple by a cell phone fan has a completely different meaning from the same search by a fruit specialist. One solution to this problem is personalized search, where user-specific information is used to determine the exact intention behind a query and to re-rank the result list. Given the large and growing importance of search engines, personalized search has the potential to significantly improve the search experience.
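The re-ranking step described above can be sketched in a few lines. The sketch assumes that the query, the user and each image are all represented as weights over the same small set of topics, and that an image's score is the sum over topics of the product of the three weights. This scoring rule, the function names and all numbers are illustrative assumptions, not the formulation used in the proposed system.

```python
def personalized_score(query_topics, user_topics, image_topics):
    """Combine query relevance and user preference in a shared topic space."""
    return sum(
        query_topics[k] * user_topics[k] * image_topics[k]
        for k in range(len(image_topics))
    )

def rerank(images, query_topics, user_topics):
    """Return image names sorted from most to least relevant for this user."""
    return sorted(
        images,
        key=lambda name: personalized_score(query_topics, user_topics, images[name]),
        reverse=True,
    )

# Topic order: [technology, food]. All weights are made up for illustration.
apple_query = [0.5, 0.5]        # the query "apple" is ambiguous
phone_fan = [0.9, 0.1]          # hypothetical user profile
fruit_specialist = [0.1, 0.9]   # hypothetical user profile
images = {
    "iphone_photo": [0.95, 0.05],
    "orchard_photo": [0.05, 0.95],
}

phone_fan_results = rerank(images, apple_query, phone_fan)
fruit_results = rerank(images, apple_query, fruit_specialist)
```

The same query produces opposite orderings for the two users, which is the behavior personalized search aims for.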


          In the research community of personalized search, evaluation is not an easy task, since relevance can only be judged by the searchers themselves. The most widely accepted approach is the user study, where participants are asked to judge the search results. This approach is very costly. In addition, a common problem with user studies is that the results are likely to be biased, because the participants know that they are being tested. Another extensively used approach relies on user query logs or click-through history. However, this requires a large-scale real search log, which is not available to most researchers.

          Social sharing websites provide rich resources that can be exploited for personalized search evaluation. Users' social activities, such as rating, tagging and commenting, indicate a user's interest in and preference for a specific document. Recently, two types of such user feedback have been utilized for personalized search evaluation. The first approach uses social annotations. Its main assumption is that the documents tagged by a user u with a tag t are considered relevant for the personalized query (u, t). The second approach was proposed for personalized image search on Flickr, where the images marked Favorite by the user u are treated as relevant when u issues queries. The two evaluation approaches each have their pros and cons and complement each other.

          We use both in our experiments and list the results in the following.

          1. Topic-based: the user can view images returned by topic-based personalized search.

          2. Preference-based: the user can view images ranked by user-interest-based preference.
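The favorite-based evaluation described above, in which images a user marked Favorite are treated as relevant, can be sketched with two standard ranking metrics. Precision at k and average precision are assumed here as the measures; the text does not name the metrics it uses, and the image identifiers are made up.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that the user marked as relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for image in ranked[:k] if image in relevant) / k

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant image appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, image in enumerate(ranked, start=1):
        if image in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)

# Hypothetical result list for one query issued by user u,
# and the set of images u has marked Favorite.
ranked_results = ["img3", "img1", "img7", "img2"]
favorites = {"img3", "img7"}

p_at_2 = precision_at_k(ranked_results, favorites, 2)
ap = average_precision(ranked_results, favorites)
```

The annotation-based evaluation can reuse the same functions by passing, as the relevant set, the images that user u tagged with the query tag t.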


      Photo-sharing websites differ from other social tagging systems in that they are largely self-tagged: most images are tagged only by their owners. Comparing the number of taggers per item on Flickr and on the webpage tagging system Del.icio.us shows that on Flickr 90% of images have no more than 4 taggers and the average number of taggers per image is about 1.9, whereas the average number of taggers per webpage on Del.icio.us is 6.1. This severe sparsity problem calls for external resources to enable information propagation. In addition to the ternary interrelations, we also collect multiple intra-relations among users, images and tags. We assume that two items with high affinity should be mapped close to each other in the learnt factor subspaces. In the following, we first introduce how to construct the tag affinity graph, and then incorporate it into the tensor factorization framework.

      To serve the ranking-based optimization scheme, we build the tag affinity graph based on the tags' semantic relevance and context relevance. The context relevance of two tags is simply encoded by their weighted co-occurrence in the image collection.
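The context-relevance part of the tag affinity graph can be sketched as follows. The text says only that context relevance is a weighted co-occurrence; the Jaccard-style weighting used here (co-occurrence count divided by the number of images carrying either tag) is an assumption, as are the function name and the toy collection. Semantic relevance is not covered by this sketch.

```python
from collections import Counter
from itertools import combinations

def tag_context_affinity(image_tags):
    """Map each co-occurring tag pair to a co-occurrence weight in (0, 1].

    Assumed weighting: images with both tags / images with either tag.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in image_tags.values():
        unique = sorted(set(tags))      # sort so each pair has one canonical order
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))
    return {
        (a, b): both / (tag_count[a] + tag_count[b] - both)
        for (a, b), both in pair_count.items()
    }

# Toy image collection (illustrative only).
image_tags = {
    "photo1": ["tiger", "bigcat", "zoo"],
    "photo2": ["tiger", "bigcat"],
    "photo3": ["tiger", "car"],
}
affinity = tag_context_affinity(image_tags)
```

Pairs that never co-occur are simply absent from the result, which corresponds to a missing edge in the affinity graph.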

      Figure 1- Work Flow

    6. Conclusion

      In addition to creating content, users of Web 2.0 sites generate large quantities of metadata, or data about data, that describe their interests, tastes and preferences. These metadata, in the form of tags and social networks, are created mainly to help users organize and manage their own content. These types of metadata can also be used to target relevant content to the user through recommendation or personalization.

      The proposed work describes a machine learning-based method for personalizing the results of image search on Flickr. Our method relies on metadata created by users through their everyday activities on Flickr, namely the tags they use for annotating their images and the groups to which they submit these images. This information captures a user's tastes and preferences in photography and can be used to personalize image search results for the individual user. We validated our approach by showing that it can be used to improve the precision of image search on Flickr for three ambiguous terms: newborn, tiger, and beetle. In addition to improving search precision, the tag-based approach can also be used to expand the search by suggesting other relevant keywords (e.g., pantheratigris, bigcat and cub for the query tiger).



      1. Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. In Bocca, J. B., Jarke, M., & Zaniolo, C. (Eds.), Proceedings of the 20th Int. Conf. Very Large Data Bases, VLDB (pp. 487-499). Morgan Kaufmann.

      2. Breese, J., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the 14th Annual Conference on Uncertainty in Artificial Intelligence (pp. 43-52). San Francisco, CA: Morgan Kaufmann.

      3. Golder, S. A., & Huberman, B. A. (2006). The structure of collaborative tagging systems. Journal of Information Science, 32(2), 198-208.

      4. Jin, R., Si, L., & Zhai, C. (2006). A study of mixture models for collaborative filtering. Information Retrieval, 9(3), 357-382.

      5. Carman, M. J., Baillie, M., & Crestani, F. (2008). Tag data and personalized information retrieval. In SSM (pp. 27-34).

    Mr. Naeem Naik received the B.E. degree in Computer Science and Engineering in 2010 from VTU University, Karnataka, India, and is pursuing the M.E. degree in Computer Science and Engineering at Walchand Institute of Technology, Solapur, India. He is doing his dissertation work under the guidance of Mr. Lobo L.M.R.J, Associate Professor & Head, Department of IT, Walchand Institute of Technology, Solapur, Maharashtra, India.

    Mr. Lobo L.M.R.J received the B.E. degree in Computer Engineering in 1989 from Shivaji University, Kolhapur, India, and the M.Tech degree in Computer and Information Technology in 1997 from IIT, Kharagpur, India. He is registered for a Ph.D. in Computer Science and Engineering at SGGS, Nanded, of Sant Ramanand Teerth Marathawada University, Nanded, India, under the guidance of Dr. R.S. Bichkar. He is presently working as an Associate Professor & Head, Department of IT, Walchand Institute of Technology, Solapur, Maharashtra, India. His research interests include Evolutionary Computation, Genetic Algorithms and Data Mining.

    Mr. Riyaz Jamadar received his B.E. in Electrical and Electronics from BLDE Association's College of Engineering and Technology. He did his M.Tech in Computer Science at Allahabad University, Allahabad. His areas of interest are image processing, data structures and algorithms, and wireless ad hoc mobile technology.
