A Review on Various Techniques used for Image Document Retrieval from Digital Library

DOI : 10.17577/IJERTV4IS030314


Kirti D. Munot,

Department of Computer Engineering,

Matoshri College of Engineering and Research Centre, Nashik, Savitribai Phule Pune University

Abstract: In earlier days, information was presented in books, and libraries were the main source of useful information for users. Today information arrives in electronic formats that may mix different types of data, such as text and images. To handle such heterogeneity, various systems for efficient mobile image document retrieval have been proposed. Recognizing the textual and line-drawing parts of a query image document and matching them against a digital library is an interesting and challenging task. This paper provides a detailed review of past work on document image retrieval techniques and summarizes the limitations of those approaches.

Keywords: Digital library, mobile visual search, line drawing retrieval, shape context, reranking, histogram, image retrieval

  1. INTRODUCTION

    In the electronic world, digital libraries play an important role in providing access to very large collections of scanned documents stored as digital images. The following features make this large quantity of data easy to access:

    • No physical boundary: The user of a digital library need not go to the library physically; people from all over the world can gain access to the same information, as long as an Internet connection is available.

    • Round-the-clock availability: A major advantage of digital libraries is that people can access the information 24/7.

    • Information retrieval: The user can use any search term (word, phrase, title, name, subject) to search the entire collection. Digital libraries can provide very user-friendly interfaces, giving clickable access to their resources.

    • Space: Whereas traditional libraries are limited by storage space, digital libraries have the potential to store much more information, simply because digital information requires very little physical space and media storage technologies are more affordable than ever before.

    • Easily accessible.

    Typically a query is formulated as a photo that captures the visual objects of user interest, for example, a book cover, a document page, a figure, or even a line drawing. The visual query is sent to the server, where visually similar documents are matched and returned. To improve image matching efficiency, the visual signatures extracted from the database images have to be indexed, typically by an inverted indexing table. Compared with typing query keywords, a photo-based query undoubtedly simplifies user input.
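The inverted indexing step mentioned above can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a list of integer "visual word" IDs; the page names, word IDs, and the simple shared-word voting score are hypothetical choices for illustration, not the scoring used by any of the reviewed systems.

```python
from collections import Counter, defaultdict


def build_inverted_index(database):
    """Map each visual word ID to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, words in database.items():
        for word in words:
            index[word].add(doc_id)
    return index


def search(index, query_words, top_k=3):
    """Rank documents by the number of distinct query words they share."""
    votes = Counter()
    for word in set(query_words):
        # Only documents listed under this word are touched, not the whole database.
        for doc_id in index.get(word, ()):
            votes[doc_id] += 1
    return votes.most_common(top_k)


# Hypothetical database: document ID -> visual words extracted offline.
database = {
    "page_a": [3, 17, 42, 42, 8],
    "page_b": [17, 99, 5],
    "page_c": [1, 2, 3],
}
index = build_inverted_index(database)
results = search(index, [42, 17, 3, 8])
print(results)
```

The point of the structure is that a query only visits the posting lists of the words it contains, so the cost depends on the query rather than on the size of the collection.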

    An efficient image document retrieval system for a digital library must address three challenges from the perspective of search [1]. The first is the photographic distortion introduced by the embedded camera of a mobile device: unlike scanned document images, mobile-captured images are distorted, which degrades search performance. The second is how to describe the textual regions of an image document properly. With these challenges in mind, Duan et al. [1] proposed a framework for image document retrieval.

    The third challenge is query delivery latency. Slow query delivery increases the overall retrieval time, so a compression technique is applied to the query in [1].

  2. RELATED WORK

  1. Document Image Retrieval

    The problem of document image retrieval has been widely studied because of its many applications in digital libraries. The main goal is to find exact or similar documents by querying a large document library with a printed or scanned document image. Earlier work relied on Optical Character Recognition (OCR) techniques. More recently, visual matching has become a promising alternative that overcomes the poor OCR performance on scanned documents. Mao et al. [12] proposed a paragraph structure analysis approach, which attempts to segment a document image into paragraphs and then search the line drawings and textual regions separately.

  2. Visual search

    In recent years, more exciting research and applications have been pursued in mobile visual search. For example, Zhang et al. [3] adopted sequential matching of more than one reference view to estimate the pose and motion direction for better location recognition. Schindler et al. [4] presented an approach for large-scale location recognition based on geo-tagged video streams, which works by multi-path search over the vocabulary tree. Eade et al. [5] adopted a vocabulary-tree-based approach for real-time loop closing. Yeh et al. [6] proposed a hybrid color histogram to compensate the original ranking results in location recognition using mobile devices.

  3. Line Drawing Descriptor

    With the fast-growing computational power of mobile devices, recent work [7], [8], [9] proposes extracting compact descriptors directly on the terminal and sending them instead of the query image, enabling low-bit-rate visual search.

    Fig 1 Text querying and image retrieval

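    The framework of [1] includes three key components, as shown in Fig. 1. The two concerned with description and indexing are listed below; the remaining one is the query compression step noted in the Introduction.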

    • Local Inner-distance Shape Context (LISC) to describe line drawings, which is robust against mobile photographing distortion.

    • Hamming Distance KD-Tree to balance document retrieval performance against the memory complexity of the indexing structure when searching textual regions.

      For effective and efficient image retrieval, various KD-tree-based approximate nearest-neighbor search algorithms are in use; here the KD-tree is combined with a Hamming coding scheme. The motivation is to introduce a compact binary code that reduces the memory cost of storing the original local descriptors for backtracking. In the proposed Hamming Distance (HD) KD-Tree, the Euclidean feature space is replaced with a Hamming space. The HD KD-Tree enables very fast similarity matching while maintaining matching accuracy.
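The idea of replacing Euclidean comparison with Hamming comparison can be illustrated with a small sketch. This is not the HD KD-Tree itself: the binarization rule (one bit per dimension, thresholded at the per-dimension mean) and the linear scan over candidates are assumptions made only to show why binary codes are cheap to store and compare. In a tree-based index, the scan would be restricted to the few candidates reached during tree traversal.

```python
def binarize(vec, thresholds):
    """Pack a real-valued descriptor into an integer code, one bit per dimension."""
    code = 0
    for value, threshold in zip(vec, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def nearest(query_code, codes):
    """Linear scan over stored codes (stand-in for the candidates a tree would return)."""
    return min(codes, key=lambda name: hamming(query_code, codes[name]))


# Hypothetical 4-dimensional local descriptors.
descriptors = {
    "d0": [0.9, 0.1, 0.8, 0.2],
    "d1": [0.1, 0.9, 0.2, 0.7],
    "d2": [0.8, 0.2, 0.1, 0.9],
}
dims = 4
# Assumed binarization rule: threshold each dimension at its mean over the database.
thresholds = [
    sum(vec[i] for vec in descriptors.values()) / len(descriptors)
    for i in range(dims)
]
codes = {name: binarize(vec, thresholds) for name, vec in descriptors.items()}

query = binarize([0.85, 0.15, 0.75, 0.3], thresholds)
best = nearest(query, codes)
print(best, hamming(query, codes[best]))
```

Each stored descriptor shrinks from several floating-point values to a handful of bits, and a comparison becomes one XOR plus a bit count, which is the memory and speed trade-off the paragraph above describes.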

  4. Vocabulary Tree Technique

    The scheme in [11] builds upon popular techniques of indexing descriptors extracted from local regions and is robust to background clutter and occlusion. The local region descriptors are hierarchically quantized in a vocabulary tree. The vocabulary tree allows a larger and more discriminatory vocabulary to be used efficiently, which leads to a dramatic improvement in retrieval quality. The most significant property of the scheme is that the tree directly defines the quantization. The quantization and the indexing are therefore fully integrated, essentially being one and the same.

    However, the vocabulary in [11] is not updated dynamically, so dynamic vocabulary updating remains an open problem.
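Hierarchical quantization of this kind can be sketched as recursive k-means clustering, where the path taken from the root to a leaf serves as the visual word. The sketch below is a simplified illustration under stated assumptions: the branching factor, depth, deterministic initialisation, and the 2-D toy points are all hypothetical, and real descriptors have far more dimensions.

```python
def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10):
    """Plain k-means with deterministic initialisation (the first k points)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
            groups[i].append(p)
        for i, group in enumerate(groups):
            if group:
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers, groups


def build_tree(points, k, depth):
    """Recursively cluster descriptors; each node keeps its centre and children."""
    if depth == 0 or len(points) <= k:
        return []
    centers, groups = kmeans(points, k)
    return [
        {"center": c, "children": build_tree(g, k, depth - 1)}
        for c, g in zip(centers, groups)
        if g
    ]


def quantize(tree, descriptor):
    """Descend the tree; the sequence of branch indices is the visual word."""
    path, nodes = [], tree
    while nodes:
        i = min(range(len(nodes)), key=lambda j: dist2(descriptor, nodes[j]["center"]))
        path.append(i)
        nodes = nodes[i]["children"]
    return tuple(path)


# Hypothetical 2-D "descriptors" forming four small clusters.
points = [(0, 0), (10, 0), (0, 5), (10, 5), (1, 0), (11, 0), (1, 5), (11, 5)]
tree = build_tree(points, k=2, depth=2)

word_a = quantize(tree, (0.4, 4.6))
word_b = quantize(tree, (1, 5))
word_c = quantize(tree, (11, 0))
print(word_a, word_b, word_c)
```

Quantizing a descriptor costs only k comparisons per level, which is why a tree can support a much larger vocabulary than flat clustering. Because the tree is built once from a fixed training set, the sketch also makes the limitation noted above visible: adding new kinds of descriptors would require rebuilding it.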

  5. Edge Histogram Descriptor

How the edge histogram descriptor for MPEG-7 can be used efficiently for image matching is described in [10]. Since the edge histogram descriptor recommended for the MPEG-7 standard represents only the local edge distribution in an image, the matching performance for image retrieval may not be satisfactory. To increase the matching performance, the authors propose using global and semi-global edge histograms generated directly from the local histogram bins. The global, semi-global, and local histograms of two images are then compared to evaluate the similarity measure. Because the method exploits the absolute locations of edges in the image as well as their global composition, it is considered a more content-based form of image retrieval. Experiments on the test images of the MPEG-7 core experiment show that the method yields better retrieval performance, especially for semantic similarity.

The edge histogram descriptor has also been used to represent 3D shapes based on the gradient histogram distribution of the shapes, together with a semantics-based similarity, but this approach cannot handle rotation invariance.
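Deriving coarser histograms from the local bins can be sketched as follows. The sketch assumes the usual MPEG-7 layout of 4 x 4 sub-images with five edge types each (80 local bins). The particular semi-global grouping (four rows, four columns, five 2 x 2 blocks) and the weight of 5 on the global term are assumptions made for illustration and should be checked against the cited paper before use.

```python
EDGE_TYPES = 5  # vertical, horizontal, 45-degree, 135-degree, non-directional
GRID = 4        # the image is divided into 4 x 4 sub-images


def group_cells():
    """Cell groups for semi-global bins: 4 rows, 4 columns, five 2x2 blocks (assumed)."""
    rows = [[(r, c) for c in range(GRID)] for r in range(GRID)]
    cols = [[(r, c) for r in range(GRID)] for c in range(GRID)]
    corners = [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1)]  # top-left cell of each block
    blocks = [[(r + dr, c + dc) for dr in (0, 1) for dc in (0, 1)] for r, c in corners]
    return rows + cols + blocks


def extend(local):
    """local[r][c] is a 5-bin histogram; return (local, global, semi-global) as flat lists."""
    flat = [b for row in local for cell in row for b in cell]
    cells = [(r, c) for r in range(GRID) for c in range(GRID)]
    glob = [sum(local[r][c][e] for r, c in cells) / len(cells) for e in range(EDGE_TYPES)]
    semi = [
        sum(local[r][c][e] for r, c in group) / len(group)
        for group in group_cells()
        for e in range(EDGE_TYPES)
    ]
    return flat, glob, semi


def l1(x, y):
    """Sum of absolute bin differences."""
    return sum(abs(p - q) for p, q in zip(x, y))


def distance(hist_a, hist_b, global_weight=5):
    """Weighted L1 distance over local, global, and semi-global bins."""
    (la, ga, sa), (lb, gb, sb) = extend(hist_a), extend(hist_b)
    return l1(la, lb) + global_weight * l1(ga, gb) + l1(sa, sb)


# Two hypothetical images: uniform edge mix versus purely vertical edges.
image_a = [[[0.2] * EDGE_TYPES for _ in range(GRID)] for _ in range(GRID)]
image_b = [[[1.0, 0.0, 0.0, 0.0, 0.0] for _ in range(GRID)] for _ in range(GRID)]

flat, glob, semi = extend(image_a)
print(len(flat), len(glob), len(semi), distance(image_a, image_b))
```

No extra edge detection is needed: the additional bins are computed entirely from the 80 local bins already stored for each image, which is the efficiency argument made in the paragraph above.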

CONCLUSION

To address the problem of mobile document image retrieval from digital libraries, different techniques have been proposed for efficient retrieval. Previous work focuses mainly on recognizing the characters and line drawings in an image document, extracting features from the document, and storing them in an appropriate form so that relevant results can be retrieved efficiently.

Re-ranking is also important for obtaining relevant results and can be used to improve the quality of the returned documents.

ACKNOWLEDGMENT

The author is thankful to IJERT, her guide, and her parents for their blessings, support, and motivation behind this work.

REFERENCES

  1. L.-Y. Duan, R. Ji, Z. Chen, T. Huang, and W. Gao, "Towards mobile document image retrieval for digital library," IEEE Transactions on Multimedia, vol. 16, no. 2, pp. 346-359, 2014.

  2. J. Gert, "Selection for preservation in the digital age," Library Resources & Technical Services, vol. 44, no. 2, pp. 97-104, 2000.

  3. W. Zhang and J. Kosecka, "Image based localization in urban environments," in Proc. 3DPVT, 2006, pp. 33-40.

  4. G. Schindler and M. Brown, "City-scale location recognition," in Proc. CVPR, 2007, pp. 1-7.

  5. E. Eade and T. Drummond, "Unified loop closing and recovery for real time monocular SLAM," in Proc. BMVC, 2008, vol. 13, p. 136.

  6. T. Yeh, K. Tollmar, and T. Darrell, "Searching the web with mobile images for location recognition," in Proc. CVPR, 2004, vol. 2, pp. II76

  7. D. Chen, S. Tsai, V. Chandrasekhar, G. Takacs, J. Singh, and B. Girod, "Tree histogram coding for mobile image matching," in Proc. DCC, 2009, pp. 143-152.

  8. V. Chandrasekhar, G. Takacs, D. Chen, S. Tsai, R. Grzeszczuk, and B. Girod, "CHoG: Compressed histogram of gradients: a low bit-rate feature descriptor," in Proc. CVPR, 2009, pp. 2504-2511.

  9. R. Ji, L.-Y. Duan, J. Chen, H. Yao, J. Yuan, Y. Rui, and W. Gao, "Location discriminative vocabulary coding for mobile landmark search," Int. J. Comput. Vision, vol. 96, no. 3, pp. 290-314, 2012.

  10. C. S. Won, D. K. Park, and S.-J. Park, "Efficient use of MPEG-7 edge histogram descriptor," ETRI J., vol. 24, no. 1, pp. 23-30, 2002.

  11. D. Nistér and H. Stewénius, "Scalable recognition with a vocabulary tree," in Proc. CVPR, 2006, vol. 2, pp. 2161-2168.

  12. S. Mao and A. Rosenfeld, "Document structure analysis algorithms: A literature survey," in Proc. SPIE Electronic, 2003, pp. 197-207.
