Content-Based Audiovisual Archive Retrieval: Multimedia Database

DOI : 10.17577/IJERTCONV5IS01184


Santosh Dodamani

Assistant Professor Department of Computer Engineering

Atharva College of Engineering, Mumbai, Maharashtra, India

Divya Kumawat

Assistant Professor

Aruna Pavate

Assistant Professor Department of Computer Engineering

Atharva College of Engineering, Mumbai, Maharashtra, India

Department of Computer Engineering Atharva College of Engineering, Mumbai, Maharashtra, India

Abstract: In the last few years, demand for user-oriented multimedia information systems has grown considerably. A multimedia database can be defined as a collection of storage and retrieval systems in which large numbers of media objects are created, searched, modified, and retrieved. Multimedia is the combination of text, images, graphics, animation, audio, and video. Extending database applications to handle multimedia objects requires the organization of multiple media data streams. Beyond text retrieval, current trends in web searching and multimedia document retrieval are the search for and delivery of images, audio, 3D content, and video. Content-based multimedia information retrieval offers new techniques and methods for searching multimedia databases around the world. This paper discusses a new paradigm for multimedia search based on content.

Keywords: Multimedia database; content-based retrieval; text-based retrieval; free browsing; CAS

  1. INTRODUCTION

    Query difficulty estimation predicts the search-result performance of a given query. It is a powerful tool for multimedia retrieval and is becoming increasingly popular. Several techniques have been proposed to estimate query difficulty in textual information retrieval, but they cannot be applied directly to image search, since doing so results in poor performance. Existing research on query difficulty estimation focuses on text-based queries, while the difficulty of image and video retrieval for multimedia queries has not yet been studied. In recent years, the prevalence of social media systems such as Flickr, Facebook, and YouTube has greatly enlarged the Internet's multimedia database. This enriched database has triggered a large number of multimedia research scenarios. The success of these social media systems also benefits content-based image retrieval [1]. Various content-based multimedia retrieval methods have been introduced by a large number of researchers. Beyond the methods for content-based image, audio, and video retrieval, there is also a wide range of content-based retrieval methods for new media types, such as 3D models, cultural artifacts, motion data, and biological data [2].

    Data mining can be defined as a method of extracting or "mining" knowledge from large amounts of data. It refers to the process of discovering interesting knowledge from huge amounts of data stored in databases, data warehouses, or other information repositories [3]. Multimedia data mining is a process that includes building a multimedia data cube, which enables multidimensional analysis of multimedia data based primarily on visual content, and mining multiple kinds of knowledge, including summarization, clustering, comparison, classification, and association [2]. Thus, knowledge discovery in multimedia systems deals with non-structured information. To improve results on multimedia files, the database must first be preprocessed, followed by feature extraction. Significant patterns can then be discovered from the generated features using various data mining techniques [4].

    Everyone deals with multimedia in every walk of life. We work with multimedia and are surrounded by it. With the advancement of modern computer and information technology, multimedia systems have a growing impact on our lives. It is therefore important to consider how to organize and structure this huge volume of multimedia information so that data can be obtained easily at any point in time. To do so, a multimedia database is the tool required to manage and maintain large numbers of multimedia objects. Multimedia objects consist of animations, video, music, sounds, text, graphics, etc. Multimedia applications frequently address file management interfaces at different levels of abstraction, such as a hypertext application, an audio-video distribution service, or an audio editor, depending on the actual capabilities of the multimedia database and its structure.

    A multimedia database is a database, like any other, that contains multimedia collections [5]. Multimedia is defined as the combination of more than one medium; media may be classified into two types, static media and dynamic media. Images, text, and graphics are categorized as static media, whereas objects such as music, animation, audio, video, and speech are categorized as dynamic media. Graphic images may consist of clip art, photographs, logos, and custom drawings. Sound consists of voice narration, speech, music, etc. Video data contains sound as well as images. To manage these documents, a multimedia database management system is essential. A multimedia database management system (MDBMS) can be defined as a software system that manages a collection of multimedia data and gives users access to query and retrieve multimedia objects. Normally, a multimedia database comprises audio, movies, text, images, animation, video, sound, etc. However, all of this information is stored in binary form.

    The volume of audiovisual information available in digital format has grown exponentially in recent years. Terabytes of new images, video clips, and audio are produced and stored every day, building up a huge, distributed, and mostly unstructured repository of multimedia information, much of which can be accessed through the Internet. Digitization, compression, and archival of multimedia information have become popular, inexpensive, and straightforward, and there is a broad range of hardware and software available to support these tasks. Subsequent retrieval of the stored information, however, might require considerable additional work in order to be effective and efficient.

    In the field of information technology, multimedia systems have a growing impact on our lives. The challenge is therefore how to organize and structure this huge volume of multimedia information so that it can be obtained easily at any point in time. This paper covers multimedia database concepts, characteristics, and data retrieval methodologies. The major focus is on how content addressed storage helps to store and retrieve multimedia content, together with a comparative analysis of retrieval methods.

    1. MDBMS Definition

      • Multimedia Object:

        Timothy [6] defines a multimedia object as a multimedia document or presentation containing one or more multimedia data items.

      • Multimedia Data:

        Michael [7] defines multimedia data as many kinds of media, such as audio, images, video, graphics, hypermedia, hypertext, and other abstract data types.

      • Multimedia Database:

        Ross Lee Graham [8] defines a multimedia database as a database containing one or more multimedia objects.

        A multimedia database management system (MDBMS) can be defined as a software system that manages a collection of multimedia data and gives users access to query and retrieve multimedia objects.

    2. Characteristics of MDBMS [11]

      • Descriptive search methods

        Queries on multimedia data should be based on descriptive, content-oriented search.

        Example: Picture of a man with a blue cap.

      • Device-independent interface

        The software can operate on a wide variety of devices, hiding the details of device control while still handling information on specific characteristics of the available storage media, such as read-only, write-once, and write-many.

      • Format-independent interface

        The DBMS must hide the internal storage format and offer conversions to the formats requested by applications, such as GIF, TIFF, JPEG, and many more. This allows a change to new storage technology without any impact on multimedia applications.

      • Management of large amounts of data

        The DBMS must be capable of handling and managing huge amounts of data. Managing this information requires an appropriate referencing mechanism.

      • Real-time data transfer

        The DBMS must perform read and write operations on continuous data in real time. The transfer of continuous data has a higher priority than other database actions.

      • Long transactions

        The transfer of large amounts of data takes a long time and must be done in a reliable fashion.

        Fig. 1: Example of a multimedia document

        Figure 1 shows an example of a multimedia document that contains media such as text, audio, video, and images.

    3. Data Structure

    Multimedia data can be stored in databases as raw data, registering data, and descriptive data, for media such as text, images, video sequences, and audio sequences. Data can be stored in databases as structured data, unstructured data, or both.

    • Unstructured (Unformatted)

      Data are presented as a unit whose content cannot be retrieved by accessing any structural detail.

      Ex: Mr. Penguin is a student in the fifth semester

    • Structured (Formatted)

      Data are stored in variables, fields, or attributes with corresponding values.

      Ex: o.student.surname = Pertersen
      o.student.name = Harmayani
      o.student.age = 36

      Different classes of operations are needed on multimedia databases, such as input, output, modification, deletion, comparison, and evaluation.
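    The difference between the two storage forms can be illustrated with a minimal Python sketch. The `Student` class and the two helper functions are hypothetical names introduced only for this illustration; the field values are taken from the examples above.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """Structured (formatted) record: every value sits in a named attribute."""
    surname: str
    name: str
    age: int


# Structured data: content is reachable through its structure.
structured = Student(surname="Pertersen", name="Harmayani", age=36)

# Unstructured data: the same kind of fact exists only as free text.
unstructured = "Mr. Penguin is a student in the fifth semester"


def query_structured(record: Student, field: str):
    """Return the value of one attribute by name."""
    return getattr(record, field)


def query_unstructured(text: str, keyword: str) -> bool:
    """Without structure, the best available query is a text scan."""
    return keyword.lower() in text.lower()


print(query_structured(structured, "age"))
print(query_unstructured(unstructured, "student"))
```

    The structured record answers an attribute query directly, while the unstructured string can only be scanned as a whole.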

  2. RETRIEVAL METHODOLOGIES

    There are basically three ways of retrieving previously stored multimedia data [9]:

    1. Free browsing: Users browse through a collection of audio, image, and video files and stop when they find the desired information.

      • Presents the user with a set of links to images

      • May include summaries such as thumbnail images or video key frames

      • Links may be organized into categories or hierarchies

      • Easy to implement: images and/or links may be stored in a database

      • Suitable only for casual, occasional use

    2. Text-based retrieval: Textual information in the form of metadata is added to the media files during the cataloguing stage. In the retrieval phase, this additional information is used to guide conventional, text-based query and search engines to the desired data.

      • Uses descriptive metadata annotations

      • Compatible with conventional query models

      • Supports semantics

      • Significant human effort is required

      • Vastly independent annotation method

      • Not scalable to huge or quickly growing collections

    3. Content-based retrieval: Users search the multimedia repository by providing information about the actual contents of the audio, image, or video clip. A content-based retrieval engine translates this information into a query against the database and retrieves the candidates that are most likely to satisfy the user's request.

    Multimedia retrieval systems often use combinations of methodologies.
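    As a rough illustration of content-based retrieval, the sketch below ranks stored images by the similarity of a simple content feature. The paper does not prescribe a feature or a distance measure; the gray-level histogram, the L1 distance, the file names, and the pixel values are all assumptions chosen to keep the example small.

```python
def histogram(pixels, bins=4, max_value=256):
    """Normalized gray-level histogram of a flat list of pixel intensities."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def content_based_search(query_pixels, repository, top_k=2):
    """Rank repository items by how close their content feature is to the query."""
    query_feature = histogram(query_pixels)
    ranked = sorted(
        repository,
        key=lambda name: l1_distance(query_feature, histogram(repository[name])),
    )
    return ranked[:top_k]


# Hypothetical repository: file name -> pixel intensities (0..255).
repository = {
    "dark.img": [10, 20, 30, 40, 15, 25],
    "bright.img": [230, 240, 250, 245, 235, 225],
    "mixed.img": [10, 240, 20, 250, 30, 230],
}

# The query is an example image rather than a keyword.
query = [12, 22, 35, 18, 28, 44]
results = content_based_search(query, repository)
print(results)
```

    The query is answered from the pixel content alone, with no metadata annotation involved. A practical system would use richer features and an index instead of a linear scan.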

    In the life cycle of information, data is actively created, accessed, edited, and changed. As data ages, it becomes less likely to change and eventually becomes fixed, yet it continues to be accessed by multiple applications and users. This data is called fixed content. Traditionally, fixed content was not treated as a specialized form of data and was stored on a variety of storage media, ranging from optical disks to tapes to magnetic disks. The accumulation of fixed content across businesses has resulted in unprecedented growth in the amount of data and hence presents the challenge of managing fixed content. In addition, users demand assurance that stored content has not changed and require immediate online access to fixed content. These requirements led to the development of Content-Addressed Storage.

    Content Addressed Storage (CAS) is an object-based system that has been purposely built for storing fixed content data [10]. It is designed for secure online storage and retrieval of fixed content. In contrast to file-level and block-level data access, which use file names and the physical location of data for storage and retrieval, CAS stores user data and its attributes as separate objects [10]. Each stored object is assigned an address known as a content address (CA), which is derived from the object's binary representation. CAS provides an optimized and centrally managed storage solution that can support single-instance storage (SiS) to eliminate multiple copies of the same data.

    1. Benefits of CAS[10]

      • Content authenticity

      • Content integrity

      • Location independence

      • Single-instance storage (SiS)

      • Retention enforcement

      • Record-level protection and disposition

      • Technology independence

      • Fast record retrieval

    2. CAS Terminologies [10]

      The following CAS terms are needed to understand a CAS system:

      • Application programming interface (API): A high-level implementation of an interface that specifies the details of how clients can make service requests. The CAS API resides on the application server and is responsible for storing and retrieving the objects in a CAS system.

      • Binary large object (BLOB): The actual data without the descriptive information (metadata). The distinct bit sequence of user data represents the actual content of a file and is independent of the file's name and physical location.

      • Content address (CA): An object's address, which is created by a hash algorithm run across the binary representation of the object. While generating a content address, the hash algorithm considers all aspects of the content and returns a unique content address to the user's application. A unique number is calculated from the sequence of bits that constitutes the file content. If even a single character changes in the file, the resulting CA is different. A hash output, also known as a digest, is a type of fingerprint for a variable-length data file. This output represents the file contents and is used to locate the file in a CAS system. The digest can be used to verify whether the data is authentic or has changed because of equipment failure or human intervention. When a user tries to retrieve or open a file, the server sends the CA to the CAS system with the appropriate function to read the file. The CAS system uses the CA to locate the file and passes it back to the application server.

      • C-Clip: A virtual package that contains data (BLOB) and its associated C-Clip Descriptor File. The C-Clip ID is the CA that the system returns to the client application. It is also referred to as a C-Clip handle or C-Clip reference.

      • C-Clip Descriptor File (CDF): An XML file that the system creates while making a C-Clip. This file includes CAs for all referenced BLOBs and the associated metadata. Metadata includes characteristics of CAS objects such as size, format, and expiration date.
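      The content address described above can be sketched in a few lines of Python. The source does not name the hash algorithm used by a CAS system, so SHA-256 from the standard `hashlib` module is an assumption here, and `content_address` and `is_authentic` are hypothetical helper names.

```python
import hashlib


def content_address(blob: bytes) -> str:
    """Derive an address from the object's binary representation (SHA-256 assumed)."""
    return hashlib.sha256(blob).hexdigest()


def is_authentic(blob: bytes, expected_ca: str) -> bool:
    """Recompute the digest and compare it with the stored content address."""
    return content_address(blob) == expected_ca


original = b"fixed content: X-ray image bytes"
modified = b"fixed content: X-ray image bytez"  # differs in a single character

ca_original = content_address(original)
ca_modified = content_address(modified)

print(ca_original != ca_modified)            # a one-character change yields a different CA
print(is_authentic(original, ca_original))   # unchanged data verifies against its CA
print(is_authentic(modified, ca_original))   # altered data does not
```

      Because the address depends only on the bytes, identical content always maps to the same address, which is what makes single-instance storage possible.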

  3. PROCEDURE FOR STORING A DATA OBJECT IN CONTENT ADDRESSED STORAGE

    Fig. 2: Storage of a data object in CAS [10]

    The process for storing fixed content in the CAS is as follows:

      1. End users present the data to be stored to the CAS API via an application. The application server may also interact directly with the source (e.g., an X-ray machine) that produced this fixed content.

      2. The API separates the actual data (BLOB) from the metadata, and the content address is calculated from the object's binary representation.

      3. The content address and metadata of the object are then inserted into the C-Clip Descriptor File (CDF). The C-Clip is then transferred to and stored on the CAS system.

      4. The CAS system recalculates the object's CA as a validation step and stores the object. This ensures that the content of the object has not changed.

      5. An acknowledgment is sent to the API after a mirrored copy of the CDF and a protected replica of the BLOB have been safely stored in the CAS system. After a data object is stored in the CAS system, the API is given a C-Clip ID, which is kept local to the application server.

      6. Using the C-Clip ID, the application can read the data back from the CAS system.
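    The storing steps above can be sketched as a toy in-memory model. This is not the API of any real CAS product: `ToyCAS`, `api_store`, and the object name `xray-001` are hypothetical, SHA-256 is an assumed hash algorithm, the CDF is modeled as a dictionary serialized to JSON rather than XML, and deriving the C-Clip ID by hashing the serialized CDF is likewise an assumption. Mirroring and replica protection are omitted.

```python
import hashlib
import json


def content_address(data: bytes) -> str:
    """Assumed hash algorithm: SHA-256 over the binary representation."""
    return hashlib.sha256(data).hexdigest()


class ToyCAS:
    """In-memory stand-in for the CAS system."""

    def __init__(self):
        self.blobs = {}  # content address -> BLOB
        self.cdfs = {}   # C-Clip ID -> descriptor (CDF)

    def receive(self, blob: bytes, cdf: dict) -> str:
        # Step 4: recalculate the CA as a validation step before storing.
        if content_address(blob) != cdf["blob_ca"]:
            raise ValueError("content address mismatch")
        # Identical content is kept once (single-instance storage).
        self.blobs.setdefault(cdf["blob_ca"], blob)
        clip_id = content_address(json.dumps(cdf, sort_keys=True).encode("utf-8"))
        self.cdfs[clip_id] = cdf
        # Step 5: acknowledge by returning the C-Clip ID.
        return clip_id


def api_store(cas: ToyCAS, blob: bytes, metadata: dict) -> str:
    blob_ca = content_address(blob)                   # Step 2: CA from the BLOB
    cdf = {"blob_ca": blob_ca, "metadata": metadata}  # Step 3: build the CDF
    return cas.receive(blob, cdf)                     # Step 3: transfer the C-Clip


cas = ToyCAS()
local_clip_ids = {}  # kept local to the application server (Step 5)

# Step 1: the application presents the data and its metadata to the API.
blob = b"archived X-ray image bytes"
local_clip_ids["xray-001"] = api_store(cas, blob, {"format": "raw", "size": len(blob)})

# Storing the same bytes under different metadata adds a CDF but no second BLOB.
local_clip_ids["xray-001-copy"] = api_store(cas, blob, {"format": "raw", "note": "duplicate"})
print(len(cas.blobs), len(cas.cdfs))
```

    The second call shows the effect of single-instance storage: two C-Clips reference one stored BLOB.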

  4. PROCEDURE FOR RETRIEVING A DATA OBJECT FROM CONTENT ADDRESSED STORAGE

    Fig. 3: Retrieval of a data object from CAS [10]

    The process of data retrieval from CAS follows these steps:

    1. The end user or an application requests an object.

    2. The application queries the local table of C-Clip IDs kept in local storage and locates the C-Clip ID for the requested object.

    3. Using the API, a retrieval request is sent along with the C-Clip ID to the CAS system.

    4. The CAS system delivers the requested information to the application, which in turn delivers it to the end user.
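    The retrieval steps can be sketched in the same toy style. The dictionaries below stand in for a CAS system that already holds one object; all names (`cas_blobs`, `cas_cdfs`, `local_clip_ids`, `clip-0001`, `xray-001`) are hypothetical, SHA-256 is an assumed hash algorithm, and the integrity check on read is an illustrative addition based on the role of the digest, not a step listed in the procedure.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Assumed hash algorithm: SHA-256 over the binary representation."""
    return hashlib.sha256(data).hexdigest()


# Toy CAS state, assumed to have been populated by an earlier store operation.
blob = b"archived X-ray image bytes"
blob_ca = content_address(blob)
cas_blobs = {blob_ca: blob}
cas_cdfs = {"clip-0001": {"blob_ca": blob_ca, "metadata": {"format": "raw"}}}

# Local table on the application server: object name -> C-Clip ID.
local_clip_ids = {"xray-001": "clip-0001"}


def cas_read(clip_id: str) -> bytes:
    """CAS side: resolve the C-Clip ID to its CDF, then the CA to the BLOB."""
    cdf = cas_cdfs[clip_id]
    data = cas_blobs[cdf["blob_ca"]]
    # Illustrative check: the stored bytes must still match their content address.
    if content_address(data) != cdf["blob_ca"]:
        raise ValueError("stored content no longer matches its content address")
    return data


def retrieve(object_name: str) -> bytes:
    clip_id = local_clip_ids[object_name]  # Step 2: look up the C-Clip ID locally
    return cas_read(clip_id)               # Steps 3-4: request and receive the object


# Step 1: the user or application requests an object by name.
print(retrieve("xray-001") == blob)
```

    The application never needs a file path or physical location; the locally kept C-Clip ID is enough to get the object back.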

  5. COMPARATIVE ANALYSIS OF RETRIEVAL METHODS

    Table 1 shows a comparative analysis of the three retrieval methods across several parameters.

    | Main Features | Free Browsing | Text-based retrieval | Content-based retrieval |
    |---|---|---|---|
    | Information stored in the form | File reference contains a link to the data | Descriptive metadata annotations | Binary object, CLOB |
    | Types of database support | Relational Databases and Multimedia | Object Relational Databases and Multimedia | Object Oriented Databases and Multimedia |
    | Example of database | MS Access & more | Oracle & more | Jasmine & more |
    | Data Structure | Structured data | Structured data, unstructured data | Large & swamp with other data |
    | Querying | Allows content based, hypertext based | Allows content based, hypertext based, hierarchical | Document(url, title, text, type, length, lastModify) |
    | Support | Extra file types as in OLE | Specially designed classes for multimedia | Specially designed types for multimedia |

    Table 1: Comparative analysis of retrieval methods

  6. CONCLUSION

Multimedia content is becoming more and more popular. The volume of audiovisual information available in digital format has grown exponentially in recent years. Managing such a huge amount of information is difficult with traditional direct attached storage, which is where the concept of content addressed storage comes into the picture. Content-based storage and retrieval helps in organizing archival content. CAS has emerged as an alternative to tape and optical solutions because it overcomes many of their obvious deficiencies. CAS also meets the demand to improve data accessibility and to properly protect, dispose of, and ensure service-level agreements for archived data.

REFERENCES

  1. Yangxi Lian, Bo Geng, Linjun Yang, Chao Xu, Wei Bian, "Query difficulty estimation for image retrieval", Elsevier Journal on Neurocomputing, 2012, 48-53.

  2. Yi Yang, Fei Wu, Dong Xu, Yueting Zhuang, Liang-Tien Chia, "Cross-media retrieval using query dependent search methods", Elsevier Journal on Pattern Recognition (2010) 2927-2936.

  3. Xin Chen, Chengcui Zhang, and Wei-Bang Chen, "A Multiple Instance Learning Framework for Incident Retrieval in Transportation Surveillance Video Databases", 2007 IEEE.

  4. Manuel Barrena, Elena Jurado, Pablo MarquezNeila, Carlos Pacpn, "A flexible framework to ease nearest neighbor search in multidimensional data spaces", Elsevier Journal on Data & Knowledge Engineering (2010) 116-136.

  5. Ross Lee Graham, Multimedia Database Technology: an introduction to MMDB, ITM Mid-Sweden University; Thomas C. Rakow, Erich J. Neuhold, and Michae Lohr, Multimedia Database System: The Notions and Issues, Integrated Publication and Information System Institute (IPSI).

  6. Timothy K. Shih, Distributed Multimedia Database, Multimedia Information Network Lab, Department Science and Information Engineering, Tamkang University, Taiwan, 2001.

  7. Michael M. David, Multimedia Databases Through the Looking Glass, Intelligent Enterprises Database, 1997.

  8. Marques, O. and B. Furht. 2002. Content-based visual information retrieval, in Distributed multimedia databases: techniques & applications. T. K. Shih, editor. Pages 35-57.

  9. https://pdfs.semanticscholar.org/ae4f/1ff26fe0f97d2dcc10be14126fa55d140a2f.pdf

  10. Information Storage and Management, Participant Guide, Volume 1 of 2, EMC Education Services, July 2009, Content Addressed Storage, Pages 1-11.

  11. Ralf Steinmetz and Klara Nahrstedt, Multimedia: Computing Communications & Applications, Pages 466-475.
