A Review on Resource Discovery Strategies in Grid Computing


Sakshi Sivarama Krishna (1), J S Ananda Kumar (2)
(1) KMM Institute of Technology and Science, Tirupati
(2) KMM Institute of Post Graduate Studies, Tirupati

Abstract: Technological, financial, scientific, engineering and other applications, and grand challenge applications in particular, are becoming ever more demanding in terms of their computing requirements. Increasingly, these requirements can no longer be met by a single organization. A cost-effective modern technology that can address this computing bottleneck is grid computing, which enables groups of organizations to share their computing capacity effectively. The resource management system is the main component of any network computing system. The Grid computing research community has produced a large body of work on network computing, designing and implementing resource management systems with a variety of architectures and services. Discovering a resource that meets the requirements of a Grid user's query plays an important role in a Grid management system. In today's seamless computing environment, effective use of pooled resources is a challenging task. This paper reviews various strategies that support Grid system management. Its key components are the need to refine these strategies and the role of data mining technologies in that refinement. Ways of applying data mining techniques for better Grid management are identified.

Keywords: Grid computing, Resource Discovery, Grid Management System, Data Mining techniques

  1. INTRODUCTION

    A Grid system is a seamless, integrated computational and collaborative environment. In a high-level view of activities within the Grid, resource requesters interact with the Grid resource broker to solve problems, and the broker in turn performs resource discovery, scheduling, and processing of application jobs on the distributed Grid resources. Grid computing provides seamless support for resource sharing in large, heterogeneous network environments by collecting valuable resources from a large collection of providers. It provides a cost-effective resource sharing mechanism using a collaborative management system [1]. Grids, composed of large-scale, geographically distributed platforms working together, have emerged as effective architectures for high-performance decentralized computation. Because the resources available in a grid environment are limited, using them efficiently is an important challenge. Owing to the heterogeneity of resources, it is always a challenge to pick the right resource at the right time at the optimum cost. The resource management system is responsible for managing the available resources and the system workload accordingly. It is the means by which tasks such as allocation, authorization and assurance of resources take place.

    The resource management system covers resource discovery, resource monitoring, resource accounting, resource provisioning, fault isolation, autonomic capabilities and service level management activities. A core part of the grid system is the Resource Broker, which heads the whole hierarchy and maintains all the service information of the system. It performs processes such as discovering, scheduling, allocating and evaluating. Resource discovery is the first and foremost of these processes: the broker must discover the resources before other grid resource management schemes can be applied. The resource management system performs resource discovery to obtain information about the resources available for a particular job with respect to the required attributes. Discovery in the Grid environment is a complex task because the resources are distributed, heterogeneous and spread across organizations, each with its own resource management policies and different access and cost models.

    Grid computing represents the evolution of distributed computing infrastructure and parallel processing technologies. The Grid enables coordinated resource sharing within dynamic organizations consisting of individuals, institutions, applications and resources. The main aim of grid computing is to give collaborating parties and application developers the ability to create distributed computing environments. The motivation for this paper is to survey resource discovery mechanisms within Grid environments. The resource discovery process ought to find the preferred resources quickly and return the result to the requester in the optimum time. Previous work on resource discovery approaches is presented in this paper in a systematic way. The presentation is structured to stimulate interest in finding better strategies for resource discovery in Grid computing.

  2. CLASSIFICATION OF THE GRID SYSTEMS

    Current networked systems contain personal computers, personal digital assistants, parallel computers, network routers, network switches, clustered server farms, and network-attached storage devices such as Fibre Channel RAID, automated tape libraries, and CD-RW/DVD jukeboxes. Based on their design objectives, Grid systems can be categorized into the classes shown in Fig. 1.

    The computational Grid category denotes systems that have a higher aggregate computational capacity available for a single application than the capacity of any constituent machine in the Grid system. Depending on how the aggregate capacity is utilized, these are further divided into distributed supercomputing and high-throughput categories.

    The data Grid category provides an infrastructure for synthesizing new information from data repositories, such as digital libraries or data warehouses, that are distributed in a wide area network.

    The service Grid category is for systems that provide services that are not provided by any single machine. This category is further subdivided into on-demand, collaborative, and multimedia Grid systems [2].

  3. THE RESOURCE DISCOVERY PROCESS

    If a Grid $G$ consists of $n$ nodes, $G = [N_1, N_2, \ldots, N_n]$, and each node $N_i$ is equipped with $m$ resources $R$, $N_i = [R_{i1}, R_{i2}, \ldots, R_{im}]$, then

    $$G = \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1m} \\ R_{21} & R_{22} & \cdots & R_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ R_{n1} & R_{n2} & \cdots & R_{nm} \end{bmatrix}$$
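    As an illustration of the matrix view of a Grid, in which G has one row per node and one column per resource, the sketch below stores G as a list of n rows with m entries each and looks up which nodes offer enough of a given resource. The resource names, the capacities and the helper `nodes_with` are hypothetical and are not taken from the reviewed work.

```python
# Hypothetical resource columns R_1..R_m; the names are illustrative only.
RESOURCES = ["cpu_cores", "memory_gb", "storage_gb"]

# G[i][j] holds R_ij, the amount of resource j offered by node N_i.
G = [
    [8, 32, 500],    # N_1
    [4, 16, 2000],   # N_2
    [16, 64, 250],   # N_3
]

def nodes_with(grid, resource, minimum):
    """Return the 1-based indices i of the nodes N_i whose R_ij >= minimum."""
    j = RESOURCES.index(resource)
    return [i + 1 for i, row in enumerate(grid) if row[j] >= minimum]

print(nodes_with(G, "memory_gb", 32))  # [1, 3]
```

    In this representation a discovery query is a scan over one column of G, which is simple but assumes that one party holds the whole matrix; the strategies reviewed later differ mainly in how that information is distributed.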

    The resource discovery process is summarized in Fig. 2. The user submits a job and the grid finds appropriate resources, in principle anywhere, to complete it. A node searching for resources submits a request containing an ontological description of a target resource. Suitable communication and routing protocols need to be employed to distribute the request over the network. Each node compares the incoming request against its ontology by applying some matching criteria. The matching evaluation depends on the expressiveness of the ontological description contained in the request and on the desired level of accuracy. In its reply to the requesting node, each node returns a list of candidate resource descriptions together with their respective matching values. Based on the candidate descriptions received and their matching values, the initiating node selects the relevant resource descriptions to be traced in its ontology. For each relevant resource description, a new concept is created in the Network Knowledge Layer, describing the sending node that provided the candidate resource description.

    Fig 2. Resource Discovery in Grid Environment
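    The request, match and select steps of the discovery process can be sketched as follows. This is a minimal illustration in which an ontological description is reduced to a set of attribute terms and the matching value is the fraction of requested terms that a candidate covers. That scoring rule, the threshold and all names are assumptions made for the example, not part of the reviewed process.

```python
def matching_value(requested, offered):
    """Fraction of the requested terms that the offered description covers."""
    if not requested:
        return 0.0
    return len(requested & offered) / len(requested)

def answer_request(node_ontology, requested):
    """A node returns its candidate descriptions with their matching values."""
    candidates = []
    for name, terms in node_ontology.items():
        value = matching_value(requested, terms)
        if value > 0:
            candidates.append((name, value))
    return candidates

def select_relevant(replies, threshold):
    """The initiating node keeps candidates at or above the threshold.

    `replies` maps each sending node to the candidate list it returned, so
    the result also records which node provided each relevant description.
    """
    relevant = []
    for sender, candidates in replies.items():
        for name, value in candidates:
            if value >= threshold:
                relevant.append((sender, name, value))
    return sorted(relevant, key=lambda item: item[2], reverse=True)

request = {"linux", "gpu", "x86_64"}
ontologies = {
    "node_a": {"cluster1": {"linux", "x86_64", "gpu"}, "nas1": {"storage"}},
    "node_b": {"server7": {"linux", "x86_64"}},
}
replies = {node: answer_request(o, request) for node, o in ontologies.items()}
for sender, name, value in select_relevant(replies, threshold=0.6):
    print(sender, name, round(value, 2))
```

    Raising the threshold corresponds to asking for a higher level of accuracy: with `threshold=1.0` only the description that covers every requested term would be kept.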

  4. RESOURCE DISCOVERY STRATEGIES

    In Grid computing, resource discovery is the process of finding the best candidate from the set of resources that meet the requirements of time, cost and efficiency at an optimum level. The dynamism and heterogeneity of resources that enter and leave the Grid computing structure make discovering and managing them a challenging task.

    There are diverse approaches to resource discovery in grid environments. Broadly, they are either query-based or agent-based.

    In the query-based approach, discovery is the process in which resource information is queried for resource availability. This is the approach taken by most contemporary grid systems. Query-based systems are further characterized by whether the query is executed against a distributed database or a centralized database [3].

    The agent-based approach sends active code fragments across machines in the Grid system, and these fragments are interpreted locally on each machine. Agents passively monitor resources and can distribute resource information either periodically or in response to another agent. An agent is a software entity with aptitude, autonomy and the right information, which can interact with its surroundings and execute tasks on behalf of its user [4].

    Agent-based resource discovery is inherently distributed and is mostly built on an underlying mobile code environment such as Java.

    This paper reviews the following resource discovery approaches: the Peer-to-Peer approach, the Ontology Description-Based approach, the Routing Transferring Model-Based approach, the Parameter-Based approach, the Quality of Service (QoS) approach and the Request Forwarding approach.

    The Peer-to-Peer approach is based on a decentralized resource discovery architecture, which can lessen the administrative burden considerably while providing effective search performance. Resource discovery problems in large distributed resource-sharing environments, and in grid environments in particular, are of great interest [5], [6]. A resource discovery solution has four architectural components: membership protocol, overlay construction, preprocessing, and request processing. Four environment parameters dominate its performance and design strategies: resource information distribution and density, resource information dynamism, request popularity distribution, and peer participation. A Unified Peer-to-Peer Database Framework (UPDF) with general-purpose query support for large distributed systems has been proposed in the literature. UPDF can be characterized as a peer-to-peer database construct for general-purpose query support. It is unified in that it supports arbitrary query languages, random node topologies, different data types, different query response modes, and different neighbor selection policies for expressing specific applications [7].

    In the Ontology Description-Based approach, an ontology is a description of a resource, used within a semantic service discovery framework in a grid environment [8]. Ontologies improve interoperability between virtual organizations. A service matchmaking mechanism based on ontology knowledge is proposed, and its authors claim that this matchmaking framework provides better service discovery and can also return close matches. The main idea behind this approach is the propagation of resource information. In this model, a service provider registers its service description in the service registry database.

    When a Grid application sends a request to the service directory, the matchmaker returns the matches to the service requester. The requester then chooses the best resource based on the given specifications.
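    The register and matchmake steps of the ontology-based model can be sketched as below. The sketch assumes a tiny is-a hierarchy of concepts and treats a service as a close match when its concept is a specialisation of the requested one. The hierarchy, the close-match rule and the `ServiceRegistry` class are hypothetical and do not reproduce the matchmaker of [8].

```python
# Hypothetical is-a hierarchy: child concept -> parent concept.
IS_A = {
    "gpu_node": "compute_node",
    "cpu_node": "compute_node",
    "compute_node": "resource",
    "disk_array": "storage",
    "storage": "resource",
}

def ancestors(concept):
    """Return the chain of parent concepts above `concept`."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain

class ServiceRegistry:
    def __init__(self):
        self._services = {}  # service name -> concept that describes it

    def register(self, name, concept):
        """A service provider registers its service description."""
        self._services[name] = concept

    def match(self, requested):
        """Return (exact, close) matches for the requested concept.

        Assumed rule: a close match is a service whose concept is a
        specialisation (descendant) of the requested concept.
        """
        exact = sorted(n for n, c in self._services.items() if c == requested)
        close = sorted(n for n, c in self._services.items()
                       if requested in ancestors(c))
        return exact, close

registry = ServiceRegistry()
registry.register("render-farm", "gpu_node")
registry.register("batch-queue", "compute_node")
registry.register("archive", "disk_array")
print(registry.match("compute_node"))  # (['batch-queue'], ['render-farm'])
```

    Because every lookup goes through one registry object, the sketch also makes visible why a central matchmaker can become a scalability bottleneck.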

    The Routing Transferring Model-Based approach [9] revolves around three basic components: the resource requester, the resource router and the resource provider. The resource provider sends its resource information to a router, which stores that information in a routing table. When a requester later sends a request to the router, the router checks its routing table for an appropriate resource provider and, on finding an entry, forwards the request to that provider or to another router.

    The Shortest Distance Routing Transferring (SD-RT) algorithm is the basis for this formalization. Here the resource discovery time depends on the topology, and the authors showed that SD-RT can locate a resource in the shortest time if the topology and the distribution of resources are known.
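    The interaction between provider, router and requester can be sketched as below. This is a simplified hop-count illustration, not the SD-RT algorithm of [9]: each router keeps, per resource type, the next hop on the shortest route it has heard of and passes updates on to its neighbours. The `Router` class, its method names and the example topology are assumptions made for the sketch.

```python
class Router:
    def __init__(self, name):
        self.name = name
        self.neighbors = []
        # resource type -> (next hop name, distance in hops to the provider)
        self.table = {}

    def connect(self, other):
        """Create a bidirectional link between two routers."""
        self.neighbors.append(other)
        other.neighbors.append(self)

    def advertise(self, resource, via, distance=1):
        """Store resource information, keeping only the shortest route,
        and propagate the update to neighbouring routers."""
        known = self.table.get(resource)
        if known is not None and known[1] <= distance:
            return  # an equal or shorter route is already stored
        self.table[resource] = (via, distance)
        for router in self.neighbors:
            router.advertise(resource, self.name, distance + 1)

    def route(self, resource, routers):
        """Follow routing-table entries hop by hop and return the path."""
        path = [self.name]
        current = self
        while resource in current.table:
            next_hop, _ = current.table[resource]
            path.append(next_hop)
            if next_hop not in routers:
                return path  # the next hop is a provider, not a router
            current = routers[next_hop]
        return None  # no entry: the resource cannot be located

r1, r2, r3 = Router("r1"), Router("r2"), Router("r3")
routers = {r.name: r for r in (r1, r2, r3)}
r1.connect(r2)
r2.connect(r3)
r3.advertise("gpu", via="provider_x")  # the provider registers with r3
print(r1.route("gpu", routers))  # ['r1', 'r2', 'r3', 'provider_x']
```

    The path length, and therefore the discovery time, follows directly from where the provider sits relative to the requester's router, which mirrors the dependence on topology noted above.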

    The Parameter-Based approach considers the potential of the Grid [10]. This model proposes a new concept, Grid potential, which encapsulates the processing capabilities of different resources in a large network. An algorithm called the Data Dissemination Algorithm is proposed, which uses a flooding (swamping) approach for message distribution. When a message arrives at a node, it is validated. The validation process relies on one of three types of dissemination: universal awareness, which permits all incoming messages; neighborhood awareness, which allows messages from within a certain distance; and distinctive awareness, which discards a message if it finds that the Grid potential at the local node is less than that of the requestor node. The authors also measured the performance of the universal, neighborhood, and distinctive awareness dissemination schemes and reported that the universal scheme is more expensive in terms of message complexity than the neighborhood and distinctive schemes. They also claimed that this new class of dissemination can reduce the communication overhead of resource discovery.
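    The three validation schemes can be contrasted with a small filter function. The sketch below is only an illustration: Grid potential is reduced to a single number, distance is counted in hops, and the distinctive rule is an assumed reading of the description above (a message is discarded when the local potential is lower than the requestor's). None of the names or values come from [10].

```python
def accept_message(scheme, hops, requestor_potential, local_potential, max_hops=2):
    """Decide whether a node keeps an incoming dissemination message."""
    if scheme == "universal":
        return True                  # every incoming message is permitted
    if scheme == "neighborhood":
        return hops <= max_hops      # only messages from within a distance
    if scheme == "distinctive":
        # Assumed rule: discard when the local Grid potential is lower
        # than the Grid potential of the requestor node.
        return local_potential >= requestor_potential
    raise ValueError(f"unknown scheme: {scheme}")

# (hops travelled, Grid potential of the requestor node); values are made up.
messages = [(1, 40), (3, 10), (2, 90), (5, 25)]

def kept_count(scheme, local_potential=30):
    """Count how many of the example messages a node would keep."""
    return sum(
        accept_message(scheme, hops, potential, local_potential)
        for hops, potential in messages
    )

for scheme in ("universal", "neighborhood", "distinctive"):
    print(scheme, kept_count(scheme))
```

    With these made-up messages the universal scheme keeps all four while the other two keep two each, which is the kind of reduction in message traffic the authors attribute to the restricted schemes.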

    In the Quality of Service (QoS)-based approach, an algorithm is proposed for discovering intermittently available resources in a multimedia environment [11]. Different policies for a QoS-based resource discovery service are described for a given graph-theoretic approach. A generalized version of the Discovering Intermittently Available Resources (DIAR) algorithm is presented. The performance of QoS policies based on different time-map strategies is evaluated in a centralized system. The QoS parameters include processor runtime, storage capacity, network bandwidth and many more. On the basis of these parameters, QoS aims to guarantee the best behavior of the grid. Through experiments, the authors found that randomized placement strategies and increased server storage can improve the performance of discovering a particular resource.

    In the Request Forwarding approach, four request-forwarding alternatives are identified.

    1. Random Walk Approach

      To forward the request, the node is chosen randomly.

    2. Learning-Based Approach

      A request is forwarded to a node that has answered a similar request before. If no such node is known, the request is forwarded to a randomly chosen node.

    3. Best-Neighbor Approach

      The number of answers received from each node is recorded, without recording the type of request. The request is forwarded to the node that has answered the highest number of requests.

    4. Learning-Based + Best-Neighbor Approach

      This approach is identical to the learning-based approach, except that when no similar answer is found the request is forwarded to the best neighbor.

    The performance evaluation reported for these alternatives is that of a simple resource discovery technique based on request propagation.
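    The four forwarding alternatives differ only in how the next node is chosen, which the sketch below makes explicit. It assumes that two requests are similar when they ask for the same resource type, and that the best-neighbor rule falls back to a random choice when no answers have been recorded yet. Both simplifications, and the `Node` class itself, are assumptions made for the example.

```python
import random

class Node:
    def __init__(self, name, neighbors, rng=None):
        self.name = name
        self.neighbors = list(neighbors)  # names of the known neighbours
        self.answered_by = {}             # request type -> neighbour that answered
        self.answer_count = {}            # neighbour -> number of answers received
        self.rng = rng or random.Random(0)

    def record_answer(self, neighbor, request_type):
        """Remember who answered, both per request type and in total."""
        self.answered_by[request_type] = neighbor
        self.answer_count[neighbor] = self.answer_count.get(neighbor, 0) + 1

    def random_walk(self, request_type):
        """1. Forward to a randomly chosen neighbour."""
        return self.rng.choice(self.neighbors)

    def learning_based(self, request_type):
        """2. Forward to whoever answered a similar request, else random."""
        known = self.answered_by.get(request_type)
        return known if known is not None else self.random_walk(request_type)

    def best_neighbor(self, request_type):
        """3. Forward to the neighbour with most answers, ignoring the type."""
        if not self.answer_count:
            return self.rng.choice(self.neighbors)  # assumed fallback
        return max(self.answer_count, key=self.answer_count.get)

    def learning_plus_best_neighbor(self, request_type):
        """4. Like learning-based, but fall back to the best neighbour."""
        known = self.answered_by.get(request_type)
        return known if known is not None else self.best_neighbor(request_type)

node = Node("n0", ["n1", "n2", "n3"])
node.record_answer("n1", "cpu")
node.record_answer("n2", "storage")
node.record_answer("n2", "storage")
print(node.learning_based("cpu"))               # n1
print(node.best_neighbor("cpu"))                # n2
print(node.learning_plus_best_neighbor("gpu"))  # n2
```

    The learning-based variants need per-type history at every node, whereas random walk and best-neighbor need little or no state, which is the main trade-off between them in this simplified form.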

  5. COMPARISON AMONG THE STRATEGIES

    Many parameters have to be taken into account in dealing with grid resource discovery, whose complexity grows in proportion to the size of the grid. For large global grids, the Peer-to-Peer approach is the best one to follow, as it uses a graph-theoretic approach to achieve scalability and manageability. The decentralized resource discovery approach is the next choice, but it is limited to a certain extent by its dependent and managing units.

    The Ontology-based approach uses a central broker for matchmaking, which reduces its scalability. In the Parameter-Based approach, Grid potential provides an efficient way to maintain the current status of resources. The QoS approach gives clients reliable feedback about the expected behavior of the system as a whole. The Routing Transferring approach is observed to be the fastest way to discover resources in small grids, as it uses a shortest-distance routing table for matchmaking. The Request Forwarding approach has not come into full use today because of its nonstandard nature.

    All the strategies have their own problems and prospects. Grid computing needs to maintain a large amount of data about its components and about the coordination and computation needed to manage resources. Analyzing the data accumulated as a result of Grid management activities can provide ways to refine the strategies mentioned above [12]. For analysis on this scale, data mining technologies are among the best choices.

    A Grid can be considered as a set of independent, complex systems that together form a huge pool of computational resources. Management procedures are therefore subject to a specific analysis of each participating system. Ultimately, decision making relies on detailed knowledge of each of the elements that make up the grid.

    The heterogeneous and distributed nature of grid components makes analysis complex. Data mining techniques can relieve this burden to a great extent. The great variety of tasks found in the grid offers a wide range of information to process. Data from multiple sources can be gathered and analyzed using data mining techniques to extract novel and vital information that ultimately improves the grid management process. The resource discovery process can gain a lot from the knowledge obtained through data mining, and this combination can shape resource discovery strategies and processes towards the economical utilization of Grid resources.

  6. CONCLUSION

    Resource discovery is the process of finding satisfactory resources that meet the user's specifications, and it includes resource description, resource organization, resource lookup, and resource selection. In this paper the resource concept in Grid systems is introduced, together with a review and analysis of various grid resource discovery approaches. The approaches are compared on the basis of performance factors such as scalability, reliability, adaptability and manageability. This comparison gives guidance on choosing an appropriate approach to discover a particular resource. Our assessment is that the peer-to-peer approach has succeeded in the world of resource discovery. It is expected that the resource discovery concepts of peer-to-peer systems may be adapted as a service for use in grid systems. It should be noted that it is not easy to reconcile grid systems with peer-to-peer systems, so new models that can describe components sufficient for both systems should continue to be explored. Keeping in mind the large amount of data associated with Grid components and their computational needs, the paper emphasizes the importance of analyzing resource data for a better resource management system and the role of data mining techniques in doing so.

REFERENCES

  1. J. Joseph and C. Fellenstein, Grid Computing, IBM Press, Pearson Education, p. 5.

  2. R. Buyya, D. Abramson, J. Giddy, Nimrod/G: An Architecture for a Resource Management and Scheduling System in a Global Computational Grid, International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2000), Beijing, China, IEEE Computer Society Press, USA, 2000.

  3. Chaitanya Kandagatla, Survey and Taxonomy of Grid Resource Management Systems, University of Texas, Austin, http://www.cs.utexas.edu/users/browne/cs395f2003/projects/KandagatlaReport.pdf

  4. Klaus Krauter, Rajkumar Buyya and Muthucumaru Maheswaran, A taxonomy and survey of grid resource management systems for distributed computing, John Wiley & Sons, Ltd., 17 September 2001.

  5. A. Iamnitchi and I. Foster, On Fully Decentralized Resource Discovery in Grid Environments, IEEE International Workshop on Grid Computing, Denver, CO, 2001.

  6. A. Iamnitchi, I. Foster, Daniel C. Nurmi, A Peer-to-Peer Approach to Resource Discovery in Grid Environments, Proceedings of the 11th Symposium on High Performance Distributed Computing, Edinburgh, UK, 2002.

  7. W. Hoschek, A Unified Peer-to-Peer Database Framework for Scalable Service and Resource Discovery, Proceedings of the Third International Workshop on Grid Computing: GRID 2002, Baltimore, MD, Springer, 2002, pp. 126-144.

  8. S. Ludwig, P. Santen, A Grid Service Discovery Matchmaker based on Ontology Description, Euroweb 2002 - The Web and the GRID: from e-science to e-business, 2002.

  9. W. Li, Z. Xu, F. Dong, J. Zhang, Grid Resource Discovery Based on a Routing-Transferring Model, Proceedings of the Third International Workshop on Grid Computing: GRID 2002, Baltimore, MD, Springer, 2002, pp. 145-156.

  10. M. Maheswaran and K. Krauter, A Parameter-based approach to resource discovery in Grid computing systems, 1st IEEE/ACM International Workshop on Grid Computing, 2000.

  11. Yun Huang and Nalini Venkatasubramanian, QoS-based Resource Discovery in Intermittently Available Environments, Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11), IEEE, 2002.

  12. Werner Dubitzky, Data Mining Techniques in Grid Computing Environments (textbook).
