A Survey of Scheduling Algorithms for Heterogeneous Systems and Comparative Study of HEFT and CPOP Algorithms

Baldeep Singh; Priyanka Mehta

doi:10.17577/IJERTV5IS050045

Volume 05, Issue 05 (May 2016)

A Survey of Scheduling Algorithms for Heterogeneous Systems and Comparative Study of HEFT and CPOP Algorithms

DOI : 10.17577/IJERTV5IS050045

Download Full-Text PDF Cite this Publication

Open Access
Article Download / Views: 139
Total Downloads : 369
Authors : Baldeep Singh, Priyanka Mehta
Paper ID : IJERTV5IS050045
Volume & Issue : Volume 05, Issue 05 (May 2016)
DOI : http://dx.doi.org/10.17577/IJERTV5IS050045
Published (First Online): 13-05-2016
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

PDF Version

View

Text Only Version

A Survey of Scheduling Algorithms for Heterogeneous Systems and Comparative Study of HEFT and CPOP Algorithms

Baldeep Singh

M. Tech Student

Dept. of Computer Science and Engg.

UGI Lalru, PTU Jalandhar, Punjab, India

Priyanka Mehta

Assistant Prof.

Dept. of Computer Science and Engg.

UGI Lalru, PTU Jalandhar, Punjab, India

Abstract- Scheduling computation tasks on processors is the key issue for an advanced computing. List scheduling has constantly been a topic of conversation for the researchers due to its nature of solving high complexity troubles with minimum complexity and to estimate the additional scheduling problems for the applied matrix. The task in hand is to do reduce the overall time utilizes for task completion. A lot of earlier research works have also included prioritization on the jobs to reduce the computation cost and earliest finish time of the system. This paper introduces a discussion about recently studied scheduling techniques namely: HEFT, CPOP. These two techniques are compared with each other in rest of Schedule Length, Speedup and Efficiency on different number of directed acyclic graphs.

Keywords: CPOP, HEFT scheduling, DAG.

INTRODUCTION

Heterogeneous Distributed computing includes resources of varying capacity forming a fast system to execute computationally intensive parallel and simultaneous applications. It becomes necessary to have the capacity to increase the use of computing and communication infrastructure for justifying heterogeneity the expense that may have gone into making of these resources. The computational requests of jobs and applications have been increasing quickly and the presentation of a figure rigorous application on such stages is intensely reliant on the portion of the tasks on these assets. One of the major issues in Heterogeneous Distributed Computing is to schedule the tasks of requests such that general running time is minimized. The task scheduling problem is known to be N-P complete [1, 2]. Many heuristics have been designed to solve the problem but still no satisfactory results have been found. Task scheduling is divided into two classes: static scheduling and dynamic scheduling. In static scheduling, which is done before the task is run, all the data associated with a parallel program, for example, task managing time, communication time and data dependency are already available. In dynamic scheduling, many scheduling decisions are taken on runtime. Hence the target of dynamic scheduling is to generate a schedule with in the scheduling overhead limit, in addition to internal failure Concerns and other aspects [3]. Special scheduling heuristics have been planned in the writing on principle of the methodologies these heuristics use. They have been characterizing into four clusters: List scheduling algorithms [3 4 7 8 10 12 14 18], Duplication based

algorithms [15 20],

Cluster based algorithms and Random guided search algorithms [4]. In list based heuristics, tasks are placed in a priority list with every task having special priority importance. Priority of a task depends on priority of its predecessor. A current task from the priority list all scheduled onto a suitable processor by using tasks EST and EFT. Clustering heuristics were mostly planned for standardized frameworks and the point is to frame clusters of tasks that are then transfer to processors. Clustering is the best way to minimize the communication delay within DAGs using grouping closely tasks within a cluster on the same processor. Duplication is very useful in parallel processing. Using duplication duplicate the task between idle times slots. The duplication heuristics deliver the most limited makespan yet they have a disadvantage i.e. the higher time complexity. Guided random scheduling techniques make use of the theory of evolution and ordinary genetics to produce near optimal task schedules. These genetic algorithms are very useful and most popular in list based scheduling.
TASK MODEL

An application is divided into set of tasks on a parallel and distributed computing environment. That application is represented with help of DAG (Directed Acyclic Graph) G = (V, E). Where V is a set of v nodes and each node vi V represents set of nodes of an application. E is a set of e coordinated edges between tasks, each e (i, j) E represent nodes in the graph and their dependencies. Edges e in the DAG speaks to the connection messages and is spoken to as (ni, nj). ni is a protector node or parent node. nj is child node. nj is executed after the execution of ni. The node with no section or child is called exit node [6]. At a time frame ni nodes needs to send data elements to nj with the exception that if both the nodes ni and nj are assigned to the same processor then there communication cost must be negligible. Mapping of nodes on best processor and for minimizing the execution time is totally based on ranking. Ranking is defined in many ways in literature. Weight w of every node communicates with the vertices to know the processing expenses w (n). W is a (v*q) describe computation cost matrix in each wi,j gives EET (estimated execution time) to complete a task ni on pj processor. Average execution cost of an application or task ni is describe as

j=1

j=1

wi =q

wi,j

/q (1)

which make choices to be made at runtime however with extra overhead. A dynamic algorithm is obliged on the basis that the

Transfer time of data among Processors are stored in a matrix B, and their size is q*q. Processors communication startup time is given in q_dimensional vector L. Task ni to task nk, (which is scheduled onto pm and pn) is communication cost among edges, describe as

Ci,k=L+datai,k/Bm,n (2)

Both the tasks are assigning onto same processor, their inter- processor communication is zero. The average communication cost between edges (i, k) is describe as

Ci,k=+datai,k/B (3)

Where transfer average rate onto processors is defined by and average communication startup time defined by. Now we describe the earliest startup time (EST (n , p )) AND earliest

workload is just known at runtime, similar to the status of every processor when new tasks arrive. As a consequence of this, a dynamic algorithm not ensures that it have all work essentials accessible among scheduling and can't promote in light of the entire workload. By separation, a static methodology can expand a schedule by allowing for all tasks freely of execution request or time in light of the fact that the schedule is created before execution start and present no operating cost at runtime.

The main problem of scheduling is to divide an application into different parts, these parts are known as nodes or jobs, problem is to find the priority of all nodes and fit into best suitable processors for reducing the running time of an application. We discus two list based scheduling heuristics (HEFT and CPOP) in heterogeneous environment and discus different type of scheduling problems with the help of different parameters like: Schedule Length, Speedup, and

i j Efficiency.

finish time (EFT (ni,pj)) of application ni onto processor pj, their entry task is

EST (nentry, pj) = 0. (4)

In order to calculate the EFT of a task ni, all immediate parent tasks of ni must have been scheduled shown in (5) and (6).

EST(ni,j)=max{avail[j],max/nmpred(nj) (AFT(nm)+cm,i)} (5)

EFT (ni, pj) = wi,j +EST(ni, pj) (6)

Where pred(ni) is the list of immediate parent applications of task ni and avail[j] is the minimum time which the pj is ready for execution This process is not executes in case of all the tasks are assigned to processors.

After the scheduling of all tasks onto available processors, they calculate their EST and EFT i.e. equal to the actual start time (AST) and actual completion time (ACT), or we can say overall completion time (nexit), i.e. schedule length or makespan, sometimes called critical path length. The makespan is describe as

Makespan = max {AFT (nexit)} (7)

Node with higher need is analyzed for Scheduling before a node with lower need. The main objective of scheduling problem is to assigning the tasks of a given application for execution against suitable processors and tries to minimize the schedule length or makespan.
SCHEDULING PROBLEM

The problem presented in this paper is the static scheduling of a reserved application in a heterogeneous structure with P set of processors, V set of vertices, E set of edges between two vertices. Overall mathematically it can be explained as G = (V, E) where V is the set of vertices and E is the edge between two vertices. As said above, task scheduling can be separated into Static and Dynamic methodologies [8]. Dynamic scheduling is satisfactory for situation where the framework and task parameter are not known at compile time,
IMPLEMENTED ALGORITHMS

HEFT (Heterogeneous Earliest Finish Time)

HEFT is a simple and best scheduling technique in static task scheduling in heterogeneous as well as homogeneous environment for limited number of processors. HEFT has two stages: prioritization phase and processor selection stage. Prioritization stage: first HEFT calculates the priority using upward ranking (ranku). An application is traversed in upward direction and find out the rank of all nodes in a list with the help of mean communication and mean computation cost. Generated list is arranged in decreasing order of ranku. HEFT uses a Tie breaking policy for selecting the nodes, which node or successor selects whose rank value is highest. Upward rank of task ni is described as:

Rank(ni) = Wi+maxnjsucc(ni) (cij+ranku(nj)) (8)

Wi is the mean computation cost, succ (ni) is the immediate child of node ni, ci,j is the mean communication cost of node (i,j). In case of two nodes have equal rank value selects randomly. In upward ranking, graph is traversed from exit node to entry node. Highest rank is same with exit node:

ranku(nexit) = W exit (9)

ranku(ni) is total critical path from source node to exit node including communication and computation cost of tasks.

HEFT ALGORITHM:

Input a graph along with communication cost computation and number of processors.

Compute the average mean value of communication and computation costs of each node.

Compute upward rank (ranku) by traversing the graph from exit node to entry node.

Generate a priority queue in decreasing order according to their ranku.

While

Unscheduled tasks in the queue Do

Select first rank task for scheduling and remove from queue. Assign task (ni) to processor pj

Schedule all the tasks and compute their EST and EFT. END.
CPOP (Critical Path on Processor)

HEFT is also called CPOP. CPOP used both ranking techniques upward and downward. CPOP computes the rank value of each node by adding both the techniques ranku+ rankd and set into a queue. An application is traversed from entry node (ni) to exit node (nj) is called downward ranking and traverse exit node to entry node is called upward ranking. CPOP has two steps: task prioritization phase and task allocation phase.

In task prioritization phase, tasks are prioritize according to their rank value (ranku+ rankd) with the help of communication and computation costs of DAG then set into a queue (decreasing order). CPOP used critical path (CP) of an application to find the longest path starting from entry node to exit node.

In second phase, tasks are selected according to higher rank value and selects for scheduling to best suitable processor which minimizes the execution time of task. CP nodes are scheduled on a processor which has less mean computation cost then other processors. CP of given DAG (shown in figure 1) is N1, N2, N9 and N10 and their mean computation cost on each processor is 66, 54 and 63 (P1, P2 and P3). CPOP chooses minimum processor cost from all i.e. called CP-P (Critical Path-Processor).

CPOP ALGORITHM:

Input a graph along with communication cost computation and number of processors.

Compute the upward rank (ranku) and downward rank (rankd). Downward rank is computed by traversing the graph from entry node to exit node.

Compute the priority ni = ranku+ rankd and arrange in a list.

|CP|= rank of entry node. [CP-Critical Path]

SETCP = set all nodes on critical path nk nentry

While

nk is not exit node do

Select nj where (nj succ (nk)) and (priority (nj==

|CP|))

Select the Critical Path Processor (CP-P) niCP node

wij wij computation cost. Initialize the priority list with starting node.

While

There are unscheduled nodes in list do

Select the highest rank node from list and ready to schedule then remove from list.

If ni CP node then

Assign task ni to CP-P

Else

Assign task to processor pj, which reduces the EFT (ni, pj) Update the priority list

End.

These two algorithms are based on insertion based policy; a task is scheduled in processor earliest idle time slot which has already scheduled tasks, that large enough to hold a task. These tasks are schedule on the same processor.

TABLE: 1 computation cost

Figure: 1: Directed Acyclic Graph

Table 1 represents the computation cost of each processor on every processor and Figure 1 represents an application shows various type of nodes with their communication cost. Communication is the transfer rate between two nodes on different processors. We discuss this application or DAG on HEFT and CPOP, in heterogeneous environment.

EXPERIMENTAL RESULTS AND ANALYSIS The results are discussed of HEFT and CPOP algorithms under three parameters namely: schedule length, speedup and efficiency.

Comparison metrics: Using these metrics, we discuss comparison between above two algorithms based on:
1. Schedule Length: Schedule length (makespan) is the total execution time of an application or a DAG.
2. Speedup: Speedup is defined as the ratio of given schedule length is divided with obtained fastest processor.
3. Efficiency: speedup is divided with number of processors in each run
  
  We analyze the results on 50 different acyclic graphs with the variation in increasing the number of nodes (8 10 12 14 16 18
  
  20 22 24 26). Performance is increased with increasing the number of nodes.
  
  Graph: 1 comparison of HEFT and CPOP w.r.t to average schedule length.
  
  Grapp shows HEFT gives better results then CPOP for average schedule length. HEFT is 12% more efficient than CPOP. If number of nodes is increased then schedule length is also increased. Graph 2 represents average speedup. HEFT is 15% more efficient then CPOP. Graph 3 represents HEFT is 11% better then CPOP in case of efficiency.
CONCLUSION AND FUTURE WORK

In this paper we discussed two algorithms namely HEFT and CPOP on different parameters like Schedule Length, Speedup and Efficiency. These algorithms are scheduled on different number of DAGs in static task scheduling algorithms in heterogeneous environment. Results show us HEFT is better than CPOP for all discussed parameters. As consider increasing the number of node. It provides better results for all the parameters.. The results given in the paper demonstrate the fact that there is still a scope of improvement in many aspects for all the algorithms in the literature. Although list scheduling is a vast area of research keeping in view the findings in the given survey it is clear that there is a need of deveoping a technique which can produce an efficient priority list for tasks to develop an assignment based algorithm so as to reduce the overall execution time (makespan).

Figure: 2 represent a scheduling graph with the help of figure 1 and table 1

(a) HEFT and (b) CPOP.

Graph: 2 comparison of HEFT and CPOP w.r.t average Speedup.

Graph: 3 comparison of HEFT and CPOP w.r.t average Efficiency.

REFERENCES

Garey, M. R. e D. S., A Guide to the Theory of NP- Completeness, W. H. Freeman & Co, New York, NY, USA , 1979.
Yang, T. and Gerasoulis, A DSC Scheduling parallel tasks on an unbounded number of processors, IEEE Transaction Parallel Distributed System, 5(9), pp:951967, 1994.
Ilavarasan, E. and Thambidurai, Low Complexity Performance Effective Task Scheduling Algorithm for Heterogeneous Computing Environments, Journal of Computer Sciences 3 (2), pp: 94- 103, 2007.
Topcuoglu, H. Hariri, S. and Wu, M.Y. Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Trans. Parallel Distributed System, 13(3), pp: 260274, 2002.
Adam, T. L. Chandy K. M. and Dickson J. R. A comparison of list schedules for parallel processing systems, Communication ACM, 17(12), pp: 685690, 1974.
Kwok, Y.K. and Ahmad, I., Benchmarking and comparison of the task graph scheduling algorithms, J. Parallel Distrib. Comput, 59(3), pp: 381422, 1999.
Liu, G. Q. Poh, K. L. and Xie, M. Iterative list scheduling for heterogeneous computing, J. Parallel Distrib. Comput, 65(5), pp: 654665, 2005.
J.-J. Hwang, Y.-C. Chow, F. D. Anger, and C.-Y. Lee, Scheduling precedence graphs in systems with interprocessor communication times, SIAM Journal on Computing, vol. 18, no. 2, pp. 244257, 1989.
M. Y. Wu and D. D. Gajski, Hypertool a programming aid for message-passing systems, IEEE Transactions on Parallel and Distributed Systems, vol. 1, no. 3, pp. 330343, 1990.
G. C. Sih and E. A. Lee, Compile-time scheduling heuristic for interconnection-constrained heterogeneous processor architectures, IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 2, pp. 175187, 1993.
H. El-Rewini and T. G. Lewis, Scheduling parallel program tasks onto arbitrary target machines,Journal of Parallel and Distributed Computing, vol. 9, no. 2, pp. 138153, 1990.
M. Iverson, F. Ã–zgÃ¼ner, and G. Follen, Parallelizing existing applications in a distributed heterogeneous environment, in Proceedings of the IEEE International Conference on Heterogeneous Computing Workshop (HCW '95), pp. 93100, 1995.
E. Ilavarasan, P. Thambidurai, and R. Mahilmannan, High performance task scheduling algorithm for heterogeneous computing system, Volume 3719 of Lecture Notes in Computer Science, pp.193-203. Springer, 2005.
Luiz F. Bittencourt, Rizos Sakellariou and Edmundo R. M. Madeira, DAG Scheduling Using a Lookahead Variant of the Heterogeneous Earliest Finish Time Algorithm, 18th Euromicro Conference on Parallel Distributed and Network-based Processing, pp. 27-34, 2010.
Savina Bansal, Padam Kumar and Kuldip Singh, Dealing with heterogeneity through limited duplication for scheduling precedence constrained task graphs, J. Parallel Distrib. Comput, vol. 65, pp. 479 491, 2005).
Tomasz Kalinowski , Iskander Kort and Denis Trystram ,List scheduling of general task graphs under LogP ,Parallel Computing, vol.26, pp. 1109-1128, 2000.
R. Eswari and S. Nickolas, A Level-wise Priority Based Task Scheduling for Heterogeneous Systems International Journal of Information and Education Technology, Vol. 1, No. 5, pp. 371-375, December 2011.
Mohammad I. Daoud and Nawwaf Kharma, A high performance algorithm for static task scheduling in heterogeneous distributed computing systems J. Parallel Distrib. Comput, 68, pp. 399 409, 2008.
Samia Ijaz, Ehsan Ullah Munir, Waqas Anwar, and Wasif Nasir, Efficient Scheduling Strategy for Task Graphs in Heterogeneous Computing Environment The International Arab Journal of Information Technology, Vol. 10, No. 5, pp. 486-492, September 2013.
Doruk Bozdag and Fusun Ozguner, Comparison of Schedules and a Two-Stage Approach for Duplication-Based DAG Scheduling IEEE Transactions on Parallel and Distributed System, VOL. 20 NO.6, pp: 857-871, June 2009.

A Survey of Scheduling Algorithms for Heterogeneous Systems and Comparative Study of HEFT and CPOP Algorithms

Leave a Reply