 Open Access
 Total Downloads : 91
 Authors : Pham Nguyen Minh Nhut, Le Van Son
 Paper ID : IJERTV3IS120384
 Volume & Issue : Volume 03, Issue 12 (December 2014)
 Published (First Online): 22122014
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
A Genetic Algorithm for Resource Provisioning of Virtual Service Based on Homogeneous Shared Hosting Platforms
Pham Nguyen Minh Nhut1 ,
1 Department of Ecommerce,
Vietnam Korea Friendship Information Technology College,
Danang, Vietnam.
Le Van Son2
2 Departement of Information, Danang University of Education,
Danang University, Danang, Vietnam.
Abstract Optimal resource provisioning for virtual services in the Cloud computing is one of the most concerns nowadays. Multidimensional resource provisioning on a homogeneous shared hosting platform for virtual services is known as a NPhard problem. Therefore, it is necessary to apply the metaheuristic algorithms for estimating the outcome of the problem. In this paper, we have effectively applied a genetic algorithm to solve the problem. We defined a fitness function with the goal of minimizing the number of physical machines, and compared our algorithm to standard algorithms of vector packing problem via emulationbased program in various scenarios. The experimental results show that: In the cases of large number of services, execution times of GA are shorter than the execution times of standard algorithms of vector packing problem.
Keywords Resource provisioning; Cloud computing; Genetic Algorithm (GA); Vector Packing; Virtual machine.

INTRODUCTION
Virtual technology allows partitioning the resource of Y (Y 1) physical machines into S (S 1) virtual machines to execute the applications on demands. A system which consists of multiple physical machines with the same configuration connecting together for sharing resources is called a homogeneous shared hosting platform [3, 7, 9]. One of the challenges of this system facing is to minimize resources of the platform for virtual services while still ensuring the quality of service (QoS).
Resource management for shared hosting platforms has been investigated in many other studies [3, 4, 7, 9]. In particular, Urgaonkar et al. [3] propose a profiling technique for statistic of resource usage and minimum resource needs. Aron [4] and Casanova et al. [7] formulate the resource provisioning problem as a constrained optimization problem in which machines are considered as a monolithic resource. Stillwell et al. [9] further consider resource provisioning in a multi dimention resource, however they focus only on efficiency of resource provisioning. They formulate the problem of resource allocation as a mixed integer linear program (MILP), where the objective is to maximize performance and fairness through a metric known as minimum yield.
In this paper, we consider many aspects of resources and apply a linear objective function to minimize the number of physical machines. The resource provisioning problem is generally considered in both cases: static and dynamic, but we focus on solving the the problem for only static case (i.e., fixed resource needs). Moreover, the resource provisioning is known as a NPhard problem, therefore this paper employs metaheuristic algorithms to solve it [2,8]. The key contributions of the paper are as follows:

Modeling a resource provisioning as a linear programming problem and computing the complexity of the problem.

Solving the propblem by applying GA and defining the fitness function in order to minimize the number of physical machines.

Evaluating and comparing experimental results with other results obtained by applying standard algorithms of vector packing problem [5, 6].
The rest of the paper is organized as follows: Section II presents a mathematical model of the problem as a linear programming problem and establishs the complexity of the problem. Section III solves the problem by applying the standard algorithms of vector packing problem. Our solution presents in Section IV by employing GA. Section V follows by experimental results and comparisons in various scenarios. Finally, Section VI concludes the paper and opens some future work.


RESOURCE PROVISIONING FOR VIRTUAL SERVICES

Resource and resource needs
Lets consider a homogeneous shared hosting platform in which a cluster of servers having the same configuration and being interconnected by a highspeed network devices is deployed for sharing resource to virtual services [nÃªn tÃ¡ch thÃ nh 2 cÃ¢u]. Each service [in the platform] operates as a virtual machine and the system ensures that service requests are dispatched to appropriate servers.
When users request virtual clusters (VC), the system responds by sets of virtual machine (VM). These VM
problem represented by a linear programming problem with constraints and objective functions is as follows:
instances run on physical machines (PMs) under the control of a hypersivor [1] and consume resources at different portions. The hypersivor can enforce specic resource consumption portions for different VMs running on the physical machine. A Resource Provider (RP) is responsible for making decisions whether to reject or
admit a request, and allocates resource to each VM
xij
{0,1},ij Q,
j xij 1,
y j xij ,
i, j
i
i, j
(1)
(2)
(3)
instance. Our goal is to design GA algorithm operated as a part of the RA to determine the minimum number of PMs based on resouce needs for virtual services.
To supply resource needs for virtual services, each PM provides several resources, i.e., CPU, RAM space, I/O bandwidth, disk space. In fact, each virtual sevice has two kinds of resource need: rigid and fluid [9]. A rigid need represents a specific fraction of required resource. The service cannot benefit from a larger fraction and cannot operate with a smaller fraction than a rigid need. A fluid need specifies the maximum fraction of a resource that the service could use if alone on the server. The service cannot benefit from a larger fraction, but can operate with a smaller fraction than a fluid need if the cost is reduced.
The ratio between the allocated resource and the fluid resource need is known as the yield of the fluid resource need, and we call their value simply the service yield. Within a service, the utilizations of all resources corresponding to fluid needs are linearly correlated [9]. Therefore, the service yield of each service is able to present by a value between 0 and 1. In particular, if the service yield is 0, the service will not be allocated any resource (due to the procedure of resource allocation). If the service yield is equal to 1, the service will be allocated resource as requied. However, it should be considered the lower bound on the yield of a service, which determines by QoS requirement(s). This constraint is defined by a service's fluid need multiplied by the service's QoS requirement(s), which is socalled a constrained fluid need. It is assumed that rigid resource needs are completely independent from fluid resource needs.

Objective and constraints
Assume that each service is represented by a single VM instance which has a fixed resource (static case). Multidimention resource provisioning problem (MRSP) is formulated as follows:
Let S be services, i = 1, 2, , n; S > 0; Y be physical
i (ij (1 ik ) ik )rik xij 1, k, j (4)
and, object function is min j y j (5)
Constraint (1) defines the domain of the variables. Constraint (2) determines the state at which there exists a service i running on a physical machine j or not. Constraint

specifies the state at which a physical machine j is being used or not. onstraint (4) represents the state at which the fraction of total resource needs for service i is always less than or equal to the total resource of the physical machine j. The Eq. (4) implies that if resource need rik is fluid, then ik = 0 and the fraction of resource Dk used on the physical machine j is ij rik. If resource need rik is rigid, then ik = 1 and the fraction of resource Dk used on the physical machine j is rik.
Finally, constraint (5) is the objective function which denotes the number of physical machines used for providing resources to the virtual service. The objective is to minimize yj.


Computational Complexity
To determine the computational complexity of MRSP problem, lets consider MRSPDec, the decision problem associated with MRSP can be stated as: Is it possible to assign Si services, each of which has a resource needs rik to Yj physical machine?
Theorem 1. The decision problem MRSPDec is NPC.
Proof.

It can be clearly seen that the problem MRSPDec is NP. Because the solution, if it exists, can be verified in polynomial time.

Consider the vector packing problem presented in [5, 6] as NPC: Given a set A consisting of elements of the ddimensional vector represented by a dtuple: d

(ai , ai ,…, ai ) and set B consisting of elements of the d
i i j 1 2 d
machines having the same configuration, j = 1, 2, , m; Yj > 0. Each physical machine provides Dk types of resources, k = 1,, d. For each service i, rik denotes its resource need for resource type k, its value is between 0 and 1. We define a binary variable xij that is equal to 1 if
dimensional vector represented by dtuple: (1,1, …, 1), put the elements of the set A into the set B such that
iB j
ai 1,j 1,…,d .
service i runs on PM j and 0 otherwise. We use ik is a binary value that is equal 1 if rik is a rigid need, and 0 if rik
is a fluid need; ij is the yield of service i on a physical machines j; yi is the number of PMs which is used for providing resource to service i. The resource provisioning
A MRSP problem is reduced to vector packing
a
j
j
problem as follows: Let the number of physical machine be B (i.e., Y=B), the service be A (i.e., S=A), the number of resource types be j (i.e., k=j) and the resource need of service i of the resource k be i (i.e., rik= ai ). For each
service i, set ik = 0 (i.e., the fluid need is considered only,
for the rigid need, the proof will be similar). Clearly, the vector packing problem provides a solution to the MRSP Dec problem. In contrast, a solution of the MRSP Dec problem will provide a solution to the vector packing
physical machine Yj, such that:
i (ij (1 ik ) ik )rik xij 1 ,
k, j
problem. From (i) and (ii), Theorem 1 has been proofed.
As proofed above, this problem is a combinatorial optimization problem which is known as NPhard. To solve this kind of problem, several solutions have been proposed such as: The heuristic approaches applied for finding the best solution; The local searching employed to look for a local optimal solution; The approximation approaches by applying metahueristic algorithms. The next section presents the standard heuristic algorithms of the vector packing problem and a genetic algorithm for solving this problem.


SOLUTIONS BASED ON STANDARD ALGORITHMS OF VECTOR PACKING PROBLEM
The basic idea of the algorithms of vector packing problem applied to solve MRSP is that putting Dk dimension Si vectors into YJ (in this case the Si is virtual service, YJ is the number of physical machines and Dk is the types of resource). In this paper, we consider the standard algorithms: First Fit, Best Fit proposed in [5]. Before putting elements of Si into Yj, elements Si are sorted in a descending order based on the following criterias:

Lexicographical: Given k 1, k = { (S1,S2,,Sk),
i, 0 Si 1} and a, b k. a b iff a = b or the first nonzero component of ba is positive

Maximum Component: Given k 1, k = { (S1,S2,,Sk), i, 0 Si 1} and a, b k. a b iff the maximum component in b is not less than the maximum component of a.

Maximum Sum: Given k 1, k = { (S1,S2,,Sk),
i, 0 Si 1} and a, b k. a b iff the sum of components of b is not less than the sum of components of a.
Based on standard algorithms of vector packing problem, we construct algorithms to solve MRSP problem as follows:
Algorithm 1
Input: A set of services Si (each service has resource need rik corresponding to the type of needs ik) and the set of physical machines Yj corresponding service yield ij.
Output: A set of minimum physical machines yj
(respectively, xij = 1).
The steps of the algorithm:

Step 1: Based on the resource need rik, building
Dkdimensional vector Si with rik elements.

Step 2: Apply Criteria 1 to sort elements of vector
Si in a descending order.

Step 3: Apply the First Fit, Best Fit algorithms to place the elements of the vector Si into the

Step 4: If the resource needs have not completed then go to step 3.

Step 5: If the resource needs have been completed then the output is a set of physical machines Yj (this is the outcome of the objective function of the MRSP problem).
The combinatorial algorithms constructed from greedy algorithms, i.e., First Fit, Best Fit based on the 3 Criteria, therefore we have 6 algorithms including: BestFitDesSum, BestFitDesMax, BestFitDesLex, FirstFitDesSum, FirstFitDesMax vÃ FirstFitDesLex. If Dk is considered as a constant, those algorithms have the computational complexity of O(S.logY+S.Y).


SOLUTION BASED ON GENETIC ALGORITHM

Introduction to GA
GA [2, 8, 10, 11] is based on the law of biological evolution of life populations. Individuals pass a process of development and reproduction to the creation of new individual for the next generation. In the process of evolution, bad individuals (based on a given criteria, so called fitness) are eliminated. In contrast, qualified individuals are retained. Some concepts related to genetic algorithm are as follows:

Individual representation: It is the representation of individuals so that each individual is a solution of the current problem.

Fitness evaluation: It is the evaluation of the adaptability of each individual for a problem. The evaluation is based on the fitness function.

Crossover operation: It is the process of creating new individual based on current individual (called fathermother individual). Two child individual are generated by swapping from the fathermother individual genes.

Mutation: It is a process of creating new individual from a given individual by changing some of its genes.

Selection and Replacement: It is the process of selecting individuals from the current populations to create the next generation. In this process, if the individual has a greater than or equal criteria adaptation, this individual will be retained. Adaptability of individuals in a population is more complete more.

Terminate condition: A genetic algorithm is the random process, the algorithm can not be sure to stop after finite steps. Therefore, we typically have to define the terminate conditions for the algorithm.


GA for resource provisioning
As previously discussed, a genetic algorithm allows us to optimize the NPhard problems in certain time. This section focuses on GA applied for solving the resource provisioning problem for the virtual service as described in Section II.

Individual representation: Each individual is represented by a onedimensional integer array Si with length of n (Si: virtual service; i = 1, …, n), in which th ith element equal to j if service Si is allocated to the physical machine Yj(Yj: physical machine; j = 1, …, m).

Initial population generation: Generate P (P is a designed parameter) initial random individuals. An initial chromosome is obtained by randomly assigning a service Si to a physical machine Yj.

Fitness function: Let F (Y ) be the total percentage
the processes of initial, mutation, and crossover) may not be ineligible, socalled ineligible chromosome. But, They are still in use for new population in the proposed algorithm. After generating an ineligible chromosome, a greedy algorithm is applied to make it becoming an eligible chromosome. This algorithm goes through the physical machines in an arbitrary order, for any overloaded physical machines, the algoritm attempts to move services to others which have fewer loaded services. This approach has been practically evaluated, by which it helps to reduce the diversity of the chromosome population, and trends to an eligible resource provisioning.

Terminate condition: The algorithm will stop after G generation (G is a designed parameter) or the average value of the Fitness function of individuals reaches an converge to the expected value.
k j
of resource need Dk of the service Si that a physical machine Yj requires to supply for service Si:
k j ij ik ik ik ij
K
F (Y ) ( (1 ) )r x (6)
In summary, the genetic algorithm for resource provisioning consists of the following steps:
Algorithm 2
Input: A set of services Si (each service has resource
iYj
need rik corresponding to the type of needs ik) and the
where K (K> 1) is a exponental constant which expresses concentration on the physical machines compared to the lessfilled physical machines. The larger the K is, the quicker the algorithm converges. a few wellfilled physical machines are preferred to a collection about equally filled physical machines. But, leading to premature convergence ability of the algorithm. Experimetally, the optimal value of K is 2, i.e., K = 2. Therefore, the fitness function of the GA for the whole resource need Dk is identified as
d 1 m
set of physical machines Yj corresponding service yield ij.
Output: A set of minimum physical machines yj
(respectively, xij = 1).
The steps of the algorithm:
1: begin
2: t 0;
3: Initializing the first population P(t);
4: Calculating the fitness function of each chromosome in
population P(t);
5: until (the terminate conditions is not satisfied)
fVSMSA Fk (Yj ) (7)
m
k 1
j1
6: t t+1;
7: Selection of P(t) based on roulette mechanism
where m is the number of utilised physical machines.

Crossover Operation: We use a onepoint crossover operator by which two parent chromosomes are fragmented into two segments. Then two new chromosomes are generated by concatenating of chromosome segments fragmented from their parents. Crossover process is applied for all possible pairs of individuals with the probability of pc (pc is the designed parameter).

Mutation Operation: The mutation operation randomly swaps between two services of two different physical machines. Mutation process is applied for every individual with the probability of pm (pm is a design parameter).

Selection of individuals for the next generation: This is a selection P individuals for the next generation. A newly generated chromosome (after

and
fitness value at P(t1);
8: Onepoint crossover operation on P(t) to obtain
Q(t);
9: Mutation operation in the P(t) to obtain R(t); 10: Selection from P(t1)Q(t)R(t) to obtain P(t); 11: end until;
12: end


The computational complexity of the algorithm
The GA algorithm parameters are set before running. Let n be the number of services, m be the number of physical machines, and d be the dimensions of resource. The computational complexity of the algorithm is identified as

Generating the first population: P.O (n.m.d).

Calculating the fitness function: O (n.m.d).

Crossover operation: P.(P 1).O(n.m.d)=P2.O(n.m.d)

Mutation: P.O(n.m.d)

Selection: O (P.logP)
Therefore, the computational complexity of the algorithm is
P.O(n.m.d)+G.(P.(O(n.m.d))+P2.(O(n.m.d+P.logP+O(n.m.
d))+P.O(n.m.d)+O(P.logP))=G.P2.O(n.m.d).




NUMNERICAL RESULTS AND EVALUATIONS
A. Simulation setup
To evaluate the proposed algorithm, a set of experimental random instances are generated based on [9] as follows: Given S services and Y physical machines with D resource dimensions. For each service, the numbers of rigid need are D/2 and the numbers of fluid need are D/2. All resource needs are sampled from a normal probability distribution with mean Âµ and standard deviation . Each service has a QoS request with the probability of .
Assume that the value of the parameters are as follows: Service yield ij = 0.5, numbers of service S = 32, 64, 128, 256, 512, resource dimensions D = 6 (the numbers of rigid need = the numbers of fluid need =3), = 0.5, = 0.25, 0.5, 1.0, = 0.25, 0.5, 1.0 and all QoS requirements to be

(i.e., half of the services fluid needs must be met). Experiments results obtained by using other values of parameters are similar. This setup corresponds to 1Ã—5Ã—1Ã—1Ã—3Ã—3= 45 scenarios. For each scenario, 100 random samples are generated resulting in a total of 4500 input individual instances used for evaluation.
Two metrics were employed for evaluation:

The minimum number of physical machines that corresponding to the 5 values of services S = 32; 64; 128; 256; 512.

The average execution times of the algorithm in seconds. The values of the two metrics are averaged from 900 (i.e., 3 Ã— 3 Ã— 100) experimental instances. The algorithms were coded in C++ language and ran on an Intel Core Duo 1.86 GHz and 2 GB RAM.

We set the size of population P = 100, G = 2000 generations, the probability of mutation pm = 0.1, the probability of crossover pc = 0.25 and constant K = 2. These parameters are estimated empirically based on experience.
C. Simulation results and evaluations
The minimum values of the physical machine (objective function values) and the standard algorithm execution times of vector packing problem are presented in Table I and Table II. The results obtained by using the GA are presented in Table III.
TABLE I. COMPARISON OF MINIMUM PHYSICAL MACHINES
Algorithms
Number of services
S=32
S=64
S=128
S=256
S=512
FirstFitDesMax
24
47
90
174
344
FirstFitDesLex
24
47
90
174
344
FirstFitDesSum
24
47
90
174
344
BestFitDesMax
24
47
90
174
344
BestFitDesLex
24
47
89
170
324
BestFitDesSum
24
47
90
174
344
TABLE II. COMPARISON OF EXECUTION TIMES (S)
Algorithms
Number of services
S=32
S=64
S=18
S=256
S=512
FirstFitDesMax
0.00009
0.00107
0.00303
0.01076
0.03193
FirstFitDesLex
0.00010
0.00114
0.00214
0.00827
0.02529
FirstFitDesSum
0.00016
0.00113
0.00268
0.00932
0.02791
BestFitDesMax
0.00121
0.00254
0.00833
0.03741
0.13107
BestFitDesLex
0.00123
0.00232
0.00783
0.03500
0.12253
BestFitDesSum
0.00128
0.00234
0.00800
0.03693
0.13380
TABLE III. EXECUTION TIMES AND MINIMUM PHYSICAL MACHINES OF GA
Number of
services
S=32
S=64
S=128
S=256
S=512
Execution times
(s)
0.0010
0.00307
0.02076
0.04126
0.08930
Minimum physical machines
24
47
90
174
344
From results shown in Table III and the results obtained from 6 standard algorithms of vector packing problem in Table I and Table II. It is clearly seen that with the same set of data, the execution time of the GA is longer than others. However, in the cases of large number of services (512 services), execution times of GA are shorter than BestFitDesMax, BestFitDesLex, BestFitDesSum algorithms. The objective function values of (number of physical machines) all algorithms are similar (except for BestFixDesLex algorithm with a better value). Moreover, the execution times of the GA are short and therefore it can be applied in practice.


CONCLUSION
The paper has dicussed static resource provisioning based on a homogeneous shared hosting platform for virtual services with the optimal constraints and QoS requirements; each service is considered as a single virtual machine. On the basis of the optimal problem, we have proposed a GA to minimize the number of physical machines. The execution times of the GA are short and therefore it can be applied in practice. For the future work, the proposed model will be extended for heterogeneous platform.
REFERENCES

L. V. Sn, P. N. M. Nhut, Some issuses of providing virtual machines resources base on infrastructure of cloud computing, Journal of Science and Technology, Danang University, 5(11), (2012), pp. 6371. (chnh li tt c Ref. theo nh dng nÃ y !!)

T.T.T Hang, L.T. Vinh, H.C. Thanh, N.T. Thuy, Optimization for multiobjective scheduling in grid computing system, Journal of Computer Science and Cybernetics, Vietnam Academy of Science and Technology, 25 (1) (2009), pp. 7987.

B. Urgaonkar, P. Shenoy and T. Roscoe, Resource Overbooking and Application Profiling in Shared Hosting Platforms, SIGOPS Operating Systems Review, 36(SI), (2002) pp.239254.

M. Aron, P. Druschel and W. Zwaenepoel, Cluster Reserves: A Mechanism for Resource Management in Clusterbased Network Servers, In Proceedings of the 2000 ACM Sigmetrics International Conference on Measurement and Modeling of Computer Systems, New York, USA, (2000), pp.90101.

L. T. Kou, G. Markowsky, Multidimensional Bin Packing Algorithms, IBM Journal of Research and Development, 21(5) (1977) pp.443 – 448.

K. Maruyama, S. K. Chang and D. T. Tang, A general packing algorithm for multidimensional resource requirements, International Journal of Computer and Information Sciences, 6(2) (1977) 131149.

Henri Casanova, David Schanzenbach, Mark Stillwell, FrÃ©dÃ©ric Vivien, Resource provisioning using Virtual Clusters, Research Report No 200833, October 2008.

Christian Blum, Andrea Roli, Metaheuristics in Combinatorial Optimization: Overview and Conceptual Comparison, ACM Computing Surveys, 35(3), (2003), pp. 268308.

Mark Stillwell, David Schanzenbach, FrÃ©dÃ©ric Vivien, Henri Casanova, Resource provisioning algorithms for virtualized service hosting platforms, Journal Parallel Distrib. Comput., 70 (2010) pp.962974.

Holland J., Adaptation in natural and artificial systems, University of Michigan press, Ann Arbor, MI, MIT press, Cambridge, MA, (1975,1992).

Melanie Mitchel, An introduction to genetic algorithms, MIT Press Cambridge, MA, USA, 1996.