- Open Access
- Total Downloads : 251
- Authors : M. Y. Haggag, Heba Nagaty Mohamed
- Paper ID : IJERTV3IS21496
- Volume & Issue : Volume 03, Issue 02 (February 2014)
- Published (First Online): 08-03-2014
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Reliability Estimation and Analysis of DDL MYSQL Server by using Generalized Gamma and Weibull Distribution
M. Y. Haggag
Department of Mathematics, Faculty of Science, Al-Azhar University
Heba Nagaty Mohamed
Department of Mathematics, Faculty of Science, El Fayoum University
Abstract – The times between failures of the DDL MYSQL open source database server on two operating systems (Windows and Linux) are analyzed and compared. The purpose of this study is to estimate and compare the reliability of the DDL MYSQL server on these two operating systems by using the Generalized Gamma and Weibull distributions, which ranked best among the fitted distributions. The resulting reliability estimates for the two operating systems are evaluated and compared theoretically and graphically.
1. INTRODUCTION
Software reliability [1] is one of the important parameters of software quality. It is defined as the probability of failure-free software operation in a specified environment for a specified period of time. Earlier work has studied reliability estimation and analysis of the Linux kernel [3] and estimation and analysis of MYSQL database server reliability using the Beta and Generalized Gamma distributions [4].
The MYSQL database [5] has become the most popular open source database in the world because of its high performance, high reliability and ease of use. The purpose of this study is to compare the DDL database server on two operating systems (Windows and Linux), where DDL (Data Definition Language) [2] is a language used by a database management system (like MYSQL) that allows users to define the database and specify data types, structures and constraints on the data.
To find reliability, two types of data can be used: time between failures and fault count. In the case of time between failures, the input to the study is the intervals of successful operation. A probability distribution model whose parameters are estimated by an appropriate mathematical technique reflects the pattern of these intervals. In the case of fault count, the input is the number of faults in a specified period of time rather than the times between failures.
The rest of this paper is organized as follows. Section 2 provides some mathematical background on reliability estimation. Section 3 concentrates on bug collection, bug pre-processing and bug analysis. Section 4 discusses the goodness of fit test and parameter estimation by Maximum Likelihood Estimation. Section 5 presents the proposed methodology, which uses the Weibull++ tool [7] for mathematical and statistical calculation. Section 6 presents the reliability evaluation of the two operating systems. Finally, the conclusions and references are given.
2. BACKGROUND
Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment. There are two approaches to predicting software reliability: the early stage, where reliability is estimated during the design phase, and the later stage, where reliability is estimated during the operational stage.
Software reliability depends on the failure data of the software. Failure behaviour can be represented in various ways, such as the Probability Density Function (PDF) $f(t)$ and the Cumulative Distribution Function (CDF) $F(t)$, which is derived from the PDF and is given by equation (1):
$$F(t) = P(x \le t) = \int_{-\infty}^{t} f(x)\,dx \qquad (1)$$
Therefore $f(t)$ is the rate of change of $F(t)$. If the random variable $T$ denotes the failure time, then $F(t)$ is the probability that the system will fail by time $t$. $F(t)$ is thus the unreliability function, and the reliability function $R(t)$ is given by equation (2):
$$R(t) = 1 - \int_{-\infty}^{t} f(x)\,dx \qquad (2)$$
Another function that can be derived from the PDF is the failure rate function (hazard rate function), which is defined by equation (3):
$$\lambda(t) = \frac{f(t)}{R(t)} = \frac{f(t)}{1 - \int_{-\infty}^{t} f(x)\,dx} \qquad (3)$$
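Equations (1) to (3) can be illustrated numerically. The following is a minimal sketch, not part of the original study: it assumes a PDF supported on $[0, \infty)$ (so the lower integration limit is 0), uses a simple trapezoidal rule, and takes a hypothetical exponential PDF with a mean of 50 months purely as an example. The function names are illustrative.

```python
import math

def cdf_from_pdf(pdf, t, lower=0.0, steps=10_000):
    """Approximate F(t), the integral of pdf from `lower` to t (equation 1),
    with the trapezoidal rule. `lower` = 0 assumes a non-negative failure time."""
    if t <= lower:
        return 0.0
    h = (t - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(t))
    for i in range(1, steps):
        total += pdf(lower + i * h)
    return total * h

def reliability(pdf, t, lower=0.0):
    """R(t) = 1 - F(t) (equation 2)."""
    return 1.0 - cdf_from_pdf(pdf, t, lower)

def hazard(pdf, t, lower=0.0):
    """Failure rate f(t) / R(t) (equation 3)."""
    return pdf(t) / reliability(pdf, t, lower)

# Hypothetical example: exponential PDF with a mean of 50 months.
MEAN = 50.0

def exp_pdf(x):
    return math.exp(-x / MEAN) / MEAN

r_30 = reliability(exp_pdf, 30.0)
h_30 = hazard(exp_pdf, 30.0)
print(f"R(30) = {r_30:.6f}, hazard(30) = {h_30:.6f}")
```

For the exponential case the hazard rate is constant (1/mean), which gives a quick sanity check on the numerical integration.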
3. BUG COLLECTION, BUG PRE-PROCESSING AND BUG ANALYSIS
The approach to estimating the reliability of the two operating systems, Linux and Windows, consists of three steps:
- Bug collection: failure data are extracted directly from the web site http://www.mysql.Bugs.org. Bugs for Linux were collected from 5/7/2004 to 19/8/2013, and bugs for Windows from 14/4/2004 to 11/8/2013.
- Bug preprocessing: in this step noise is removed from the collected data.
- Bug analysis: the preprocessed data is stored in a MYSQL database, MYSQL being an open source database system.
Before applying the goodness of fit test to the data collected for each operating system, the bug frequency corresponding to the time to failure in months is plotted, as shown in Figure 1 and Figure 2. The total number of bugs recorded is 39 for Windows and 75 for Linux.
Figure 1. Monthly Bug Frequency of Windows (x-axis: month; y-axis: bug frequency)
Figure 2. Monthly Bug Frequency of Linux (x-axis: month; y-axis: bug frequency)
4. GOODNESS OF FIT TEST AND PARAMETER ESTIMATION
The goodness of fit test is used to identify which of the following distributions, the most commonly used life distributions, is suitable for the collected data:
- 1 and 2 parameter exponential distributions.
- 1, 2 and 3 parameter Weibull distributions.
- Normal distribution.
- Lognormal distribution.
- Generalized Gamma (G-Gamma) distribution.
- Logistic distribution.
- Log Logistic distribution.
- Gumbel distribution.
There are many methods for the goodness of fit test; for parameter estimation, Maximum Likelihood Estimation is considered the best method.
Maximum Likelihood Estimation
Maximum Likelihood Estimation [8] is used to estimate distribution parameters by maximizing the value of the likelihood function. The likelihood function is based on the probability density function (PDF) of a given distribution: if the PDF is $f(x_i; \theta_1, \theta_2, \ldots, \theta_k)$, where $x_i$ represents the data (times to failure) and $\theta_1, \theta_2, \ldots, \theta_k$ are the parameters to be estimated, then the likelihood function is given by equation (4):
$$L = \prod_{i=1}^{n} f(x_i; \theta_1, \theta_2, \ldots, \theta_k) \qquad (4)$$
where $n$ is the number of failure data points. Taking the logarithm gives the log-likelihood function, which is defined by equation (5):
$$\ln L = \sum_{i=1}^{n} \ln f(x_i; \theta_1, \theta_2, \ldots, \theta_k) \qquad (5)$$
Finally, the parameters are estimated by solving the following partial derivative equations, given by equation (6):
$$\frac{\partial \ln L}{\partial \theta_j} = 0, \quad j = 1, 2, \ldots, k \qquad (6)$$
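As a concrete instance of equations (4) to (6), the sketch below estimates the two parameters of a Weibull distribution (location fixed at 0) by solving the likelihood equations directly. It is an illustration under stated assumptions, not the Weibull++ procedure used in the study: the raw failure times are not listed in the paper, so a synthetic sample is generated, and the function name `weibull_mle` and the bisection bounds are choices made here. Setting the derivative of $\ln L$ with respect to the scale to zero gives the scale in closed form, which leaves one equation in the shape parameter that is solved by bisection.

```python
import math
import random

def weibull_mle(data, lo=0.01, hi=50.0, tol=1e-10):
    """Maximum likelihood estimates (shape, scale) of a 2-parameter Weibull.

    The shape solves  sum(x^b ln x) / sum(x^b) - 1/b - mean(ln x) = 0,
    which is increasing in b, so bisection on [lo, hi] is sufficient.
    """
    n = len(data)
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / n

    def score(shape):
        powers = [x ** shape for x in data]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / shape - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(x ** shape for x in data) / n) ** (1.0 / shape)
    return shape, scale

# Synthetic sample (assumption): 2000 draws with shape 2.32 and scale 54.57.
random.seed(0)
sample = [random.weibullvariate(54.57, 2.32) for _ in range(2000)]
shape_hat, scale_hat = weibull_mle(sample)
print(f"shape = {shape_hat:.3f}, scale = {scale_hat:.3f}")
```

With real data the same routine would be applied to the observed times between failures; a three-parameter fit additionally requires searching over the location parameter.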
5. THE PROPOSED METHODOLOGY
In this research the Weibull++ tool is used for mathematical and statistical calculation.
First: the parameters of every life distribution are estimated by maximum likelihood estimation and presented in Table (1) and Table (2) for Linux and Windows respectively.
Second: the log-likelihood value (LKV) is calculated using the estimated parameters and presented in Table (3) and Table (4) for Linux and Windows respectively.
Third: the distribution with the maximum LKV is considered the best fit to the given data.
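The third step amounts to sorting the candidate distributions by log-likelihood value. A minimal sketch, using the Windows values reported in Table 4 (the variable names are illustrative):

```python
# Log-likelihood values (LKV) for Windows, as reported in Table 4.
windows_lkv = {
    "Exponential 1": -190.9,
    "Exponential 2": -185.6,
    "Weibull 2": -180.79,
    "Weibull 3": -180.5,
    "Normal": -182.4,
    "Lognormal": -184.14,
    "G-Gamma": -178.6,
    "Gamma": -181.67,
    "Logistic": -184.28,
    "Log-Logistic": -184.68,
    "Gumbel": -184.58,
}

# The largest (least negative) LKV is ranked first.
ranked = sorted(windows_lkv.items(), key=lambda item: item[1], reverse=True)
best_name, best_lkv = ranked[0]

for rank, (name, lkv) in enumerate(ranked, start=1):
    print(f"{rank:2d}  {name:<14} {lkv:8.2f}")
```

Note that raw log-likelihood favours distributions with more parameters, so this ranking compares fit quality only and does not penalize model complexity.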
Table 1. Parameter Estimation for Linux
| Distribution | Parameters |
|---|---|
| Exponential 1 | mean = 48.742639 |
| Exponential 2 | mean = 46.8096, $\gamma$ = 1.933 |
| Weibull 2 | $\beta$ = 2.32103, $\eta$ = 54.57108 |
| Weibull 3 | $\beta$ = 6.167074, $\eta$ = 121.856165, $\gamma$ = -64.2574 |
| Normal | mean = 48.74263, std = 21.99186 |
| Lognormal | mean = 3.71399, std = 0.71288 |
| G-Gamma | $\mu$ = 4.222010, $\sigma$ = 0.255851, $\lambda$ = 2.583114 |
| Gamma | $\mu$ = 2.77007, k = 3.05409 |
| Logistic | $\mu$ = 49.90922, $\sigma$ = 12.8883 |
| Log-Logistic | $\mu$ = 3.8324, $\sigma$ = 0.351556 |
| Gumbel | $\mu$ = 59.25009, $\sigma$ = 18.792409 |
Table 2. Parameter Estimation for Windows
| Distribution | Parameters |
|---|---|
| Exponential 1 | mean = 43.56212 |
| Exponential 2 | mean = 38.16212, $\gamma$ = 5.400 |
| Weibull 2 | $\beta$ = 1.96948, $\eta$ = 49.16702 |
| Weibull 3 | $\beta$ = 1.89676, $\eta$ = 47.79413, $\gamma$ = 1.305 |
| Normal | mean = 43.56212, std = 23.41969 |
| Lognormal | mean = 3.58866, std = 0.67598 |
| G-Gamma | $\mu$ = 4.28027, $\sigma$ = 0.226218, $\lambda$ = 3.640866 |
| Gamma | $\mu$ = 2.72656, k = 2.85087 |
| Logistic | $\mu$ = 43.05427, $\sigma$ = 14.122369 |
| Log-Logistic | $\mu$ = 3.647028, $\sigma$ = 0.386143 |
| Gumbel | $\mu$ = 55.23509, $\sigma$ = 21.667332 |
Table 3. Log-Likelihood Value for Linux
| Distribution | LKV | Rank |
|---|---|---|
| G-Gamma | -332.51 | 1 |
| Weibull 3 | -335.93 | 2 |
| Gumbel | -336.9 | 3 |
| Normal | -337.7 | 4 |
| Logistic | -339.9 | 5 |
| Weibull 2 | -340 | 6 |
| Gamma | -347.1 | 7 |
| Log-Logistic | -353 | 8 |
| Lognormal | -359.08 | 9 |
| Exponential 2 | -363.4 | 10 |
| Exponential 1 | -366.4 | 11 |
Table 4. Log-Likelihood Value for Windows
| Distribution | LKV | Rank |
|---|---|---|
| G-Gamma | -178.6 | 1 |
| Weibull 3 | -180.5 | 2 |
| Weibull 2 | -180.79 | 3 |
| Gamma | -181.67 | 4 |
| Normal | -182.4 | 5 |
| Lognormal | -184.14 | 6 |
| Logistic | -184.28 | 7 |
| Gumbel | -184.58 | 8 |
| Log-Logistic | -184.68 | 9 |
| Exponential 2 | -185.6 | 10 |
| Exponential 1 | -190.9 | 11 |
From Tables (3) and (4) it is clear that the Generalized Gamma and Weibull distributions fit the data best and may be considered for reliability estimation.
We used the web site in [8] to determine the three-parameter Generalized Gamma distribution and the web site in [9] to determine the three-parameter Weibull distribution.
A. Construction of Reliability Model using the Weibull Distribution
The Probability Density Function of the Weibull distribution is given by:
$$f(T) = \frac{\beta}{\eta}\left(\frac{T-\gamma}{\eta}\right)^{\beta-1} e^{-\left(\frac{T-\gamma}{\eta}\right)^{\beta}}, \quad \beta > 0,\ \eta > 0 \qquad (7)$$
where
$\beta$ is the shape parameter, also known as the Weibull slope,
$\eta$ is the scale parameter,
$\gamma$ is the location parameter.
The Cumulative Distribution Function of the Weibull distribution is given by:
$$F(T) = 1 - e^{-\left(\frac{T-\gamma}{\eta}\right)^{\beta}} \qquad (8)$$
The Reliability Function of the Weibull distribution is given by:
$$R(T) = 1 - F(T) = e^{-\left(\frac{T-\gamma}{\eta}\right)^{\beta}} \qquad (9)$$
By substituting the parameter estimates from Table (1) and Table (2) into equations (7) and (9), we get:
$$f(T) = \frac{1.89676}{47.79413}\left(\frac{T-1.305}{47.79413}\right)^{0.89676} e^{-\left(\frac{T-1.305}{47.79413}\right)^{1.89676}} \quad \text{(Windows)}$$
$$f(T) = \frac{6.167074}{121.85616}\left(\frac{T+64.2574}{121.85616}\right)^{5.167074} e^{-\left(\frac{T+64.2574}{121.85616}\right)^{6.167074}} \quad \text{(Linux)} \qquad (10)$$
and
$$R(T) = e^{-\left(\frac{T-1.305}{47.79413}\right)^{1.89676}} \quad \text{(Windows)}$$
$$R(T) = e^{-\left(\frac{T+64.2574}{121.85616}\right)^{6.167074}} \quad \text{(Linux)} \qquad (11)$$
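The Weibull expressions in equations (10) and (11) are straightforward to evaluate. The sketch below is an illustration rather than the authors' Weibull++ workflow; it uses the three-parameter estimates from Tables 1 and 2, with function and variable names chosen here, and prints the PDF and reliability at a few months so that they can be compared with the Weibull columns of Tables 5 and 6.

```python
import math

def weibull3_pdf(t, beta, eta, gamma):
    """Three-parameter Weibull PDF (equation 7)."""
    z = (t - gamma) / eta
    if z <= 0:
        return 0.0  # no probability mass at or before the location parameter
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))

def weibull3_reliability(t, beta, eta, gamma):
    """Three-parameter Weibull reliability R(T) = exp(-((T - gamma)/eta)^beta)."""
    z = (t - gamma) / eta
    if z <= 0:
        return 1.0
    return math.exp(-(z ** beta))

# Maximum likelihood estimates reported in Table 2 (Windows) and Table 1 (Linux).
WINDOWS = dict(beta=1.89676, eta=47.79413, gamma=1.305)
LINUX = dict(beta=6.167074, eta=121.856165, gamma=-64.2574)

for month in (5, 30, 60, 88):
    print(
        f"month {month:2d}: "
        f"Windows f={weibull3_pdf(month, **WINDOWS):.6f} "
        f"R={weibull3_reliability(month, **WINDOWS):.6f} | "
        f"Linux f={weibull3_pdf(month, **LINUX):.6f} "
        f"R={weibull3_reliability(month, **LINUX):.6f}"
    )
```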
B. Construction of Reliability Model using Generalized Gamma Distribution:
The Probability Density Function of the Generalized Gamma distribution is given by:
$$f(T) = \frac{\beta}{\Gamma(k)\,\theta}\left(\frac{T}{\theta}\right)^{k\beta-1} e^{-\left(\frac{T}{\theta}\right)^{\beta}}, \quad \theta > 0,\ \beta > 0,\ k > 0 \qquad (12)$$
where $\beta$ and $k$ are the shape parameters and $\theta$ is the scale parameter. The Gamma function has the formula:
$$\Gamma(x) = \int_{0}^{\infty} t^{x-1} e^{-t}\,dt$$
However, Weibull++ uses a reparameterization with parameters $\mu$, $\sigma$ and $\lambda$, as shown in the following:
- $\mu = \ln\theta + \frac{1}{\beta}\ln\left(\frac{1}{\lambda^{2}}\right)$ is the location parameter.
- $\sigma = \frac{1}{\beta\sqrt{k}}$, $\sigma > 0$, is the scale parameter.
- $\lambda = \frac{1}{\sqrt{k}}$ is the shape parameter. (13)
The Cumulative Distribution Function of the Generalized Gamma distribution is given by:
$$F(T) = \frac{\gamma\left(k, \left(\frac{T}{\theta}\right)^{\beta}\right)}{\Gamma(k)} \qquad (14)$$
where $\gamma(k, x) = \int_{0}^{x} t^{k-1} e^{-t}\,dt$ is the lower incomplete gamma function.
The Reliability Function of the Generalized Gamma distribution is given by:
$$R(T) = 1 - \frac{\gamma\left(k, \left(\frac{T}{\theta}\right)^{\beta}\right)}{\Gamma(k)} \qquad (15)$$
By substituting the parameter estimates from Table (1) and Table (2) into equations (13), we get the following values for the two operating systems:
$k = 0.0754383$, $\beta = 16.094475$, $\theta = 84.8469$ (Windows)
$k = 0.149869$, $\beta = 10.096149$, $\theta = 82.269615$ (Linux)
Substituting these values into equation (12) gives equations (16):
$$f(T) = \frac{16.094475}{\Gamma(0.0754383)\cdot 84.8469}\left(\frac{T}{84.8469}\right)^{(0.0754383)(16.094475)-1} e^{-\left(\frac{T}{84.8469}\right)^{16.094475}} \quad \text{(Windows)}$$
$$f(T) = \frac{10.096149}{\Gamma(0.149869)\cdot 82.269615}\left(\frac{T}{82.269615}\right)^{(0.149869)(10.096149)-1} e^{-\left(\frac{T}{82.269615}\right)^{10.096149}} \quad \text{(Linux)} \qquad (16)$$
and substituting them into equation (15) gives equations (17):
$$R(T) = 1 - \frac{\gamma\left(0.0754383, \left(\frac{T}{84.8469}\right)^{16.094475}\right)}{\Gamma(0.0754383)} \quad \text{(Windows)}$$
$$R(T) = 1 - \frac{\gamma\left(0.149869, \left(\frac{T}{82.269615}\right)^{10.096149}\right)}{\Gamma(0.149869)} \quad \text{(Linux)} \qquad (17)$$
6. RELIABILITY EVALUATION:
It is clear from the goodness of fit section that the distributions most appropriate for the collected sample are the Generalized Gamma distribution with three parameters and the Weibull distribution with three parameters.
In this section the PDF and reliability of the three-parameter Generalized Gamma distribution and the Weibull distribution are evaluated in Tables (5) and (6) by using equations (10), (11), (16) and (17), and the corresponding graphs of the PDF and reliability for each distribution are shown in Figures (3), (4), (5) and (6).
Table (5) The PDF of Weibull Distribution and G-Gamma Distribution for Windows & Linux
| Month No. | Windows f(t) G-Gamma | Windows f(t) Weibull | Linux f(t) G-Gamma | Linux f(t) Weibull |
|---|---|---|---|---|
| 5 | 0.008114516 | 0.003965289 | 0.0046843 | 0.002649 |
| 10 | 0.009412931 | 0.008275655 | 0.006685 | 0.003735 |
| 15 | 0.010266748 | 0.011784083 | 0.008231 | 0.00511 |
| 20 | 0.010919108 | 0.014449878 | 0.0095402 | 0.006786 |
| 25 | 0.01145353 | 0.016241065 | 0.0106974 | 0.008749 |
| 30 | 0.011909544 | 0.0171763 | 0.0117461 | 0.010935 |
| 35 | 0.012309227 | 0.017326563 | 0.0127111 | 0.013224 |
| 40 | 0.012666219 | 0.016805063 | 0.0136055 | 0.015426 |
| 45 | 0.012989341 | 0.015752378 | 0.0144304 | 0.017289 |
| 50 | 0.013283552 | 0.01432049 | 0.015166 | 0.018526 |
| 55 | 0.013547534 | 0.012658195 | 0.015759 | 0.018864 |
| 60 | 0.01376302 | 0.010899465 | 0.0160854 | 0.018121 |
| 65 | 0.013862474 | 0.009155594 | 0.0159207 | 0.016285 |
| 70 | 0.013647303 | 0.007511239 | 0.0149169 | 0.013558 |
| 75 | 0.012631968 | 0.006023932 | 0.012689 | 0.010341 |
| 80 | 0.009967771 | 0.004726287 | 0.009142 | 0.007134 |
| 85 | 0.005317035 | 0.00362999 | 0.0049902 | 0.004388 |
| 88 | 0.002481168 | 0.003067554 | 0.0028354 | 0.003082 |
| Average | 0.011030166 | 0.010342088 | 0.011102 | 0.010346 |
Figure 3. PDF of Weibull Distribution for Windows & Linux (x-axis: month number; y-axis: PDF f(t))
Figure 4. PDF of G-Gamma Distribution for Windows & Linux (x-axis: month number; y-axis: PDF f(t))
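The G-Gamma values can be checked in the same way. The sketch below is a hedged illustration, not the Weibull++ implementation: it converts the reported $(\mu, \sigma, \lambda)$ estimates to $(k, \beta, \theta)$ by inverting equations (13), then evaluates the PDF of equation (12) and the reliability of equation (15). The regularized lower incomplete gamma function is computed with a power series written for this example (adequate for the small arguments that occur here); all function names are assumptions.

```python
import math

def to_k_beta_theta(mu, sigma, lam):
    """Invert equations (13): (mu, sigma, lambda) -> (k, beta, theta)."""
    k = 1.0 / lam ** 2
    beta = lam / sigma  # from sigma = 1 / (beta * sqrt(k)) with sqrt(k) = 1 / lambda
    theta = math.exp(mu - math.log(1.0 / lam ** 2) / beta)
    return k, beta, theta

def reg_lower_gamma(k, x, max_terms=500):
    """Regularized lower incomplete gamma P(k, x) = gamma(k, x) / Gamma(k),
    using the series x^k e^-x * sum_n x^n / Gamma(k + n + 1)."""
    if x <= 0:
        return 0.0
    term = 1.0 / math.gamma(k + 1.0)
    total = term
    for n in range(1, max_terms):
        term *= x / (k + n)
        total += term
        if term < 1e-16 * total:
            break
    return math.exp(k * math.log(x) - x) * total

def ggamma_pdf(t, k, beta, theta):
    """Generalized Gamma PDF (equation 12)."""
    z = t / theta
    return beta / (math.gamma(k) * theta) * z ** (k * beta - 1) * math.exp(-(z ** beta))

def ggamma_reliability(t, k, beta, theta):
    """Generalized Gamma reliability (equation 15)."""
    return 1.0 - reg_lower_gamma(k, (t / theta) ** beta)

# (mu, sigma, lambda) estimates from Table 2 (Windows) and Table 1 (Linux).
WINDOWS = to_k_beta_theta(4.28027, 0.226218, 3.640866)
LINUX = to_k_beta_theta(4.222010, 0.255851, 2.583114)

for name, params in (("Windows", WINDOWS), ("Linux", LINUX)):
    k, beta, theta = params
    print(f"{name}: k={k:.7f} beta={beta:.6f} theta={theta:.4f} "
          f"f(5)={ggamma_pdf(5, *params):.6f} R(5)={ggamma_reliability(5, *params):.6f}")
```

For larger arguments a continued-fraction expansion or a library routine for the incomplete gamma function would be a more robust choice than the plain series.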
Table 6. The Reliability of Weibull and G-Gamma Distribution for Windows & Linux
| Month No. | Windows R(t) G-Gamma | Windows R(t) Weibull | Linux R(t) G-Gamma | Linux R(t) Weibull |
|---|---|---|---|---|
| 5 | 0.966583 | 0.99224524 | 0.984521 | 0.969795 |
| 10 | 0.922473 | 0.96130479 | 0.95582 | 0.953951 |
| 15 | 0.87316 | 0.91081577 | 0.918403 | 0.931965 |
| 20 | 0.820134 | 0.84486975 | 0.873899 | 0.90235 |
| 25 | 0.764164 | 0.76777766 | 0.823253 | 0.863623 |
| 30 | 0.705729 | 0.68388866 | 0.767105 | 0.814486 |
| 35 | 0.645162 | 0.59732532 | 0.705931 | 0.754098 |
| 40 | 0.582708 | 0.51174416 | 0.640111 | 0.68239 |
| 45 | 0.518556 | 0.43016062 | 0.56999 | 0.600403 |
| 50 | 0.452862 | 0.35485219 | 0.495953 | 0.510544 |
| 55 | 0.38577 | 0.28733889 | 0.418559 | 0.416648 |
| 60 | 0.317465 | 0.22842969 | 0.3388 | 0.323717 |
| 65 | 0.248327 | 0.17831797 | 0.258517 | 0.237271 |
| 70 | 0.179349 | 0.13670636 | 0.180988 | 0.162362 |
| 75 | 0.113162 | 0.10294244 | 0.111402 | 0.102512 |
| 80 | 0.0557885 | 0.07614977 | 0.0563443 | 0.058935 |
| 85 | 0.0169778 | 0.0553431 | 0.0210462 | 0.030398 |
| 88 | 0.00544105 | 0.04531424 | 0.00943495 | 0.019264 |
| Average | 0.476322853 | 0.43184461 | 0.507226525 | 0.492029 |
Figure 5. Comparison between the reliabilities of Windows and Linux using the Weibull Distribution (x-axis: month number; y-axis: reliability)
Figure 6. Comparison between the reliabilities of Windows and Linux using the G-Gamma Distribution (x-axis: month number; y-axis: reliability)
7. CONCLUSIONS:
This study presented a detailed methodology for estimating the reliability of the DDL MYSQL server on two operating systems (Windows and Linux), analyzed using the two best-fitting distributions: the Weibull distribution and the G-Gamma distribution.
The average values of the Probability Density Function and the reliability under the Weibull and G-Gamma distributions were calculated for each operating system. They show that Linux is more reliable than Windows under both distributions: the average reliability in Table 6 is 0.5072 (G-Gamma) and 0.4920 (Weibull) for Linux, against 0.4763 (G-Gamma) and 0.4318 (Weibull) for Windows.
REFERENCES:
[1] Lyu, Michael R. "Handbook of Software Reliability Engineering." (1996).
[2] IEEE Reliability Society, IEEE Recommended Practice on Software Reliability, IEEE Std 1633-2008, June 2008.
[3] Sanjeev Kumar Jha, A. K. D. Dwivedi, Amod Tiwari. "Reliability Estimation and Analysis of Linux Kernel." IJCST Vol. 2, Issue 2 (2011).
[4] Jha, Sanjeev Kumar, Pankaj Kumar, and A. K. D. Dwivedi. "Estimation and Analysis of MYSQL Database Server Reliability using Beta and Generalized Gamma Distribution." (IJCET) Journal Impact Factor 3.2 (2012): pp. 354-371.
[5] Operating Systems Source of Bugs: http://bugs.mysql.com
[6] F. S. G. Richards, "A Method of Maximum-likelihood Estimation," http://www.jstor.org/pss/2984037
[7] Weibull++, Reliability Function, [Online] Available: http://www.weibull.com and http://www.reliasoft.com
[8] en.wikipedia.org/wiki/Generalized_gamma_distribution
[9] http://reliawiki.org/index.php/The_Weibull_Distribution
[10] H. Pham, Software Reliability. Springer Verlag, 2000.
[11] Gavin E. Crooks (2010), The Amoroso Distribution, Technical Note, Lawrence Berkeley National Laboratory.
[12] Guo, Huairui, Jin, Tongdan, and Mettas, Adamantios. "Design Reliability Demonstration Tests for One-Shot Systems Under Zero Component Failure," IEEE Transactions on Reliability, Vol. 60, No. 1, pp. 286-294, (March 2011).