 Open Access
 Total Downloads : 407
 Authors : B. A Ganesh, Rama Janaki Ramireddy, N. K.Santosh, S. Ram Prasad Reddy
 Paper ID : IJERTV3IS10252
 Volume & Issue : Volume 03, Issue 01 (January 2014)
Published (First Online): 23-01-2014
ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Performance Evaluation of Linear Regression and Neural Networks on Forecasting Numerical Data Sets
1 B. A. Ganesh, Asst. Professor, Department of Computer Science and Engineering,
Vignan's Institute of Engineering for Women, Visakhapatnam.
2 Rama Janaki Ramireddy, Asst. Professor, Department of CSE,
Vignan's Institute of Engineering for Women, Visakhapatnam.
3 N. K. Santosh, Asst. Professor, Department of Computer Science and Engineering,
Vignan's Institute of Engineering for Women, Visakhapatnam.
4 S. Ram Prasad Reddy, Assoc. Professor, Department of CSE,
Vignan's Institute of Engineering for Women, Visakhapatnam.
Abstract
Data mining deals with knowledge discovery from large repositories. It is the confluence of different subjects such as statistics, machine learning, and artificial intelligence. Data mining provides several functionalities; prediction is one of them, and it can be performed using regression techniques and neural networks. This work is a study of the performance of linear regression and neural network models on baby-weight forecasting. The neural network was constructed using the back-propagation algorithm, with one input node, one output node, and one hidden layer. The comparative analysis was carried out on different data sets based on the correlation of the data.
Keywords: linear regression, neural networks, performance evaluation

INTRODUCTION
Linear regression is an approach to modeling the relationship between a scalar dependent variable y and one or more explanatory variables denoted x. The case of one explanatory variable is called simple linear regression; for more than one explanatory variable, it is termed multiple linear regression. The linear regression model can be represented in the form

Y = a + bx + e,

where x is the independent variable and the "residual" e is a random variable with mean zero. The coefficients a and b are determined by the condition that the sum of the squared residuals is as small as possible.
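As an illustration, the coefficients a and b can be estimated with the closed-form least-squares formulas; this is a minimal sketch in pure Python with illustrative data (not the paper's data set):

```python
# Simple linear regression y = a + b*x fitted by least squares.
# The data values below are illustrative only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.1, 7.9, 11.2, 13.8, 17.1]

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

# Closed-form least-squares estimates of slope b and intercept a
b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
a = (sy - b * sx) / n

print(a, b)          # fitted intercept and slope
print(a + b * 6.0)   # prediction for a new x value
```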
Figure 1
A neural network is a collection of input and output units. The connections between the units have associated weights. During the learning stage, the weights are adjusted so that the correct class label can be predicted; hence this is also referred to as connectionist learning. The algorithm used to train the network in this system is the back-propagation algorithm.
Figure 2: Simple Neural Network Model

Related work
The literature so far comprises papers bringing out the comparison of these two models on specific applications, such as a comparison on defect color data on CRT color displays [1], a comparison on rainfall data [2], and a comparison on traffic accident data [3]. The comparisons were carried out using statistical measures such as MSE (mean squared error), R2 values, and correlation coefficients. This research was conducted on general numerical data by considering the correlation among the attribute values.

Datasets and methods:

Baby weight data set
For forecasting the baby weight at different stages, gestation period (in weeks) and weight (in ounces) are considered as attributes. Gestation is the period during which an embryo develops (about 266 days in humans). The average human gestation length is taken as 40 weeks. Childbirth occurring anywhere between 37 and 42 weeks is considered normal; a baby born before 37 weeks is called pre-term, and a baby born after 42 weeks is called post-term.
| Attribute name   | Measure  |
|------------------|----------|
| Gestation period | in weeks |
| Weight           | ounce    |


Temperature data set
In the temperature data set, Celsius and Fahrenheit are the attributes, and the following is the conversion formula from Celsius to Fahrenheit:

°C × 9/5 + 32 = °F
| Attribute name | Measure                |
|----------------|------------------------|
| Temperature    | Celsius and Fahrenheit |
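As a minimal check of the conversion formula, a one-line Python helper (illustrative, not part of the paper's implementation):

```python
def c_to_f(celsius: float) -> float:
    """Convert Celsius to Fahrenheit using F = C * 9/5 + 32."""
    return celsius * 9.0 / 5.0 + 32.0

print(c_to_f(100))  # 212.0
print(c_to_f(101))  # 213.8, matching the first row of Table 1
```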

Input and Output variable choice:
In the case of the neural network, we considered one input and one output: gestation period in weeks and temperature in Celsius are given as input in the respective cases, and the output variables are weight and temperature in Fahrenheit. Similarly, in the case of linear regression, we took gestation period and temperature in Celsius as the independent variables, and weight and temperature in Fahrenheit as the dependent variables.

BackPropagation Learning:
The back-propagation algorithm has emerged as the workhorse for the design of a special class of layered feed-forward networks known as multilayer perceptrons (MLPs) [4]. A multilayer perceptron has an input layer of source nodes and an output layer of neurons (i.e., computation nodes); these two layers connect the network to the outside world. In addition to these two layers, the multilayer perceptron usually has one or more layers of hidden neurons, so called because these neurons are not directly accessible. The hidden neurons extract important features contained in the input data. The training of an MLP is accomplished using the back-propagation (BP) algorithm.
Back-propagation learns by iteratively processing a set of training samples, comparing the network's prediction for each sample with the actual known value. For each training sample, the weights are modified so as to minimize the mean squared error between the network's prediction and the actual value. These modifications are made in the backwards direction, that is, from the output layer, through each hidden layer, down to the first hidden layer.

Initialize the weights:

The weights in the network are initialized to small random numbers (e.g., ranging from -1.0 to 1.0, or from -0.5 to 0.5). Each unit has a bias associated with it; the biases are similarly initialized to small random numbers.
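A minimal sketch of this initialisation step (the layer sizes and range are illustrative):

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def init_layer(n_inputs: int, n_units: int, scale: float = 0.5):
    """Initialise weights and biases to small random numbers in [-scale, scale]."""
    weights = [[random.uniform(-scale, scale) for _ in range(n_inputs)]
               for _ in range(n_units)]
    biases = [random.uniform(-scale, scale) for _ in range(n_units)]
    return weights, biases

# e.g. one input unit feeding a 2-unit hidden layer
w, b = init_layer(1, 2)
print(w, b)
```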

Propagate the inputs forward
In this step, the net input and output of each unit in the hidden and output layers are computed [4]. First, the training sample is fed to the input layer of the network. Note that for a unit j in the input layer, its output is equal to its input, that is, Oj = Ij. The net input to each unit in the hidden and output layers is computed as a linear combination of its inputs; the inputs to the unit are, in fact, the outputs of the units connected to it in the previous layer. To compute the net input, each input connected to the unit is multiplied by its corresponding weight, the products are summed, and the bias is added. Given a unit j in a hidden or output layer, the net input Ij to unit j is

Ij = Σi wij Oi + θj,

where wij is the weight of the connection from unit i in the previous layer to unit j, Oi is the output of unit i from the previous layer, and θj is the bias of the unit. The bias acts as a threshold in that it serves to vary the activity of the unit.
Each unit in the hidden and output layers takes its net input and then applies an activation function to it [4]. The function symbolizes the activation of the neuron represented by the unit; the sigmoid function is used here. Given the net input Ij to unit j, the output Oj of unit j is computed as

Oj = 1 / (1 + e^(-Ij))

This function is also referred to as a squashing function, since it maps a large input domain onto the smaller range 0 to 1. The logistic function is nonlinear and differentiable, allowing the model to classify problems that are not linearly separable.
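The forward pass described above (net input followed by the sigmoid) can be sketched as follows; the weights, biases, and input value are illustrative:

```python
import math

def sigmoid(x: float) -> float:
    """Logistic squashing function mapping any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, weights, biases):
    """Net input I_j = sum_i w_ij * O_i + theta_j, then the sigmoid activation."""
    outputs = []
    for w_row, theta in zip(weights, biases):
        net = sum(w * o for w, o in zip(w_row, inputs)) + theta
        outputs.append(sigmoid(net))
    return outputs

# One input unit feeding two hidden units (illustrative parameters)
hidden = forward([0.8], [[0.4], [-0.2]], [0.1, 0.3])
print(hidden)
```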

Back propagate the error
The error is propagated backwards by updating the weights and biases to reflect the error of the network's prediction [4]. For a unit j in the output layer, the error Errj is computed by

Errj = Oj (1 - Oj) (Tj - Oj),

where Oj is the actual output of unit j, Tj is the true output, and Oj (1 - Oj) is the derivative of the logistic function.
The error of a hidden-layer unit j is

Errj = Oj (1 - Oj) Σk Errk wjk,

where wjk is the weight of the connection from unit j to a unit k in the next higher layer, and Errk is the error of unit k.
The weights and biases are updated to reflect the propagated errors. Weights are updated by the following equations, where Δwij is the change in weight wij:

Δwij = (l) Errj Oi
wij = wij + Δwij

The variable l is the learning rate, a constant typically having a value between 0.0 and 1.0. Back-propagation uses gradient descent to search for a set of weights that models the given classification problem, minimizing the mean squared distance between the network's class prediction and the actual class label of the samples. The learning rate helps to avoid getting stuck at a local minimum in decision space and encourages finding the global minimum. If the learning rate is too small, learning occurs at a very slow pace; if it is too large, oscillation between inadequate solutions may occur. A rule of thumb is to set the learning rate to 1/t, where t is the number of iterations through the training set so far.
Biases are updated by the following equations, where Δθj is the change in bias θj:

Δθj = (l) Errj
θj = θj + Δθj
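Putting the forward pass, the error back-propagation, and the weight/bias update rules together, a minimal one-input, one-hidden-layer, one-output network with case updating might look like the sketch below. The layer size, learning rate, epoch count, and toy data are all illustrative assumptions, not the authors' implementation:

```python
import math, random

random.seed(1)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, n_hidden=2, lr=0.5, epochs=5000):
    """Back-propagation with one input, one hidden layer, one output (case updating)."""
    w_h = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]   # input -> hidden
    b_h = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_o = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]   # hidden -> output
    b_o = random.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for x, t in samples:
            # Propagate the inputs forward
            h = [sigmoid(w * x + b) for w, b in zip(w_h, b_h)]
            o = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            # Back-propagate the error
            err_o = o * (1 - o) * (t - o)                 # Err_j = O_j(1-O_j)(T_j-O_j)
            err_h = [hj * (1 - hj) * err_o * w            # Err_j = O_j(1-O_j) * sum_k Err_k w_jk
                     for hj, w in zip(h, w_o)]
            # Update weights and biases after each sample (case updating)
            w_o = [w + lr * err_o * hj for w, hj in zip(w_o, h)]
            b_o += lr * err_o
            w_h = [w + lr * e * x for w, e in zip(w_h, err_h)]
            b_h = [b + lr * e for b, e in zip(b_h, err_h)]
    def predict(x):
        h = [sigmoid(w * x + b) for w, b in zip(w_h, b_h)]
        return sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
    return predict

# Toy regression target scaled into (0, 1): f(x) = 0.1 + x/2 on [0, 1]
data = [(x / 10.0, 0.1 + x / 20.0) for x in range(11)]
net = train(data)
print(net(0.5))  # should be close to 0.35
```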
Weights and biases are updated after the presentation of each sample; this is referred to as case updating. Alternatively, the weight and bias increments could be accumulated in variables, so that the weights and biases are updated after all of the samples in the training set have been presented. This latter strategy is called epoch updating, where a single iteration through the training set is an epoch. In theory, the mathematical derivation of back-propagation employs epoch updating.

Terminating condition
Training stops when

- all Δwij in the previous epoch were so small as to be below some specified threshold, or
- the percentage of samples misclassified in the previous epoch is below some threshold, or
- a pre-specified number of epochs has expired.

Linear regression
In linear regression, data are modeled using a straight line. Linear regression is the simplest form of regression. Bivariate linear regression models a random variable Y (called the response variable) as a linear function of another random variable X (called the predictor variable), that is,

Y = α + βX,

where the variance of Y is assumed to be constant, and α and β are regression coefficients specifying the Y-intercept and slope of the line, respectively. These coefficients can be solved for by the method of least squares, which minimizes the error between the actual data and the estimate of the line. The least-squares estimators a and b for α and β, respectively, are:

b = (n Σxy - Σx Σy) / (n Σx² - (Σx)²)
a = (Σy - b Σx) / n

where the x, y values represent the attribute values.

Correlation and regression analysis are related in the sense that both deal with relationships among variables. The correlation coefficient is a measure of linear association between two variables; its value always lies between -1 and +1.

Correlation coefficient:

r = (n Σxy - Σx Σy) / √([n Σx² - (Σx)²][n Σy² - (Σy)²])

where X, Y are the corresponding attributes.

- If the value is +1, all the data points lie perfectly along a straight line with positive slope.
- If the value is -1, all the data points again lie perfectly along a straight line, but with negative slope.
- A value close to either +1 or -1 signifies clustering of the data points around a straight line.
- If the value is 0, it indicates the absence of a linear relationship.

Research approach and objective:
We consider the hypothetical statement "The correlation among the independent and dependent attributes influences the technique adopted for predicting future values" and verify it with the help of the temperature data set. Since the correlation coefficient between Celsius and Fahrenheit is exactly one, linear regression performs better than neural networks on that data. However, not all attributes and data sets have a correlation coefficient of exactly one; it ranges from -1 to +1. This raises the question of how the technique adopted for prediction performs on other data sets. Since, in the real data set, gestation period and baby weight are correlated but do not have a correlation coefficient of exactly one, a comparative analysis of the two techniques, linear regression and neural networks, was carried out on forecasting baby weight in order to evaluate their performance.

In this work we conducted two experiments. The first experiment was designed to show that correlation influences the technique adopted for forecasting; the second was conducted to evaluate the performance of the adopted techniques. The results were verified and tabulated.
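The correlation coefficient that drives the choice between the two techniques can be computed directly from the formula above; this is a small sketch in pure Python with illustrative values:

```python
import math

def pearson_r(xs, ys):
    """r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))

# Celsius/Fahrenheit pairs are exactly linear, so r should be 1
celsius = [101.0, 102.0, 103.0, 104.0, 105.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]
print(pearson_r(celsius, fahrenheit))  # 1.0 (up to floating-point rounding)
```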


Experimental Results
Experiment 1: comparing the linear regression output with the neural network output for the temperature data set, whose r = 1.

Table 1: Celsius vs Fahrenheit temperature

| Celsius | Fahrenheit | Linear regression output | Neural network output | Difference (LR vs actual) | Difference (NN vs actual) |
|---------|------------|--------------------------|-----------------------|---------------------------|---------------------------|
| 101     | 213.8      | 213.8                    | 208.65                | 1.53E-05                  | 5.15                      |
| 102     | 215.6      | 215.6                    | 209.89                | 1.53E-05                  | 5.71                      |
| 103     | 217.4      | 217.4                    | 211.1                 | 3.05E-05                  | 6.3                       |
| 104     | 219.2      | 219.2                    | 212.3                 | 3.05E-05                  | 6.9                       |
| 105     | 221        | 221                      | 213.48                | 3.05E-05                  | 7.52                      |
| 106     | 222.8      | 222.8                    | 214.64                | 3.05E-05                  | 8.16                      |
| 107     | 224.6      | 224.6                    | 215.78                | 1.53E-05                  | 8.82                      |
| 108     | 226.4      | 226.4                    | 216.91                | 0                         | 9.49                      |
| 109     | 228.2      | 228.2                    | 218.01                | 1.53E-05                  | 10.19                     |
| 110     | 230        | 230                      | 219.1                 | 1.53E-05                  | 10.9                      |
| 111     | 231.8      | 231.8                    | 220.17                | 1.53E-05                  | 11.63                     |
| 112     | 233.6      | 233.6                    | 221.22                | 3.05E-05                  | 12.38                     |
| 113     | 235.4      | 235.4                    | 222.25                | 1.53E-05                  | 13.15                     |
| 114     | 237.2      | 237.2                    | 223.27                | 1.53E-05                  | 13.93                     |
| 115     | 239        | 239                      | 224.26                | 1.53E-05                  | 14.74                     |

Table 1 shows that when the correlation coefficient is one, the difference between the linear regression output and the actual output value is practically 0. Hence linear regression predicts the output value more accurately.

Result analysis for the temperature data set:
From the plotted figures 1(a) and 1(b), it is clear that for the data set whose correlation coefficient is exactly one, both techniques work well, but linear regression performs better than neural networks.
Figure 1(a)
Figure 1(b)
Experiment 2: comparing the linear regression output with the neural network output for the baby weight data set, whose r ≠ 1.
When the correlation coefficient between the attribute values is not equal to one, the comparison is as shown in Table 2.
Table 2: Gestation period vs weight

| Gestation period (weeks) | Weight (ounces) | Linear regression output | Neural network output | Difference (LR vs actual) | Difference (NN vs actual) |
|--------------------------|-----------------|--------------------------|-----------------------|---------------------------|---------------------------|
| 21                       | 14              | 30.44                    | 17.9                  | 16.4473                   | 35.59                     |
| 22                       | 15              | 35.20                    | 16.2                  | 20.20884                  | 37.42                     |
| 26                       | 39              | 54.25                    | 41.3                  | 15.25499                  | 25.91                     |
| 32                       | 93.5            | 82.82                    | 86.7                  | 10.67577                  | 6.8                       |
| 33                       | 92              | 87.58                    | 90.62                 | 4.41423                   | 1.38                      |
| 35                       | 99.1            | 92.34                    | 94.6                  | 17.01397                  | 19.266                    |
| 36                       | 106             | 97.10                    | 98.64                 | 2.002266                  | 0.4711                    |
| 37                       | 112             | 101.8                    | 102.7                 | 4.52961                   | 3.68                      |
| 38                       | 110             | 106.6                    | 106.8                 | 5.653786                  | 5.4357                    |
| 39                       | 119             | 111.3                    | 111                   | 0.960121                  | 0.5666                    |
| 40                       | 123             | 116.1                    | 115.1                 | 3.147331                  | 4.1223                    |
| 41                       | 121             | 120.9                    | 119.3                 | 2.352699                  | 3.8992                    |
| 42                       | 132             | 125.6                    | 123.5                 | 4.223549                  | 2.1054                    |
| 43                       | 121             | 130.4                    | 127.7                 | 1.893723                  | 4.5833                    |
| 44                       | 100             | 135.2                    | 131.9                 | 14.20116                  | 10.94                     |

Table 2 shows that the difference between the neural network output and the actual value is smaller when compared to its counterpart. The network architecture for this neural network is 2-2-1.

Result analysis for the baby weight data set:
From the plotted figures 2(a) and 2(b), it is observed that for this data set, whose correlation coefficient is not exactly one, the neural network performs better than linear regression.
Figure 2(a) Figure 2(b)


Conclusion
From both experiments we can conclude that the neural network performs well and gives good predictions in all cases, irrespective of the correlation among the attributes of the data sets, whereas linear regression produces good predictions only when the correlation value is one.

References

[1] Mauridhi Hery Purnomo, Toshio Asano, Eiji Shimizu, "A Comparative Study of Neural Network Approach and Linear Regression for Analysis of Multivariate Data of the Defect Color on the Color CRT Displays," Mem. Fac. Eng., Osaka City Univ., Vol. 38, pp. 15-22.

[2] A. El-Shafie, M. Mukhlisin, Ali A. Najah and M. R. Taha, "Performance of Artificial Neural Network and Regression Techniques for Rainfall-Runoff Prediction," International Journal of the Physical Sciences, Vol. 6(8), pp. 1997-2003.

[3] Galal A. Ali and Charles S. Bakheit, "Comparative Analysis of Traffic Accidents in Sudan Using Neural Networks and Statistical Methods."

[4] M. R. Narasinga Rao, G. R. Sridhar, K. Madhu, A. A. Rao, "A Clinical Decision Support System Using Multilayer Perceptron Neural Network to Assess Well-being in Diabetes," Journal of Association of Physicians of India, Volume 57.