 Open Access
 Total Downloads : 813
 Authors : M.V Subbarao, B.Revanth, D.Udaykumar
 Paper ID : IJERTV2IS100401
 Volume & Issue : Volume 02, Issue 10 (October 2013)
 Published (First Online): 12-10-2013
 ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
MRI Brain Image Classification Using Probabilistic Neural Network and Tumor Detection Using Clustering Technique
1M.V SubbaRao Assistant Professor
2B.Revanth
Assistant Professor
3D.UdayKumar
Assistant Professor
4B.CH.S.N.L.S Saibaba Assistant Professor
5K.RupendraSingh
Assistant Professor
Abstract: This paper proposes an automatic support system that classifies the tumor stage using a Probabilistic Neural Network (PNN) and detects brain tumors through clustering methods, for medical applications. Detecting a brain tumor is a challenging problem because of the structure of the tumor cells. The paper presents a segmentation method, the K-means clustering algorithm, for segmenting magnetic resonance (MR) images to detect a brain tumor in its early stages [5]. An artificial neural network is used to classify the stage of the brain tumor as benign, malignant or normal [7]. The manual analysis of the sputum samples is time consuming, inaccurate and requires intensively trained personnel to avoid diagnostic errors [2]. The segmentation results will be used as the basis for a Computer Aided Diagnosis (CAD) system for early detection of brain tumors, which will improve the patient's chances of survival. The experimental results show that the clustering-based segmentation results are more accurate and reliable than thresholding and clustering methods in all cases [6]. A Probabilistic Neural Network with image and data processing techniques was employed to implement automated brain tumor classification. Decision making was performed in two stages: feature extraction using the GLCM and classification using the PNN [1]. The performance of the PNN classifier was evaluated in terms of training performance and classification accuracy. The PNN gives faster and more accurate classification than other neural networks and is a promising tool for the classification of tumors [3].
Keywords: Probabilistic Neural Network, Clustering, classification, Segmentation.

Introduction
Automated classification and detection of tumors in medical images is motivated by the need for high accuracy when dealing with human life. Computer assistance is also in demand in medical institutions because it can improve the results of human readers in a domain where the false negative rate must be very low. It has been shown that double reading of medical images can lead to better tumor detection, but the cost of double reading is very high, which is why good software to assist humans in medical institutions is of great interest nowadays. Conventional methods of monitoring and diagnosing diseases rely on a human observer detecting the presence of particular features. Because of the large number of patients in intensive care units and the need for continuous observation of such conditions, several techniques for automated diagnostic systems have been developed in recent years to address this problem. Such techniques work by transforming the mostly qualitative diagnostic criteria into a more objective, quantitative feature classification problem [1].

Methodology
In this paper, the automated classification of brain magnetic resonance images using prior knowledge such as pixel intensity and some anatomical features is proposed [1]. Currently no method is widely accepted, so automatic and reliable methods for tumor detection are of great need and interest. The application of PNNs to the classification of MR image data has not yet been fully exploited [5]. This includes clustering and classification techniques, especially for MR image problems that involve large amounts of data and consume considerable time and energy when done manually. Thus, fully understanding the recognition, classification and clustering techniques is essential to the development of neural network systems, particularly for medical problems. Segmentation of brain tissue into gray matter, white matter and tumor on medical images is not only of high interest for serial treatment monitoring of disease burden in oncologic imaging, but is also gaining popularity with the advance of image-guided surgical approaches [6]. Outlining the brain tumor contour is a major step in planning spatially localized radiotherapy (e.g., CyberKnife, IMRT), and in current clinical practice it is usually done manually on contrast-enhanced T1-weighted magnetic resonance images (MRI). On T1 MR images acquired after administration of a contrast agent (gadolinium), blood vessels and the parts of the tumor where the contrast agent can pass the blood-brain barrier are observed as hyperintense areas. There are various attempts at brain tumor segmentation in the literature, which use a single modality, combine multiple modalities, or use priors obtained from population atlases.

Existing System
There are some approaches for image segmentation:

Thresholding and

Manual analysis
The simplest method of image segmentation is the thresholding method. It uses a clip level (or threshold value) to turn a grayscale image into a binary image. The key to this method is selecting the threshold value (or values, when multiple levels are used). Several popular methods are used in industry, including the maximum entropy method, Otsu's method (maximum variance), and k-means clustering. Recently, methods have been developed for thresholding computed tomography (CT) images. The key idea is that, unlike Otsu's method, the thresholds are derived from the radiographs instead of the (reconstructed) image.
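As an illustration of threshold selection, the following is a minimal sketch of Otsu's method (maximum between-class variance) in Python with NumPy. It is not the implementation used in this paper; the function name `otsu_threshold`, the 256-level assumption and the tiny test image are all hypothetical.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the grey level t that maximizes the between-class variance.

    Pixels with value > t are treated as foreground.
    Assumes a non-negative integer image with values below `levels`.
    """
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    prob = hist / hist.sum()                      # grey-level probabilities
    omega = np.cumsum(prob)                       # probability of class 0 for each t
    mu = np.cumsum(prob * np.arange(levels))      # cumulative mean up to t
    mu_total = mu[-1]

    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(levels)
    valid = denom > 0                             # skip thresholds with an empty class
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Hypothetical two-level image: dark background (50) and bright region (200).
img = np.array([[50, 50, 200, 200],
                [50, 50, 200, 200]], dtype=np.uint8)
t = otsu_threshold(img)
binary = img > t                                  # binary image from the clip level
```

Any threshold between the two grey levels separates this toy image equally well; on real MR images the histogram is rarely this clean, which is one reason thresholding alone can give inaccurate results.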
A. Drawbacks

Difficult to get accurate results

Not applicable for multiple images for Tumor detection in a short time

Magnetic resonance images contain noise caused by operator performance, which can lead to serious inaccuracies in classification [5].


Proposed System
MRI brain image classification and tumor detection is proposed based on:

Probabilistic Neural Network for classification.

Clustering algorithm (K-means) for effective segmentation.

Segmentation
Segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as super pixels)[6]. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics.
The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image (see edge detection). Each of the pixels in a region is similar with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristic(s). When applied to a stack of images, as is typical in medical imaging, the contours resulting from image segmentation can be used to create 3D reconstructions with the help of interpolation algorithms such as marching cubes [6].

It can segment the Brain regions from the image accurately.

It is useful to classify the Brain Tumor images for accurate detection.

Brain Tumor will be detected in its early stages.


Clustering
Clustering can be considered the most important unsupervised learning problem; it deals with finding structure in a collection of unlabeled data. A cluster is therefore a collection of objects that are similar to one another and dissimilar to the objects belonging to other clusters [10].
Clustering algorithms may be classified as listed below

Exclusive Clustering

Overlapping Clustering

Hierarchical Clustering

Probabilistic Clustering
In the first case, data are grouped in an exclusive way, so that if a certain datum belongs to a definite cluster it cannot be included in another cluster. The second type, overlapping clustering, instead uses fuzzy sets to cluster data, so that each point may belong to two or more clusters with different degrees of membership. In this case, data are associated with an appropriate membership value. A hierarchical clustering algorithm is based on the union of the two nearest clusters [10]. The starting condition is obtained by setting every datum as a cluster. After a few iterations it reaches the final clusters wanted.


KMeans Clustering
Cluster analysis, an important technology in data mining, is an effective method of analyzing and discovering useful information from large amounts of data. A clustering algorithm groups the data into classes or clusters so that objects within a cluster have high similarity to one another but are very dissimilar to objects in other clusters [10]. Dissimilarities are assessed based on the attribute values describing the objects. Often, distance measures are used. As a branch of statistics and an example of unsupervised learning, clustering provides an exact and subtle analysis tool from the mathematical point of view. The K-means algorithm is a popular partition method in cluster analysis. The most widely used clustering error criterion is the squared-error criterion, which can be defined as

$$J_c = \sum_{j=1}^{K} \sum_{x_k \in C_j} \lVert x_k - m_j \rVert^2$$

where J_c is the sum of squared errors for all objects in the database, x_k is the point in space representing a given object, and m_j is the mean of cluster C_j. Under the squared-error criterion, K-means works well when the clusters are compact clouds that are rather well separated from one another, but it is not suitable for discovering clusters with non-convex shapes or clusters of very different sizes. In attempting to minimize the squared-error criterion, it will divide the objects in one cluster into two or more clusters. To address the dependency on initial conditions and the limitation of a K-means algorithm that uses the squared-error criterion to measure the quality of clustering, this paper presents an improved K-means algorithm based on the techniques of multi-sampling and once-clustering to search for the optimal initial values of the cluster centers. Our experimental results demonstrate that the new algorithm obtains better stability and outperforms the original K-means in clustering results.

Pseudo Code For K-Means
In this section, we briefly describe the original K-means algorithm.
Original K-means(S, K), S = {x1, x2, ..., xn}.
Input: the number of clusters K and a dataset containing n objects (xi).
Output: a set of K clusters Cj that minimize the squared-error criterion.
Begin
    m = 1;
    Initialize K prototypes Zj, j ∈ [1, K];
    repeat
        for i = 1 to n do
        Begin
            for j = 1 to K do
                compute D(Xi, Zj) = |Xi - Zj|;
            if D(Xi, Zj) = min{ D(Xi, Zj) } then Xi ∈ Cj;
        end;
        if m = 1 then Jc(m) = Σ(j=1..K) Σ(Xi ∈ Cj) |Xi - Zj|^2;
        m = m + 1;
        for j = 1 to K do
            Zj = (1/nj) Σ(i=1..nj) xi(j);
        Jc(m) = Σ(j=1..K) Σ(Xi ∈ Cj) |Xi - Zj|^2;
    until |Jc(m) - Jc(m-1)| < ε
End
Here nj is the number of objects in cluster Cj, xi(j) denotes the i-th object assigned to cluster Cj, and ε is a small convergence threshold.
The computational complexity of the original K-means algorithm is O(ndk), where n is the total number of objects, k is the number of clusters, and d is the dimension of the dataset.
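The original K-means loop described above can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the improved multi-sampling variant proposed in this paper: the function name `kmeans`, the random choice of initial prototypes, the tolerance `eps`, the iteration cap and the toy data are all hypothetical.

```python
import numpy as np

def kmeans(X, k, eps=1e-6, max_iter=100, seed=0):
    """Plain K-means: assign, update prototypes, stop when J changes by < eps."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initial prototypes Z are k distinct objects picked at random.
    Z = X[rng.choice(len(X), size=k, replace=False)].copy()
    prev_J = np.inf
    for _ in range(max_iter):
        # Distance of every object to every prototype, shape (n, k).
        dist = np.linalg.norm(X[:, None, :] - Z[None, :, :], axis=2)
        labels = dist.argmin(axis=1)              # nearest prototype wins
        for j in range(k):
            members = X[labels == j]
            if len(members):                      # keep old prototype if cluster is empty
                Z[j] = members.mean(axis=0)
        J = ((X - Z[labels]) ** 2).sum()          # squared-error criterion
        if abs(prev_J - J) < eps:
            break
        prev_J = J
    return labels, Z, J

# Hypothetical data: two compact, well-separated groups of three points.
points = np.array([[0, 0], [0, 1], [1, 0],
                   [10, 10], [10, 11], [11, 10]])
labels, centers, J = kmeans(points, k=2)
```

For image segmentation the rows of `X` would be pixel feature vectors (for example, grey-level intensities reshaped to an n x 1 array), and the returned labels would be reshaped back to the image size.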

Algorithm Flow diagram


System Architecture
As the complexity of systems increases, the specification of the system decomposition becomes critical. Moreover, the subsystem decomposition is constantly revised whenever new issues are addressed: subsystems are merged into a single subsystem, a complex subsystem is split into parts, and some subsystems are added to take care of new functionality.
Texture Analysis
Texture is the innate property of all surfaces that describes visual patterns, each having properties of homogeneity. It contains important information about the structural arrangement of a surface, such as clouds, leaves, bricks or fabric. It also describes the relationship of the surface to the surrounding environment. In short, it is a feature that describes the distinctive physical composition of a surface.
Texture properties include:

Coarseness

Contrast

Directionality

Line-likeness

Regularity

Roughness

Texture is one of the most important defining features of an image. It is characterized by the spatial distribution of gray levels in a neighborhood [8]. In order to capture the spatial dependence of gray-level values, which contributes to the perception of texture, a two-dimensional dependence texture analysis matrix is taken into consideration. This two-dimensional matrix is obtained by decoding the image file (JPEG, BMP, etc.).

Figure: System block diagram (Brain Tumor Image, Feature Extraction, Database Images, Feature Extraction & PNN Training, Trained Probabilistic Neural Network, Classification, If Abnormal, Clustering Technique, Tumor Detection)

Modules description

GLCM Feature Extraction.

PNN Training and Classification.

Clustering Method for Tumor Detection.

Methods of Representation
There are three principal approaches used to describe texture: statistical, structural and spectral.

Statistical techniques characterize textures using the statistical properties of the grey levels of the points/pixels comprising a surface image. Typically, these properties are computed using the grey-level co-occurrence matrix of the surface or the wavelet transformation of the surface.

Structural techniques characterize textures as being composed of simple primitive structures called texels (or texture elements). These are arranged regularly on a surface according to some surface arrangement rules.

Spectral techniques are based on properties of the Fourier spectrum and describe the global periodicity of the grey levels of a surface by identifying high-energy peaks in the Fourier spectrum.
For optimum classification purposes, what concerns us are the statistical techniques of characterization [1], because it is these techniques that result in computing texture properties. The most popular statistical representations of texture are:

Co-occurrence Matrix

Tamura Texture

Wavelet Transform

Co-occurrence Matrix
Originally proposed by R.M. Haralick, the co-occurrence matrix representation of texture features explores the grey-level spatial dependence of texture [2]. A mathematical definition of the co-occurrence matrix is as follows [4]:



Given a position operator P(i,j),

let A be an n x n matrix

whose element A[i][j] is the number of times that points with grey level (intensity) g[i] occur, in the position specified by P, relative to points with grey level g[j].

Let C be the n x n matrix that is produced by dividing A by the total number of point pairs that satisfy P. C[i][j] is a measure of the joint probability that a pair of points satisfying P will have values g[i], g[j].

C is called a co-occurrence matrix defined by P.
Examples for the operator P are: i above j, or i one position to the right and two below j, etc.
This can also be illustrated as follows. Let t be a translation; then a co-occurrence matrix C_t of a region is defined for every pair of grey levels (a, b) by [1]:
$$C_t(a,b) = \operatorname{card}\{(s, s+t) \in R^2 \mid A[s] = a,\ A[s+t] = b\}$$
Here, C_t(a, b) is the number of site couples, denoted by (s, s + t), that are separated by a translation vector t, with a being the grey level of s and b being the grey level of s + t.
For example, with an 8 grey-level image representation and a vector t that considers only one neighbor, we would find [1]:
Figure: Classical co-occurrence matrix
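The counting rule behind C_t(a, b) can be sketched directly in Python with NumPy. This is a minimal, unoptimized illustration: the function name `cooccurrence`, the (row, column) convention for t and the small 3 x 3 test image are assumptions made for the example and do not reproduce the figure from the paper.

```python
import numpy as np

def cooccurrence(image, t=(0, 1), levels=8):
    """Count site couples (s, s + t) with grey levels (a, b).

    t is a (row, column) translation; (0, 1) means the right-hand neighbor.
    """
    dr, dc = t
    rows, cols = image.shape
    C = np.zeros((levels, levels), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:   # s + t must stay inside the region
                C[image[r, c], image[r2, c2]] += 1
    return C

# Hypothetical image with grey levels in 0..7.
img = np.array([[0, 0, 1],
                [1, 2, 2],
                [0, 1, 2]])
C = cooccurrence(img, t=(0, 1), levels=8)
P = C / C.sum()          # normalized matrix: joint probability of each grey-level pair
```

Dividing by the total number of pairs gives the normalized matrix from the first definition, whose entries sum to one.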
First, the co-occurrence matrix is constructed based on the orientation and distance between image pixels. Then meaningful statistics are extracted from the matrix as the texture representation. Haralick proposed the following texture features:

Energy

Contrast

Correlation

Homogeneity

Entropy
Hence, for each Haralick texture feature, we obtain a co-occurrence matrix. These co-occurrence matrices represent the spatial distribution and the dependence of the grey levels within a local area. Each (i,j)th entry in the matrices represents the probability of going from one pixel with a grey level of 'i' to another with a grey level of 'j' under a predefined distance and angle. From these matrices, sets of statistical measures, called feature vectors, are computed.
Energy: A texture measure of how uniform the image is, reflecting the uniformity of the grey-level distribution and the coarseness of the texture.
$$E = \sum_{x} \sum_{y} p(x,y)^2$$
where p(x,y) is the GLCM.
Contrast: The moment of inertia about the main diagonal. It measures how the values of the matrix are distributed and the amount of local variation in the image, reflecting the image clarity and the depth of the texture grooves.
$$I = \sum_{x} \sum_{y} (x-y)^2\, p(x,y)$$
Entropy: It measures the randomness of the image texture. When all values of the co-occurrence matrix are equal, it reaches its maximum value.
$$S = -\sum_{x} \sum_{y} p(x,y) \log p(x,y)$$
Correlation Coefficient: Measures the joint probability of occurrence of the specified pixel pairs.
$$\text{Correlation} = \sum_{x} \sum_{y} \frac{(x-\mu_x)(y-\mu_y)\, p(x,y)}{\sigma_x \sigma_y}$$
where μ_x, μ_y and σ_x, σ_y are the means and standard deviations of the row and column marginals of p(x,y).
Homogeneity: Measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal.
$$\text{Homogeneity} = \sum_{x} \sum_{y} \frac{p(x,y)}{1 + |x-y|}$$
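The five texture features can be computed from a co-occurrence matrix with a few lines of NumPy. The sketch below assumes the natural logarithm for entropy and a matrix with non-zero variance along both axes (otherwise the correlation is undefined); the function name `glcm_features` and the two small test matrices are hypothetical.

```python
import numpy as np

def glcm_features(glcm):
    """Energy, contrast, entropy, correlation and homogeneity of a GLCM."""
    p = np.asarray(glcm, dtype=float)
    p = p / p.sum()                               # normalize to joint probabilities
    x, y = np.indices(p.shape)                    # row and column grey-level indices

    energy = (p ** 2).sum()
    contrast = ((x - y) ** 2 * p).sum()
    nonzero = p[p > 0]                            # 0 * log(0) is taken as 0
    entropy = -(nonzero * np.log(nonzero)).sum()

    mu_x, mu_y = (x * p).sum(), (y * p).sum()
    sigma_x = np.sqrt(((x - mu_x) ** 2 * p).sum())
    sigma_y = np.sqrt(((y - mu_y) ** 2 * p).sum())
    correlation = ((x - mu_x) * (y - mu_y) * p).sum() / (sigma_x * sigma_y)

    homogeneity = (p / (1.0 + np.abs(x - y))).sum()
    return {"energy": energy, "contrast": contrast, "entropy": entropy,
            "correlation": correlation, "homogeneity": homogeneity}

uniform = glcm_features(np.ones((2, 2)))          # all entries equal
diagonal = glcm_features(np.eye(2))               # all mass on the main diagonal
```

The uniform matrix gives the largest entropy a 2 x 2 matrix can have (log 4), while the diagonal matrix gives zero contrast and a homogeneity of one, matching the descriptions above.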

Discrete Wavelet Transform (DWT)
In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information (location in time).
Types of DWT
There are two types of DWT. They are


One dimensional DWT(1D DWT)

Two Dimensional DWT(2D DWT)
One Dimensional DWT (1D)
The DWT of a signal x is calculated by passing it through a series of filters. First the samples are passed through a low-pass filter with impulse response g, resulting in a convolution of the two:
$$y[n] = (x * g)[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[n-k]$$
The signal is also decomposed simultaneously using a high-pass filter h. The outputs give the detail coefficients (from the high-pass filter) and the approximation coefficients (from the low-pass filter). It is important that the two filters are related to each other; they are known as a quadrature mirror filter.
However, since half the frequencies of the signal have now been removed, half the samples can be discarded according to Nyquist's rule. The filter outputs are then subsampled by 2 (Mallat's and the common notation is the opposite, g high-pass and h low-pass):
$$y_{\text{low}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[2n-k]$$
$$y_{\text{high}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[2n-k]$$
This decomposition has halved the time resolution, since only half of each filter output characterizes the signal. However, each output has half the frequency band of the input, so the frequency resolution has been doubled.
Figure: Block diagram of filter analysis

Two Dimensional DWT (2D)

2D Transform Hierarchy
The generic form for a two-dimensional (2D) wavelet transform is shown in the figure.
Figure: 2D Wavelet Decomposition
The 1D wavelet transform can be extended to a two-dimensional (2D) wavelet transform using separable wavelet filters. With separable filters, the 2D transform can be computed by applying a 1D transform to all the rows of the input and then repeating it on all of the columns.
Figure: Subband labeling scheme for a one-level 2D wavelet transform (subbands LL1, HL1, LH1, HH1)
The subband labeling of a one-level (K = 1) 2D wavelet transform, with the corresponding notation, is shown in the one-level figure. The example is repeated for a three-level (K = 3) wavelet expansion in the three-level figure. Throughout the discussion, K represents the highest level of decomposition of the wavelet transform.
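The filter-and-subsample step, and its separable extension to images by rows and then columns, can be sketched as follows. The text does not name a wavelet, so the Haar filter pair is assumed here purely for illustration; the function names `analysis_step` and `dwt2_one_level` are hypothetical. The sketch keeps the odd-indexed samples of the full convolution so that a two-tap filter pairs the inputs as (x0, x1), (x2, x3), and so on; this is one possible choice of subsampling phase.

```python
import numpy as np

# Haar filters, assumed for illustration only.
G_LOW = np.array([1.0, 1.0]) / np.sqrt(2.0)     # low-pass g
H_HIGH = np.array([1.0, -1.0]) / np.sqrt(2.0)   # high-pass h

def analysis_step(x, g=G_LOW, h=H_HIGH):
    """One 1D decomposition level: convolve with g and h, then subsample by 2."""
    x = np.asarray(x, dtype=float)
    approx = np.convolve(x, g)[1::2]            # approximation coefficients
    detail = np.convolve(x, h)[1::2]            # detail coefficients
    return approx, detail

def dwt2_one_level(image, g=G_LOW, h=H_HIGH):
    """One 2D level: 1D step on every row, then on every column of each result."""
    image = np.asarray(image, dtype=float)
    rows = [analysis_step(row, g, h) for row in image]
    low = np.array([r[0] for r in rows])        # horizontally low-passed band
    high = np.array([r[1] for r in rows])       # horizontally high-passed band

    def vertical(band):
        cols = [analysis_step(col, g, h) for col in band.T]
        return (np.array([c[0] for c in cols]).T,
                np.array([c[1] for c in cols]).T)

    LL, LH = vertical(low)
    HL, HH = vertical(high)
    return LL, LH, HL, HH

signal = [4, 6, 10, 12, 8, 6, 5, 5]
approx, detail = analysis_step(signal)          # each has half the input length
LL, LH, HL, HH = dwt2_one_level(np.ones((4, 4)))
```

Applying `dwt2_one_level` again to the returned `LL` band would give the next decomposition level.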
Figure: Subband labeling scheme for a three-level 2D wavelet transform
The 2D subband decomposition is just an extension of the 1D subband decomposition. The entire process is carried out by executing the 1D subband decomposition twice, first in one direction (horizontal), then in the orthogonal (vertical) direction. For example, the low-pass subband (Li) resulting from the horizontal direction is further decomposed in the vertical direction, leading to the LLi and LHi subbands. Similarly, the high-pass subband (Hi) is further decomposed into HLi and HHi. After one level of transform, the image can be further decomposed by applying the 2D subband decomposition to the existing LLi subband. This iterative process results in multiple transform levels. In the three-level subband labeling figure, the first level of transform results in LH1, HL1, and HH1, in addition to LL1, which is further decomposed into LH2, HL2, HH2 and LL2 at the second level, and the information of LL2 is used for the third-level transform. The subband LLi is a low-resolution subband, and the high-pass subbands LHi, HLi and HHi are the horizontal, vertical, and diagonal subbands respectively, since they represent the horizontal, vertical, and diagonal residual information of the original image.

Probabilistic Neural Networks (PNN):
Probabilistic Neural Networks (PNN) and General Regression Neural Networks (GRNN) have similar architectures, but there is a fundamental difference: probabilistic networks perform classification, where the target variable is categorical, whereas general regression neural networks perform regression, where the target variable is continuous. If you select a PNN/GRNN network, DTREG will automatically select the correct type of network based on the type of target variable [3].
G. Architecture of a PNN
Figure: Architecture of a PNN
All PNN networks have four layers:

Input layer: There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N-1 neurons are used, where N is the number of categories. The input neuron (or processing before the input layer) standardizes the range of the values by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer [3].

Hidden layer: This layer has one neuron for each case in the training data set. The neuron stores the values of the predictor variables for the case along with the target value. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function using the sigma value(s). The resulting value is passed to the neurons in the pattern layer [3].

Pattern layer / Summation layer: The next layer in the network is different for PNN networks and for GRNN networks. For PNN networks there is one pattern neuron for each category of the target variable. The actual target category of each training case is stored with each hidden neuron; the weighted value coming out of a hidden neuron is fed only to the pattern neuron that corresponds to the hidden neuron's category. The pattern neurons add the values for the class they represent (hence, it is a weighted vote for that category) [3].
For GRNN networks, there are only two neurons in the pattern layer. One neuron is the denominator summation unit; the other is the numerator summation unit. The denominator summation unit adds up the weight values coming from each of the hidden neurons. The numerator summation unit adds up the weight values multiplied by the actual target value for each hidden neuron.

Decision layer: The decision layer is different for PNN and GRNN networks.

For PNN networks, the decision layer compares the weighted votes for each target category accumulated in the pattern layer and uses the largest vote to predict the target category.
For GRNN networks, the decision layer divides the value accumulated in the numerator summation unit by the value in the denominator summation unit and uses the result as the predicted target value [4].
H. How PNN networks work
Although the implementation is very different, probabilistic neural networks are conceptually similar to K-Nearest Neighbor (k-NN) models. The basic idea is that the predicted target value of an item is likely to be about the same as that of other items that have close values of the predictor variables [3]. Consider this figure:
Assume that each case in the training set has two predictor variables, x and y. The cases are plotted using their x,y coordinates as shown in the figure. Also assume that the target variable has two categories, positive which is denoted by a square and negative which is denoted by a dash. Now, suppose we are trying to predict the value of a new case represented by the triangle with predictor values x=6, y=5.1. Should we predict the target as positive or negative?
Notice that the triangle is positioned almost exactly on top of a dash representing a negative value. But that dash is in a fairly unusual position compared to the other dashes, which are clustered below the squares and left of center. So it could be that the underlying negative value is an odd case.
The nearest neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and only the closest point is considered, then clearly the new point should be classified as negative, since it is on top of a known negative point. On the other hand, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may overbalance the close negative point.
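The effect of the neighborhood size can be reproduced with a small k-NN majority vote. The coordinates below are hypothetical, chosen only to mimic the situation described (one negative case almost under the new point, surrounded by eight positive cases); they are not the data from the figure, and the function name `knn_predict` is an assumption.

```python
import numpy as np
from collections import Counter

def knn_predict(X, y, query, k):
    """Majority vote among the k training cases closest to the query point."""
    dist = np.linalg.norm(np.asarray(X, dtype=float) - np.asarray(query), axis=1)
    nearest = np.argsort(dist)[:k]                # indices of the k closest cases
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training cases (x, y) and their target categories.
X_train = [(6, 5),                                 # the odd negative case
           (5, 5), (7, 5), (6, 6), (6, 4),         # surrounding positives
           (5, 6), (7, 6), (5, 4), (7, 4),
           (1, 1), (2, 1), (1, 2)]                 # the main negative cluster
y_train = ["negative"] + ["positive"] * 8 + ["negative"] * 3

new_case = (6, 5.1)
pred_1nn = knn_predict(X_train, y_train, new_case, k=1)
pred_9nn = knn_predict(X_train, y_train, new_case, k=9)
```

With these points the 1-NN vote follows the single closest negative case, while the 9-NN vote is decided by the eight surrounding positive cases.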
A probabilistic neural network builds on this foundation and generalizes it to consider all of the other points. The distance is computed from the point being evaluated to each of the other points, and a radial basis function (RBF) (also called a kernel function) is applied to the distance to compute the weight (influence) for each point. The radial basis function is so named because the radius distance is the argument to the function [9].
Weight = RBF (distance)
The further some other point is from the new point, the less influence it has.
Radial Basis Function
Different types of radial basis functions could be used, but the most common is the Gaussian function.
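Putting the layers together, a PNN prediction can be sketched as a Gaussian-weighted vote over all training cases. The sketch assumes the common Gaussian form exp(-d^2 / (2 sigma^2)) with a single sigma shared by all cases and omits the input standardization step; the function name `pnn_predict` and the data are hypothetical, and this is not the DTREG implementation.

```python
import numpy as np

def pnn_predict(X_train, y_train, x, sigma=1.0):
    """Classify x by summing Gaussian RBF weights per target category."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    # Hidden layer: squared Euclidean distance to every training case,
    # turned into a weight by the (assumed) Gaussian kernel.
    d2 = ((X_train - np.asarray(x, dtype=float)) ** 2).sum(axis=1)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    # Pattern/summation layer: one weighted vote total per category.
    classes = np.unique(y_train)
    votes = np.array([weights[y_train == c].sum() for c in classes])
    # Decision layer: the category with the largest vote wins.
    return classes[votes.argmax()], votes

# Hypothetical cases: one negative next to the new point, positives around it.
X_train = [(6, 5),
           (5, 5), (7, 5), (6, 6), (6, 4),
           (5, 6), (7, 6), (5, 4), (7, 4),
           (1, 1), (2, 1), (1, 2)]
y_train = ["negative"] + ["positive"] * 8 + ["negative"] * 3

wide_label, wide_votes = pnn_predict(X_train, y_train, (6, 5.1), sigma=1.0)
narrow_label, narrow_votes = pnn_predict(X_train, y_train, (6, 5.1), sigma=0.1)
```

A wide kernel lets the surrounding positive cases outvote the nearby negative one, whereas a very narrow kernel behaves much like 1-NN, so the choice of sigma controls how far each case's influence reaches.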
CONCLUSION
This study was undertaken to develop a PNN to classify the stage of brain tumor images and to detect the tumor using a clustering technique. Grey-level index values were assigned to the pixels of the indexed image and used as PNN inputs. There were 15 images for training and 8 images for testing. A Probabilistic Neural Network with image and data processing techniques was employed to implement automated brain tumor classification [7]. Decision making was performed in two stages: feature extraction using the GLCM and classification using the Probabilistic Neural Network (PNN). This paper presents a segmentation method, the K-means clustering algorithm, for segmenting magnetic resonance images to detect a brain tumor in its early stages. Although the study was limited by the available computational resources and training data, the results indicate the potential of ANNs for fast image recognition and classification. Fast image recognition and classification can be useful in the control of real-world, site-specific herbicide application.


The paper has been appreciated by all the users in the organization.

It is easy to use, since it uses the GUI provided in the user dialog.

User friendly screens are provided.

It also provides the user with variable options in customizing the packet capture.

It has been thoroughly tested and implemented.
The presented samples demonstrate that the initial aim of the library was achieved: it is flexible, reusable, and easy to use for different tasks. Although there is still much work to do, because of the great range of neural network architectures and their learning algorithms, the library can be used for many different problems and can be extended to solve even more [7]. I hope the library will become useful not only in my further research work, but that other researchers will also find it interesting and useful.

REFERENCES
[1]. N. Kwak and C. H. Choi, Input Feature Selection for Classification Problems, IEEE Transactions on Neural Networks, 13(1), 143-159, 2002.
[2]. E. D. Ubeyli and I. Guler, Feature Extraction from Doppler Ultrasound Signals for Automated Diagnostic Systems, Computers in Biology and Medicine, 35(9), 735-764, 2005.
[3]. D.F. Specht, Probabilistic Neural Networks for Classification, Mapping, or Associative Memory, Proceedings of IEEE International Conference on Neural Networks, Vol. 1, IEEE Press, New York, pp. 525-532, June 1988.
[4]. D.F. Specht, Probabilistic Neural Networks, Neural Networks, vol. 3, no. 1, pp. 109-118, 1990.
[5]. Georgiadis et al., Improving brain tumor characterization on MRI by probabilistic neural networks and nonlinear transformation of textural features, Computer Methods and Programs in Biomedicine, vol. 89, pp. 24-32, 2008.
[6]. Kaus M., Automated segmentation of MRI brain tumors, Journal of Radiology, vol. 218, pp. 585-591, 2001.
[7]. Kornel, P., Bela, M., Rainer, S., Zalan, D., Zsolt, T. and Janos, F., Application of neural network in medicine, Diag. Med. Tech., vol. 4, issue 3, pp. 538-54, 1998.
[8]. Messen W, Wehrens R, Buydens L, Supervised Kohonen networks for classification problems, Chemometrics and Intelligent Laboratory Systems, vol. 83, pp. 99-113, 2006.
[9]. Orr M.J.L., Hallam J., Murray A., and Leonard T., Assessing RBF networks using DELVE, International Journal of Neural Systems, vol. 10, issue 5, pp. 397-415, 2000.
[10]. Hartigan, J. A.; Wong, M. A. (1979). Algorithm AS 136: A K-Means Clustering Algorithm. Journal of the Royal Statistical Society, Series