Open Access
Total Downloads: 202
Authors: P. Maniganda, M. Phanikanth, P. Ganapathi
Paper ID: IJERTV2IS110981
Volume & Issue: Volume 02, Issue 11 (November 2013)
Published (First Online): 30-11-2013
ISSN (Online): 2278-0181
Publisher Name: IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License
Network on Chip Using GMM Based Image Classifier
1 P. Maniganda, M.E., (Ph.D), Asst. Prof.; 2 M. Phanikanth, M.Tech, Asst. Prof.; 3 P. Ganapathi, M.Tech, Asst. Prof.
1,2,3 C.V.S. College of Engineering, Tirupati, India
Abstract
The aim of this paper is to present the concept of network-on-chip (NoC) and a programming model for Gaussian mixture model (GMM)-based classifiers, which have attracted increasing attention in many pattern recognition applications, where improved performance has been demonstrated. In this paper, the performance of the GMM and its hardware complexity are first analyzed and compared with a number of benchmark algorithms. A serial-parallel vector-matrix multiplier combined with an efficient pipelining technique is used. A novel exponential calculation circuit based on a linear piecewise approximation is proposed to reduce hardware complexity. The precision requirements of the GMM parameters in our classifier are also studied for various classification problems. The proposed hardware implementation features programmability and flexibility, offering the NoC the possibility of using the proposed architecture for different applications with different topologies and precision requirements.
Index Terms: Network On Chip, GMM, pattern recognition, reconfigurable architecture.

Introduction
Network-on-Chip (NoC) is an approach to designing the communication subsystem between IP cores in a System on a Chip (SoC). NoCs can span synchronous and asynchronous clock domains or use unclocked asynchronous logic. NoC brings an effective improvement over conventional buses and crossbar switches: the power requirement of an SoC is high, whereas it can be reduced by a NoC architecture. NoC is an emerging topic in the field of VLSI; using a NoC architecture reduces the size of a design because of the reduced number of buses and transmission lines. The Gaussian mixture model (GMM) classifier has gained increasing attention in the pattern recognition community. GMM can be classified as a semi-parametric density estimation method, since it defines a very general class of functional forms for the density model. In this mixture model, a probability density function is expressed as a linear combination of basis functions. While the GMM provides very good performance and interesting properties as a classifier, it presents some problems that may limit its practical use in real-time applications. One problem is that the GMM can require large amounts of memory to store its coefficients, and it involves complex computations, mainly exponential calculations. The scheme can, however, be put to efficient practical use once network-on-chip implementation strategies are developed. In this paper, we propose a network-on-chip GMM-based classifier. First, analysis of the complexity of the GMM classifier showed that the vector-matrix multiplication and the exponential calculations are the most critical operations. A good tradeoff between real-time processing and hardware resource requirements is obtained using a serial-parallel architecture and an efficient pipelining strategy. Second, a linear piecewise function (LPF) is proposed to replace the exponential calculation. The effect of both limited precision and approximating the mixture models with the LPF on classification performance is investigated using seven different datasets. These datasets are also used to compare the performance of the GMM with other benchmark classifiers. The design was made flexible and programmable, making it a general-purpose processor that can be applied to different classification problems.
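As a rough software analogue of the GMM decision rule just described, the sketch below scores each class as a sum of M weighted Gaussian terms and selects the class with the highest p(x|Ck)P(Ck). The parameter layout (per-mixture centres mu_j, transform matrices G_j, and folded constants k_j) is our own assumption for illustration, not the paper's hardware interface.

```python
import numpy as np

def gmm_score(x, mus, gmats, ks):
    """p(x|Ck) up to a constant: sum_j k_j * exp(-Z_j), where
    Z_j = ||S_j G_j||^2 and S_j = x - mu_j, as in the datapath."""
    score = 0.0
    for mu, G, k in zip(mus, gmats, ks):
        s = x - mu                  # S_j = x - mu_j
        y = s @ G                   # vector-matrix product Y_j
        z = float(np.sum(y ** 2))   # sum of squared components -> Z_j
        score += k * np.exp(-z)
    return score

def classify(x, classes, priors):
    """classes: one (mus, gmats, ks) tuple per class; priors: P(Ck)."""
    scores = [gmm_score(x, *params) * p for params, p in zip(classes, priors)]
    return int(np.argmax(scores))   # winner-takes-all over class scores
```

A two-class toy model with identity G matrices then assigns each pattern to the class whose centre lies nearer to it.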

Hardware Proposal: NoC Components
Fig1: NoC representation.
Network Interface Controller (NIC). The NIC implements the interface between each IP node and the communication infrastructure. The architecture of the NIC can be divided into two modules: one focused on interaction with the computation or memory node bus, and the other focused on interaction with the rest of the NoC. This component goes by many names in the NoC literature: NI (Network Interface), NA (Network Adapter), or RNI (Resource Network Interface) are some examples.
Fig2: Generic router block diagram.
Router. Also called a switch. These components are in charge of forwarding data to the next tile. The routers implement the routing protocol, the buffering, and the switching method. In general, a router is composed of the following elements: an arbiter, whose main task is to grant channels (selecting an input port and an output port) and route packets; a crossbar of n input x n output ports that directs each input packet to the corresponding output port; and buffers or queues, as required by packet-switching protocols, which hold incoming and outgoing data in the router.
Fig3: Data flow diagram of GMM classifier
3.Design Modules:

Data Flow Blocks of a GMM Classifier:

Serial-parallel Vector-matrix Multiplier.

Linear Piecewise Function Unit.

WTA (Winner-takes-all) Circuit.

Functional Blocks (Algorithm) of the GMM Classifier System:
Fig4: The functional block diagram of the overall GMM classifier system.

RegGMM:
The architecture includes two main registers, RegX and RegGMM, used to store the input pattern x and the GMM parameters (μ, G, k), respectively. The main computational block of the system is the GMM processor, which calculates p(x|Ck)P(Ck) for each class in turn and hence makes the final class-assignment decision for a given input pattern.

Serial-parallel Vector-matrix Multiplier:
The GMM processor includes a serial-parallel vector-matrix multiplier, square and multiplier units, an LPF unit, a winner-takes-all (WTA) circuit, and two accumulators. A 10-bit data bus is used to load the GMM parameters and the test-vector data using the load signals. Once all the parameters are loaded, the GMM processor can operate in real time by processing data in a systolic fashion. Initially, the matrix Gj is multiplied by the vector Sj = x − μj. The resulting vector Yj is then fed to a square unit followed by an accumulator, which sums all the squared components of Yj, resulting in the value Zj.

LPF unit:
The result is fed to the LPF unit and multiplied by the constant Kj. The multiplication result represents a single term Kj·exp{−Zj}, which, when accumulated M times, leads to the value of p(x|Ck)P(Ck) as described by (15). An accumulator is therefore required at the output of the exponential block, and it is iterated M times.
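The accumulation over the M mixtures can be sketched as follows; the function names and the Python-level loop are illustrative only, standing in for the M iterations of the hardware accumulator, and `approx` stands in for whatever exponential block (exact or LPF-based) is configured.

```python
import math

def class_score(zs, ks, approx):
    """Accumulate k_j * approx(Z_j) over the M mixtures of one class.
    With approx(z) = exp(-z) this is the exact mixture sum; in hardware,
    approx would be the LPF approximation of the exponential."""
    acc = 0.0                        # accumulator at the LPF block output
    for k, z in zip(ks, zs):         # iterated M times
        acc += k * approx(z)
    return acc

# Exact-exponential version for comparison:
# class_score(zs, ks, lambda z: math.exp(-z))
```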
4.4. WTA Circuit:
The values p(x|Ck)P(Ck) are then compared one by one; the maximum value is selected and its corresponding class is assigned to the test pattern. In what follows, we describe the important building blocks of the GMM processor, including the serial-parallel vector-matrix multiplier, the LPF unit, and the WTA circuit.


Serial-parallel Vector-matrix Multiplier: vector-triangular-matrix multiplication for d = 3
Fig5: Vector-triangular-matrix multiplication for d = 3.
where y1 = s1g11 + s2g21 + s3g31, y2 = s2g22 + s3g32, y3 = s3g33.
The vector-matrix multiplier is used to calculate Yj = Sj^T Gj, where Sj = X − μj. The multiplier is optimized for our specific need of vector-triangular-matrix multiplication. Fig. 5 describes the vector-triangular-matrix multiplication for d = 3.
The multiplication can be simplified by decomposing the G matrix into row vectors, which are then multiplied by the corresponding components of S. Once this is done, the final row vector Y is obtained by summing up the intermediate results row-wise. Note that y3 requires just one multiplication, while y2 and y1 require two and three multiplications, respectively.
Owing to this property, we can first multiply s3 with the 3rd row of G to generate y3, while the partial results of y1 and y2 are temporarily accumulated. Next, s2 is multiplied with the 2nd row of G, and y2 is generated after accumulation. This multiplication is achieved using an efficient systolic architecture, illustrated in Fig. 6, which gives a good tradeoff between operation speed and chip area. The elements of vector S are fed into the multiplier serially, while the row components of G are provided in parallel.
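The row-wise decomposition described above can be sketched in a few lines of NumPy; feeding the rows from the last one upward mirrors the order in which the partial products become complete. This is an illustrative behavioural model, not the RTL.

```python
import numpy as np

def tri_vecmat(s, G):
    """Compute y = s^T G by decomposing G into rows: each element s[i]
    multiplies row G[i], and the partial products are accumulated
    row-wise. Rows are fed from the last row upward, so y[i] is final
    as soon as row i has been processed."""
    d = len(s)
    y = np.zeros(d)
    for i in range(d - 1, -1, -1):                  # last row first
        y += s[i] * np.asarray(G[i], dtype=float)   # accumulate partials
    return y
```

For the d = 3 triangular example in the text this reproduces y1 = s1g11 + s2g21 + s3g31, y2 = s2g22 + s3g32, and y3 = s3g33.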

Serial-parallel Vector-matrix Multiplier:
For simplicity, the figure only shows a maximum dimension of 5. Note that the structure can also be used for a lower dimension, since zero components can be inserted in the unutilized blocks. The vector-matrix multiplier calculates Yj = Sj^T Gj, where Sj = X − μj. The multiplication is simplified by decomposing the G matrix into row vectors, which are multiplied by the corresponding Sj components; the final row vector Y is then obtained by summing up the intermediate results row-wise.
Fig6: Serial-parallel vector-matrix multiplier.

Computation Sequence Diagram for the Serial-parallel Vector-matrix Multiplier:
*A serial-parallel vector-matrix multiplier combined with an efficient pipelining technique is used.
Fig7: Computation sequence over five clock cycles.
y1 = s1g15 + s2g24 + s3g33 + s4g42 + s5g51
y2 = s2g25 + s3g34 + s4g43 + s5g52
y3 = s3g35 + s4g44 + s5g53
y4 = s4g45 + s5g54
y5 = s5g55
At the first clock cycle (t1), the elements g5i are fed to the vector-matrix multiplier together with s5. During this first cycle, the output y5 = s5g55 is obtained. At the next clock cycle (t2), the elements g4i are fed together with s4, and y4 = s4g45 + s5g54 is obtained. The procedure continues until all the resulting vector components yi are obtained, in five clock cycles.
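The five-cycle sequence can be modelled with a short simulation. The skew assumed below (row r enters at cycle d − r + 1, with element g_{r,c} contributing to output y_{c−(d−r)}) reproduces the y1..y5 expressions above; it is an illustrative model of the schedule, not the actual RTL.

```python
def systolic_multiply(s, G):
    """Simulate the skewed systolic schedule: at clock cycle t the
    elements of row r = d - t + 1 enter in parallel together with the
    serial element s_r; the rows are skewed so that g_{r,c} is
    accumulated into y_{c - (d - r)}. Output y_r is final at the very
    cycle its row is fed, so the outputs complete in order y_d .. y_1."""
    d = len(s)
    y = [0.0] * d
    order = []
    for t in range(d):                     # clock cycles t1 .. td
        r = d - 1 - t                      # feed the last row first (0-based)
        shift = d - 1 - r                  # skew of this row
        for c in range(shift, d):
            y[c - shift] += s[r] * G[r][c]
        order.append(r + 1)                # y_{r+1} (1-based) completes now
    return y, order
```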


Linear piecewise function unit:
*A novel exponential calculation circuit based on a linear piecewise approximation is proposed to reduce hardware complexity.
*The proposed hardware implementation features programmability and flexibility, offering the possibility of using the proposed architecture for different applications with different topologies.
*The linear piecewise function unit can be configured in three different modes, realizing f1(z), f2(z), and f3(z). For example, f2(z) is given by:
f2(z) = 1, if z < a
f2(z) = (b − z), if a <= z < b
f2(z) = 0, if z >= b
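The three modes can be written out as plain functions. The argument names follow the register contents described in the mode discussions, and the unscaled ramps (b − z), 2^(n−m)(b′ − z), and (c − z) are taken as-is, so the outputs are not normalised probabilities; this is a behavioural sketch of the unit, not its fixed-point implementation.

```python
def f1(z, a):
    """Mode 1: hard threshold at a."""
    return 1.0 if z < a else 0.0

def f2(z, a, b):
    """Mode 2: 1 below a, linear ramp (b - z) between a and b, 0 above b."""
    if z < a:
        return 1.0
    return (b - z) if z < b else 0.0

def f3(z, a, b, b_prime, c, n, m):
    """Mode 3: two linear segments, the first scaled by 2**(n - m)
    (a left shift by n - m bits in hardware)."""
    if z < a:
        return 1.0
    if z < b:
        return 2 ** (n - m) * (b_prime - z)
    return (c - z) if z < c else 0.0
```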
6.1. The Architecture of the Linear Piecewise Function Unit:
Fig8: Architecture of the linear piecewise function unit.

The digital architecture of the LPF unit requires registers R1 to R6 to store the different parameters of the approximation.

R7 is the input register, loaded with the input data z (40 bits).

SR1 is a 40-bit shift register with set and reset options; it can shift the data by a number of bits set by the value stored in R6.

Three comparators, C1 to C3, compare the input data z with the values stored in registers R1 to R3.

The LPF unit can realize three linear piecewise functions, i.e., f1(z), f2(z), and f3(z).
MODE 1:
Fig9: The exponential approximation f1(z) on the Gaussian model.
The implementation of f1(z) is straightforward: a comparator compares z and a; if z < a the output is 1, otherwise the output is 0. For f2(z), two comparators are required: if z < a the output is 1, and if z >= b the output is 0.
In f1(z) mode: a is stored in both R1 and R2, and C1 compares the input z with a.
If z < a, the output of C1 is low and the output of SR1 is set to 1.
If z >= a, the output of C2 is high, and this signal is used to reset SR1.
MODE 2
Fig10: The exponential approximation f2(z) on the Gaussian model.
In f2(z) mode: a is stored in R1 and R3, while b is stored in R2 and R5.
C1 operates as described for mode 1.
If z >= a, C2 compares z and b; if z >= b, SR1 is reset.
Otherwise, z is loaded into register R9 and C3 compares z with the content of R3.
(b − z) is then calculated and loaded into SR1 (C3 high).
The output of SR1 is thus (b − z) if a <= z < b.
MODE 3:
f3(z) = 1, if z < a
f3(z) = 2^(n−m) (b′ − z), if a <= z < b
f3(z) = (c − z), if b <= z < c
f3(z) = 0, if z >= c
Fig11: The exponential approximation f3(z) on the Gaussian model.
In f3(z) mode: a, b, and b′ are stored in R1, R3, and R4, respectively, while c is stored in R2 and R5. R6 stores the value of (n − m).
If z < a, C1 sets SR1; SR1 is later reset by C2 if z >= c.
In the case b <= z < c, the operation is the same as in mode 2, and the output of SR1 is (c − z).
If a <= z < b, the output of C3 is low and the subtractor calculates (b′ − z).
The result is shifted left by (n − m) bits, so the output of SR1 is 2^(n−m) (b′ − z).

WTA Circuit
Fig12: Block diagram of the WTA circuit.
In the WTA circuit, the values p(x|Ck)P(Ck) need to be compared one by one before making the final class-assignment decision. The maximum value is selected and its corresponding class is assigned to the test pattern.
Initially, R10 and R11 are reset, and the initial control sequence stored in the shift register SR2 is 10000.
Each bit within the 5-bit control sequence stored in SR2 enables one D flip-flop; by shifting the control sequence in SR2, the output of the comparator is loaded into a new DFF while the next component is compared.
The result is stored in D1; similarly, D1 to D5 will all be zeroed except for the one flip-flop that corresponds to the winning (maximum) class.
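Behaviourally, the WTA circuit computes an argmax by sequential comparison. The sketch below models the register holding the running maximum and returns the index whose flip-flop would remain set; it is illustrative code, not the circuit.

```python
def wta(scores):
    """Compare class scores one by one, keeping the running maximum
    (the role of the R10/R11 register pair) and the index of the input
    currently holding it (the single flip-flop among D1..D5 left set)."""
    best_idx, best_val = 0, scores[0]
    for i in range(1, len(scores)):
        if scores[i] > best_val:        # comparator output high
            best_idx, best_val = i, scores[i]
    return best_idx, best_val
```

For example, `wta` applied to the per-class values p(x|Ck)P(Ck) returns the index of the class assigned to the test pattern.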

Result:

Output waveform for the GMM Classifier:
Fig13: Output waveform of the GMM classifier.

Input Image:
Fig13: Input image for the GMM classifier.
3.1. Output Images for the GMM Classifier:
Fig14: Output Image1 for GMM Classifier
Fig15: Output Image2 for GMM Classifier
Fig16: Output Image3 for GMM Classifier
Fig17: Output Image4 for GMM Classifier
Fig18: Output Image5 for GMM Classifier

Software Used:

Xilinx

MATLAB

3. Applications:

*GMM-based classifiers are used in pattern recognition applications.
*Classifying data (images).
*Medical applications.
4. Advantages:
*The classification rate is high.
*Compared with benchmark methods, the results are accurate.


CONCLUSION
Network-on-chip remains an attractive research field for academia, and its popularity rises year by year. In this paper, we presented a pattern recognition system based on a GMM classifier. Simulation results suggest that the GMM classifier offers the best classification performance with acceptable complexity when compared to KNN, MLP, RBF, and PPCA. The hardware-friendly architecture is based on the use of power-of-two coefficients as well as an LPF-based GMM. The proposed procedure consists of building an original GMM from the training dataset and then optimizing the parameters of the LPF-based GMM using the classification performance as the optimization criterion.
A prototype chip was designed using automatic placement and routing in a 0.25-μm technology, occupying an area of 1.69 mm². Successful operation of the architecture is demonstrated on a NoC through simulation results as well as experimental tests for a gas identification application requiring ten Gaussian models and five classes. A classification performance of 92% was achieved for 100 input patterns processed in less than 57 μs. To the best of our knowledge, this prototype represents the first reported hardware implementation of a GMM classifier.

REFERENCES

Hilton, C.; Nelson, B., "PNoC: A flexible circuit-switched NoC for FPGA-based systems," IEE Proceedings - Computers and Digital Techniques, vol. 153, no. 3, pp. 181-188, May 2006.

Pavlidis, V. F.; Friedman, E. G., "3-D topologies for networks-on-chip," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 15, no. 10, pp. 1081-1090, Oct. 2007.

S. L. Phung, A. Bouzerdoum, and D. Chai, "Skin segmentation using color pixel classification: Analysis and comparison," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 1, pp. 148-154, Jan. 2005.

S. Brahim-Belhouari, A. Bermak, M. Shi, and P. Chan, "Fast and robust gas identification system using an integrated gas sensor technology and Gaussian mixture models," IEEE Sensors J., vol. 5, pp. 1433-1444, Oct. 2005.

Y. Huang, K. B. Englehart, B. Hudgins, and A. D. C. Chan, "Optimized Gaussian mixture models for upper limb motion classification," in Proc. 26th Ann. Int. Conf. IEEE Eng. Med. Biol. Soc. (EMBC), 2004, pp. 72-75.

C. P. Lim, S. S. Quek, and K. K. Peh, "Application of the Gaussian mixture model to drug dissolution profiles prediction," Neural Comput. Appl., vol. 14, pp. 345-352, 2005.
ACKNOWLEDGEMENT
Pujali Maniganda, M.E., (Ph.D), Associate Professor at Chadalawada Venkata Subbaiah College of Engineering, Tirupathi; pursuing a Ph.D. at Veltech Dr. RR & Dr. SR Technological University, Chennai.
M. Phanikanth, M.Tech, Assistant Professor at Chadalawada Venkata Subbaiah College of Engineering, Tirupathi.
P. Ganapathi, M.Tech, Assistant Professor at Chadalawada Venkata Subbaiah College of Engineering, Tirupathi.