 Open Access
 Total Downloads : 191
 Authors : Ms. G. Vinothini, Ms. K. R. S. Deepa, Ms. V. Suganyaa, Prof. J. Shafiq Mansoor
 Paper ID : IJERTV3IS031473
 Volume & Issue : Volume 03, Issue 03 (March 2014)
 Published (First Online): 29032014
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Reconfigurable Finite Field Multiplier For Fir Filter Architecture
Ms. G. Vinothini1, Ms. K. R. S. Deepa2, Ms. V. Suganyaa3, Prof. J. Shafiq Mansoor4
1,2,3 Master of Engineering(VLSI DESIGN) , ECE,
4 Working as Assistant Professor, ECE, Karpagam University, Coimbatore, Tamil Nadu, India.
Abstract – At present scenario, many multipliers based on the KaratsubaOfman algorithm (KOA) polynomial based multiplier was proposed that sacrificed time efficiency for low space complexity. In this paper, a new multiplication formula which is a variant of KOA presented. We also provide a straightforward architecture of a nonpipelined bitparallel multiplier using the new formula. It is further shown that the systolic structure can be decomposed into two or more parallel systolic branches, where the pair of parallel systolic branches has the same input operand, and they can share the same input operand registers. From the applicationspecific integrated circuit and fieldprogrammable gate array synthesis results we find that the proposed design provides significantly less area delay and powerdelay complexities over the best of the existing designs. The proposed multiplier has lower space complexity than and comparable time complexity to previous Mastrovito multipliers for all irreducible trinomials and systolic array structure
Index Terms KaratsubaOfman algorithm (KOA), non pipelined bitparallel multiplier, finite impulse response (FIR) filters, Mastrovito multipliers

INTRODUCTION
FINITE impulse response (FIR) filters square measure of nice importance in digital signal process (DSP) systems since their characteristics in linearphase and feed forward implementations build them terribly helpful for building stable highperformance filters. The transposed form FIR filter implementations square measure illustrated in Fig. 1(a) though each architectures have similar quality in hardware, the converse kind is usually most popular
because of its higher performance and power potency.The number block of the digital FIR filter in its converse form [Fig. 1(a)], wherever the multiplication of filter coefficients with the filter input is accomplished, has vital impact on the quality and performance of the look as a result of a large number of constant multiplications square measure needed.
This is generally called the multiple constant multiplications (MCM) operation and is additionally a central operation and performance bottleneck in several different DSP systems like
x (n) MULTIPLIER BLOCK
hk1 hk2 hk3 h0 D D
Y(N)
Multiple constant multiplication (MCM) operation
(a)
X(n)
D D
Y(n)
(b)
Fig. 1. FIR filter implementations (a) converse kind with generic multipliers. (b) Converse kind with associate degree MCM block.
Fast Fourier transforms, discrete cosine transforms (DCTs), and errorcorrecting codes. Although area, delay, and powerefficient number architectures, such as Wallace and changed Booth multipliers, have been projected, the total flexibility of a number is not necessary for the constant multiplications, since filter coefficients square measure mounted and determined beforehand by the DSP algorithms [1]. Hence, the multiplication of filter coefficients with the input file is usually enforced underneath a shift adds architecture [2], wherever every constant multiplication is realized victimization addition/subtraction associate degreed
shift operations in an MCM operation [Fig. 1(b)].For the shiftadds implementation of constant multiplications , a straightforward methodology, typically called digit based recoding, at the start defines the constants in binary. Then, for every 1 within the binary illustration of the constant, according to its bit position, it shifts the variable and adds up the shifted variables to get the result.
However, the digitbased coding technique doesn't exploit the sharing of common partial merchandise, that permits great reductions within the variety of operations and, consequently in space and power dissipation of the MCM style at the gate level. Hence, the basic improvement downside, called the MCM downside, is outlined as finding the minimum number of addition and subtraction operations that implement the constant multiplications. Note that, in bitparallel style of constant multiplications, shifts are often completed victimization solely wires in hardware while not representing any space value. The algorithms designed for the MCM downside are often categorized in 2 classes: common sub expression elimination (CSE) algorithms [3]

and graphbased (GB) techniques [5][6]. The Communications Security Establishment algorithms at the start extract all potential sub expressions from the representations of the constants when they are outlined below binary, canonical signed digit (CSD), or borderline signed digit (MSD) [3]. Then, they find the best sub expression, usually the foremost common, to be shared among the constant multiplications. The GB ways are not restricted to any specific variety illustration and consider a bigger variety of different implementations of a constant, yielding higher solution than the Communications Security Establishment algorithms, as shown in [5] and [6].
However, of these algorithms assume that the input file x is processed in parallel. On the opposite hand, in digit serial arithmetic, the info words are divided into digit sets, consisting of d bit , that are processed one at a time. Since digit serial operators occupy less space and are freelance of the data word length, digitserial architectures provide various low complexity designs when put next to bitparallel architectures. However, the shifts need the employment of D flipflops, as opposed to the bitparallel MCM style wherever they're free in terms of hardware. Hence, the high level algorithms ought to take into consideration the sharing of shift operations furthermore because the sharing of addition/subtraction operations in digitserial MCM design. Moreover, finding the minimum variety of operations realizing Associate in Nursing MCM operation doesn't perpetually yield Associate in Nursing MCM style with best space at the gate level . Hence, the highlevel algorithms ought to think about the implementation cost of every digit serial operation at the gate level .In this paper, we tend to at
the start confirm the gatelevel implementation costs of digitserial addition, subtraction, and left shift operations utilized in the shiftadds style of digitserial MCM operations. Then, we tend to introduce the precise Communications Security Establishment formula that formalizes the gatelevel space improvement problem as a 01 whole number applied mathematics (ILP) downside when constants are outlined below a selected variety representation. We tend to additionally gift a brand new improvement model that reduces the 01 ILP downside size considerably and, consequently, the runtime of a generic 0 1 ILP thinker. Since there are still instances that the precise Communications Security Establishment formula cannot handle, we tend to describe the approximate GB formula [7] that iteratively finds the best partial product that ends up in the optimal space in digitserial MCM style at the gate level.
Experimental results on a comprehensive set of instances show that the solutions of algorithms introduced during this paper lead to vital enhancements in space of digitserial MCM designs compared to those obtained victimization the algorithms designed for the MCM downside. The digitserial FIR filter designs obtaine by SAFIR additionally indicate that the conclusion of the multiplier factor block of a digitserial FIR filter below the shift adds architecture considerably reduces the realm of digitserial FIR filters with relevance those designed victimization digitserial constant multipliers. To boot, it's ascertained that the optimal exchange between space and delay in digitserial FIR filter styles are often explored by dynamical the digit size d.


APPROXIMATE GB ALGORITHM
This algorithm may be a graphbased style methodology which allows common operations like convolution to be enforced victimization reduced numbers of arithmetic operations. This system tries to represent the outputs of the constant multipliers into the graph and whenever it's potential, the technique uses the common components. The primary half is rule and also the second part may be a heuristic methodology. Within the 1st half, if the set of coefficients is totally synthesized, then minimum adder price is gained. The second half uses a lookup table for every constant. The rule primarily consists of the subsequent steps:

Reduce all coefficients within the set to odd fundamentals.

Evaluate all singlecoefficient prices by victimization the price operation table.

Remove all cost0 fundamentals

Create the graph illustration of elect fundamentals.
Graphbased algorithms area unit bottomup strategies that iteratively construct the graph representing the multiplier factor block. The graph construction is target hunting by a heuristic that determines future graph vertex to feature to the graph. Graphbased algorithms supply additional degrees of freedom by not being restricted to a specific illustration of the coefficients, or a predefined graph topology (as in digitbased algorithms), and usually manufacture solutions with the bottom variety of operations. Samples of graphbased algorithms embrace, RAGn
This paper proposes a replacement graphbased rule. The improvement of gatelevel space downside in digitserial MCM style is associate degree NPcomplete downside thanks to the NPcompleteness of the MCM downside. Thus, naturally, there'll be perpetually 01 ILP issues generated by the precise international intelligence agency rule that current 01 ILP solvers notice tough to handle. Hence, the GB heuristic algorithms, that gets decent answer victimization less procedure resources, area unit indispensable. In our approximate rule known as MINAS DS, as exhausted algorithms designed for the MCM downside given in Definition one, we discover the fewest variety of intermediate constants specified all the target and intermediate constants area unit synthesized employing a single operation. However, whereas choosing associate degree intermediate constant for the implementation of the not yet synthesized target constants in every iteration, we tend to favor the one the potential intermediate constants which will be synthesized victimization the smallest amount hardware and can change United States to implement the notyet synthesized target constants in a smaller space with the out there constants.
After the set of target and intermediate constants
Polynomial basis multipliers area unit popularly used as a result of they're comparatively easy to style, and supply quantifiability for the fields of higher orders. Economical hardware style for polynomialbased multiplication is thus necessary for period applications [10], [11].
The pentanomialbased Evariste Galois field is wide used, since the National Institute of Standards and Technology (NIST) has counseled 3 pentanomials for ECC application [12]. There area unit some pulsation realizations of pentanomialsbased number for highthroughput implementation [13][14]. In a very recent paper [15], Meher has bestowed Associate in nursing economical pulsation style of number supported irreducible pentanomials. it's found that the pulsation structure for field multiplication in [15] incorporates a latency of m cycles.
During this paper, we've got extended the structure of additional to get a lower latency pulsation structure. Initial of all, Associate in nursing economical Montgomery algorithmic rule for pentanomials is planned where the multiplication is rotten into variety of freelance elements that might be processed in parallel. Moreover, we've got introduced a completely unique precomputed addition (PCA) technique such the latency of a number may be reduced additional. The planned structure achieves considerably less time quality than the corresponding existing structures. The rest of the paper is organized as follows: The planned algorithmic rule and therefore the PCAtechnique for finite field multiplication GF (2m over supported irreducible pentanomials area unit bestowed in Section II. In Section III, structure of the planned number is delineated. In Section IV, we've got mentioned the estimation of hardware and timecomplexities and compared those with those of the prevailing styles. Based on the planned algorithmic rule, we have a tendency to derive here the planned structure of the number.
Weve got taken f(x)=x13+x4+x3+x2+1 because the irreducible pentanomial for example the planned low latency systolic structure (forl=1 ). It will but simply be extended to different pentanomials..
From C+ of, similarly, we can get C as
C =(A(1)b + A(3)b + A(5)b ) third unit
that realizes the MCM operation is found, each constant is
1
(2)
2
(4)
0
(6)
synthesized exploitation Associate in Nursing Aoperation
+ A b5+ A
b3+ A
b1) fourth unit
that yields the minimum area among the digitserial MCM vogue. In MINASDS, the globe of the digitserial MCM operation is ready as a result of the entire gatelevel
Where c+ and c is
c =i=1 xi
=(x +x ++x )
implementation value of each digitserial addition,
+1
+i+1
pi+1
subtraction, and shift operation beneath the digit size parameter d.


FINITE GF (2M) FIELD SUPPORTED
PENTANOMIAL
Finite field multipliers over GF(2m) have wide applications in elliptic curve cryptography (ECC) and error management committal to writing systems[8],[9].
(xr+xi+r++xu)
++
(xi+x2i++xpi) 1
c+ = i=u Yi
=(Yu+Y1+u+Yu+Q1)
++ (Yu+t+Y1+u+t++Ym1)
++
(Yu+i1+Y2i+u1+.+Yu+qt1) 2
For simplicity of dialogue, we've denied the 2 parallel units. Because the third and fourth units, severally, those area unit totally different. Mistreatment the projected PCA technique, one will derive and from and, that involves a delay of solely. each the forward and inverse blocks accommodates 3 major steps, i.e., the proton magnetic resonance operation the bitmultiplication operation, and therefore the bit addition operation for systolic implementation of multiplication over, the operations of those steps area unit depicted by the SFG, wherever every section (within the dotted box, solely 2 units area unit presented) corresponds to at least one of the units. The first unit consists of three nodes , three bitmultiplication nodes and a pair of bit addition nodes AD, whereas the second unit has another node , another node and another node The projected systoliclike structure for field multiplication over supported pentanomial (forl=1). It consists of 4 heartbeat arrays, wherever every of the arrays correspond to at least one of the units. The primary heartbeat array consists of five PEs, whereas every of the opposite arrays (second, third, and fourth heartbeat arrays) consists of 4 PEs and a delay cell (the delay cell is needed to satisfy the information dependence requirement). Though one will use asystolic adder array consisting of 4 addition cells (ACs) for the ultimate addition of the four arrays, we have a tendency to use a pipelined addertree consisting of 3 ACs for a coffee latency implementation.
Fig 2 Output waveform for the pentonomial algorithm
From this output waveform the multiplier has less area and less delay because of the same width. So we can reduce the adders and blocks of multipliers. The application of multiplier is error detection and correction
Table1 gate level results on pentonomials algorithm
Algorithm
D
2
4
8
Graph based
Area(mm2)
86
93.4
109.2
Minasds
Delay(ns)
4.6
5.47
7.21
Power(nw)
194
231
312
Irreducible
Area(mm2)
72.4
81.7
98.3
pentanomials
Delay(ns)
3.7
4.5
6.1
Power(nw)
190
229
309

CONCLUSION
The projected style involves considerably less area delay and powerdelay complexities than the fresh according number for irreducible pentanomial, with nearly quartern of the latency of the opposite, for the federal agency suggested pentanomials. Since there are a unit still instances with that the precise CSE algorithmic program cannot cope, we have a tendency to conjointly project AN approximate GB algorithm that finds the simplest partial merchandise in every iteration which yield the optimum gatelevel space in digitserial MCM design. The experimental results indicate that the complexness of digit serial MCM styles is more reduced mistreatment the high level optimization algorithms projected during this paper. The application mistreatment of error correction and detection. It is observed that a designer will realize the circuit that matches best in an application by everchanging the digit size
REFERENCES

J. McClellan, T. Parks, and L. Rabiner, A computer program for designing optimum FIR linear phase digital filters, IEEE Trans. Audio Electroacoust., vol. 21, no. 6, pp. 506526, Dec. 1973.

H. Nguyen and A. Chatterjee, Numbersplitting with shiftandadd decomposition for power and hardware optimization in linear DSP synthesis, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 8, no. 4, pp. 419424, Aug. 2000.

I.C. Park and H.J. Kang, Digital filter synthesis based on minimal signed digit representation, in Proc. DAC, 2001, pp. 468473.

L. Aksoy, E. Costa, P. Flores, and J. Monteiro, Exact and approximate algorithms for the optimization of area and delay in multiple constant
multiplications, IEEE Trans. Comput.Aided Design Integr. Circuits Syst., vol. 27, no. 6, pp. 10131026, Jun. 2008.

Y. Voronenko and M. PÃ¼schel, Multiplierless multiple constant multiplication, ACM Trans. Algor., vol. 3, no. 2, pp. 139, May 2007.

L. Aksoy, E. Gunes, and P. Flores, Search algorithms for the multiple constant multiplications problem: Exact and approximate,
J. Microprocess.
Microsyst., vol. 34, no. 5, pp. 151162, Aug. 2010.

L. Aksoy, C. Lazzari, E. Costa, P. Flores, and J. Monteiro, Efficient shiftadds design of digitserial multiple constant multiplications, in Proc. Great Lakes Symp. VLSI, 2011, pp. 6166.

Elliptic Curves in Cryptography, ser. London Mathematical Society Lecture Note Series. Cambridge, U.K.: Cambridge Univ. Press, 1999
.

Cryptographic applications of brahmaquptabha skara equation,IEEE Circuits Syst. I, Reg. Papers, vol. 53, no. 7, pp. 15651571, Jul. 2006

H. Wu, Bitparallel polynomial basis multiplier for new classes of finite fields, IEEE Trans. Comput., vol. 57, no. 8, pp. 10231031, Aug. 2008.

P. K. Meher, On efficient implementation of accumulation in finite field over and its applications, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 4, pp. 541550, Apr. 2009.

Digital Signature Standard (DSS), FIPS 1862, National Institute of Standards and Technology, 2000.

T. Zhang and K. K. Parhi, Systematic design of original and modified mastrovito multipliers for general irreducible polynomials, IEEE Trans. Comput., vol. 50, no. 7, pp. 734749, Jul. 2001.

C.Y. Lee, J.S. Horng, I.C. Jou, and E.H. Lu, A digitserial multiplier for finite field , IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 4, pp. 476483, Apr. 2005.

P. K.Meher, Systolic and nonsystolic scalable modular designs of finite field multipliers for reedsolomon codec, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 6, pp. 747757, Jun. 2009.