 Open Access
 Authors : M. Sumesh , K. Manoharan , S. Suguna
 Paper ID : IJERTV8IS050227
 Volume & Issue : Volume 08, Issue 05 (May 2019)
Published (First Online): 14-05-2019
ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Ultra Low Power Finite Impulse Response Filters (FIR)
M. Sumesh1, K. Manoharan2, S. Suguna3
ME Scholar1, Assistant Professor2,3,
Department of ECE, SVS College of Engineering, Coimbatore, Tamil Nadu, India
Abstract: The present work addresses the minimization of power and area. Delay is one of the governing parameters in digital signal processing circuits, and Finite Impulse Response (FIR) filters are an important element of their architecture. An FIR filter basically contains three components, viz. multiplier, adder and delay unit. Multiplication is the main source of area, delay and power consumption in digital circuits, and in the present work the performance of the FIR filter depends on the multiplier. To improve the efficiency of the multiplier unit, parameters such as speed and area need to be optimized. Low-power full adders are designed and used in the proposed multiplier, and the results are analyzed and optimized for better performance. The FIR filters are designed in both the direct form and the transposed form. The low-power filters are simulated and implemented using the Tanner EDA tool.
Index Terms: Finite Impulse Response Filters, Full Adder, Half Adder, Carry save, Braun Multiplier, Vedic Multiplier, Wallace Tree Multiplier.
I INTRODUCTION
Finite Impulse Response (FIR) filters are widely used in Digital Signal Processing (DSP) applications due to their stability and linear-phase property. In today's scenario, low power consumption and small area are the most important parameters for the fabrication of DSP systems and high-performance systems. Many FIR filter designs aimed at low area, high speed or reduced power consumption have been developed. As area increases, the hardware cost of these FIR filters increases. This motivates the design of a low-area FIR filter with moderate speed performance.
Adders are very important in a variety of digital systems [1] [3] [4] [5]. Many fast adders exist, but adding fast with low delay and power is still challenging. In many computers and other kinds of processors, adders are used not only in the arithmetic logic units but also in other parts of the processor, where they calculate addresses, table indices and similar operations. Depending on requirements such as area, delay and power consumption, several adder architectures are available, such as the Ripple Carry Adder, the Carry Look-Ahead Adder and the Carry Select Adder. The Ripple Carry Adder (RCA) has a compact design but a longer computation time [6] [7]. Applications in which time is a critical factor use the Carry Look-Ahead Adder (CLA) to obtain fast results, at the cost of increased area. The Carry Select Adder provides a compromise between the small area but longer delay of the RCA and the large area but small delay of the CLA.
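The trade-off between the ripple carry and carry select adders can be illustrated with a small bit-level model. This is only a behavioural sketch in Python, not the transistor-level design used in this work; the function names and the choice of splitting a 4-bit adder into two 2-bit halves are assumptions made for illustration.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two LSB-first bit lists; the carry ripples through every stage in turn."""
    out, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out, carry

def carry_select_add(a_bits, b_bits, split):
    """Carry select: the upper half is computed for carry-in 0 and carry-in 1,
    and the carry out of the lower half selects which result is used."""
    lo, c_lo = ripple_carry_add(a_bits[:split], b_bits[:split], 0)
    hi0, c0 = ripple_carry_add(a_bits[split:], b_bits[split:], 0)
    hi1, c1 = ripple_carry_add(a_bits[split:], b_bits[split:], 1)
    hi, cout = (hi1, c1) if c_lo else (hi0, c0)
    return lo + hi, cout

def to_bits(x, n):
    """Integer to LSB-first list of n bits."""
    return [(x >> i) & 1 for i in range(n)]

def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(b << i for i, b in enumerate(bits))
```

The carry select version duplicates the upper adder (more area) so that the upper half does not wait for the lower carry (less delay), which is the compromise described above.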
The implementation of an FIR filter requires three basic building blocks: multiplication, addition and signal delay. Multipliers consume the largest share of the area in an FIR filter design. As the multiplier is the slowest element in the system, it affects the performance of the FIR filter [8] [9] [10]. To reduce the power consumption of the FIR filter, the work done by the adders is minimized by scaling and rounding off the filter coefficients and truncating unnecessary bits. Power, area and speed are evaluated for different types of adders and multipliers, and the FIR filter is designed with an optimized combination of adders and multipliers for low-power, high-speed applications.
Filters are a very important part of digital signal processing applications. Filters have two uses: signal separation and signal restoration. In general, filtering is described by a simple convolution operation [11] [12]:
$$y(n) = \sum_{k=0}^{L-1} h(k)\, x(n-k)$$
where L is the length of the FIR filter, h(n) is the filter impulse response coefficients, x(n) is the input sequence and y(n) is the output of the FIR filter. The above equation can also be expressed in the Z domain as Y(z) = X(z)H(z). We present two high-speed and low-power full-adder cells designed with an alternative internal logic structure and pass-transistor logic styles that lead to a reduced power-delay product (PDP).
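As a minimal sketch of the convolution operation described above, the following Python function evaluates the FIR output sample by sample. The function name and the 2-tap coefficients in the usage lines are illustrative assumptions, not values from this work; input samples before n = 0 are assumed to be zero.

```python
def fir_filter(x, h):
    """Evaluate y(n) = sum over k of h(k) * x(n - k) for each input sample."""
    L = len(h)                      # filter length
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(L):
            if n - k >= 0:          # samples before n = 0 are taken as zero
                acc += h[k] * x[n - k]
        y.append(acc)
    return y

# Hypothetical 2-tap example (coefficients chosen only for illustration)
h = [1, 2]
x = [1, 0, 3, 4]
y = fir_filter(x, h)
```

Each output sample needs L multiplications and L-1 additions, which is why the multiplier dominates the cost of the filter.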
II VARIOUS MULTIPLIERS

Braun Multiplier
The Braun multiplier is one of the parallel multipliers. It is also referred to as the Carry Save Multiplier. Its architecture consists of full adders and AND gates. The AND gates operate in parallel and produce the partial products. Each partial product is added to the sum of the previously produced partial products using adders. In general, an N × N Braun multiplier consists of N^2 AND gates and N(N-1) full adders. The drawback of this multiplier is that the number of components grows quadratically with the operand width, so it roughly quadruples each time the number of bits doubles.
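The gate counts quoted above can be checked with a bit-level behavioural model of the Braun array. The sketch below is an assumption-laden Python illustration (function names and the counting scheme are ours, and every adder cell is counted as a full adder even where a half adder would suffice); it is not the schematic simulated in this work.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))

def braun_multiply(a, b, n=4):
    """Multiply two unsigned n-bit integers with a Braun (carry-save) array.

    Returns (product, and_gates, full_adders) so the counts can be inspected.
    """
    abit = [(a >> j) & 1 for j in range(n)]
    bbit = [(b >> i) & 1 for i in range(n)]
    counts = {"and": 0, "fa": 0}

    def pp(i, j):
        # Partial product a_j AND b_i, with weight 2**(i + j)
        counts["and"] += 1
        return abit[j] & bbit[i]

    s = [pp(0, j) for j in range(n)]   # first row of partial products
    c = [0] * (n - 1)                  # no carries enter the first adder row
    product_bits = [s[0]]

    # (n-1) carry-save rows of (n-1) adders: carries go down to the next row
    for i in range(1, n):
        new_s, new_c = [], []
        for j in range(n - 1):
            sj, cj = full_adder(pp(i, j), s[j + 1], c[j])
            counts["fa"] += 1
            new_s.append(sj)
            new_c.append(cj)
        new_s.append(pp(i, n - 1))     # top bit of the row needs no adder
        s, c = new_s, new_c
        product_bits.append(s[0])

    # Final ripple row of (n-1) adders propagates the remaining carries
    carry = 0
    for j in range(n - 1):
        sj, carry = full_adder(s[j + 1], c[j], carry)
        counts["fa"] += 1
        product_bits.append(sj)
    product_bits.append(carry)

    product = sum(bit << k for k, bit in enumerate(product_bits))
    return product, counts["and"], counts["fa"]
```

For n = 4 the model uses 16 AND gates and 12 adder cells, matching N^2 and N(N-1).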

Array Multipliers
Array multipliers can be implemented by directly mapping manual multiplication into hardware. The partial products are accumulated by an array of adder circuits. An n × n array multiplier requires n(n-1) adders and n^2 AND gates. Array multipliers have a long critical path and are very slow. The main advantage of these multipliers is their regular structure, which leads to ease of layout and design.

Carry Save Array Multiplier
The carry-save array multiplier uses an array of carry-save adders to accumulate the partial products and a carry-propagate adder to generate the final product. This reduces the critical path delay of the multiplier, since the carry-save adders pass the carry to the next level of adders rather than to the adjacent ones.

Wallace Tree Multiplier



C. S. Wallace (1964) proposed a fast technique to perform multiplication. A Wallace tree multiplier offers faster performance for large operands. Unlike an array multiplier, the partial product matrix of a tree multiplier is rearranged in a tree-like format, reducing both the critical path and the number of adder cells needed. The Wallace tree consists of several levels of column compression structures, applied until only two full-width operands remain. These two operands are then added using a fast carry-propagate adder to obtain the product. Figure 7 shows a 4 × 4 Wallace tree multiplier that produces an 8-bit output. What differentiates the Wallace tree multiplier from other column compression multipliers is that in the Wallace tree every possible bit in every column is covered by a (3:2) counter (full adder) or a (2:2) counter (half adder). There are numerous ways to implement a Wallace tree multiplier.

Vedic Multiplier
Vedic Mathematics hails from the ancient Indian scriptures called the Vedas, or the source of knowledge. This system of computation covers all forms of mathematics, be it geometry, trigonometry or algebra. Vedic mathematics is part of the four Vedas (books of wisdom). It is part of the Sthapatya Veda (book on civil engineering and architecture), which is an Upaveda (supplement) of the Atharva Veda. It covers explanations of several modern mathematical terms, including arithmetic, geometry (plane, coordinate), trigonometry, quadratic equations, factorization and even calculus. In Vedic mathematics there are 16 sutras (formulae) and 16 Upa-sutras (sub-formulae). Among the sutras, three are used for multiplication. Urdhva Tiryakbhyam is a Sanskrit term which means vertically and crosswise in English. The method is a general multiplication formula applicable to all cases of multiplication, and it can be generalized to any N × N bit multiplication.
Vedic multiplier for 4×4 bits: divide the bits of each input equally into two parts. Consider a 4×4-bit multiplication with multiplicand A = A3 A2 A1 A0 and multiplier B = B3 B2 B1 B0. The output lines of the multiplication result are S7 S6 S5 S4 S3 S2 S1 S0. Divide A into A3 A2 and A1 A0, and B into B3 B2 and B1 B0. Using the fundamentals of Vedic multiplication, the 4×4 product is formed by taking two bits at a time and using 2-bit multiplier blocks.
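The decomposition of a 4×4 multiplication into 2-bit blocks can be sketched in Python as follows. The names vedic_2x2, vedic_4x4 and q0 to q3 are illustrative assumptions, and the final combination uses ordinary integer addition in place of the hardware adders, so this shows the arithmetic structure only.

```python
def vedic_2x2(a, b):
    """2x2 Urdhva Tiryakbhyam block: vertical and crosswise products of 2-bit inputs."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                        # vertical: LSB x LSB
    cross1, cross2 = a1 & b0, a0 & b1   # crosswise terms
    p1 = cross1 ^ cross2                # half adder sum
    c1 = cross1 & cross2                # half adder carry
    v = a1 & b1                         # vertical: MSB x MSB
    p2 = v ^ c1
    p3 = v & c1
    return p0 | (p1 << 1) | (p2 << 2) | (p3 << 3)

def vedic_4x4(a, b):
    """4x4 product built from four 2x2 blocks on the split operands."""
    a_lo, a_hi = a & 0b11, (a >> 2) & 0b11   # A1A0 and A3A2
    b_lo, b_hi = b & 0b11, (b >> 2) & 0b11   # B1B0 and B3B2
    q0 = vedic_2x2(a_lo, b_lo)               # weight 2**0
    q1 = vedic_2x2(a_hi, b_lo)               # weight 2**2
    q2 = vedic_2x2(a_lo, b_hi)               # weight 2**2
    q3 = vedic_2x2(a_hi, b_hi)               # weight 2**4
    # Ordinary addition stands in for the adder stage of the hardware
    return q0 + ((q1 + q2) << 2) + (q3 << 4)
```

Because the four 2×2 blocks are independent, they can all be evaluated in parallel before the adder stage.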
III PROPOSED SYSTEM
As the multiplier is the slowest element in the system, it affects the performance of the FIR filter. A modified Wallace Tree Multiplier is therefore suggested, since it reduces area and is faster than other conventional multipliers. The proposed low-area-cost FIR filter uses a modified Wallace Tree Multiplier. In a direct form filter, a new data sample and the corresponding filter coefficient can be applied to the multiplier's inputs at each clock cycle. The input signal is x[n], and D flip-flops (DFFs) are used as the delay elements. The modified Wallace Tree Multiplier block multiplies the input signal by the set of filter coefficients corresponding to the selected filter order and provides the output signal.
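Since the filters in this work are built in both direct and transposed form, a small behavioural sketch may help to show that the two structures compute the same output. The Python below is an illustration under assumptions: the function names are hypothetical, the `*` operator stands in for the hardware multiplier block, and Python lists stand in for the DFF registers.

```python
def fir_direct(x, h):
    """Direct form: a delay line holds past inputs; all taps are summed each cycle."""
    delay = [0] * len(h)
    y = []
    for sample in x:
        delay = [sample] + delay[:-1]             # shift the new sample into the delay line
        y.append(sum(hk * dk for hk, dk in zip(h, delay)))
    return y

def fir_transposed(x, h):
    """Transposed form: the input feeds every multiplier; registers hold partial sums."""
    L = len(h)
    state = [0] * (L - 1)                          # registers between the adders
    y = []
    for sample in x:
        products = [hk * sample for hk in h]       # all taps multiply the same sample
        y.append(products[0] + (state[0] if L > 1 else 0))
        new_state = []
        for i in range(L - 1):
            upstream = state[i + 1] if i + 1 < L - 1 else 0
            new_state.append(products[i + 1] + upstream)
        state = new_state
    return y
```

In the direct form the additions form one long chain after the multipliers, whereas in the transposed form each addition is separated by a register, which is the usual reason the two forms differ in delay.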
FIGURE.1. MODIFIED WALLACE TREE MULTIPLIER BASED FIR FILTER
Multipliers play an important role in today's digital signal processing and various other applications. The essential design targets for a multiplier are high speed, low power consumption and regularity of layout (and hence less area), or a combination of them in one multiplier, making it suitable for various VLSI implementations.
The straightforward way to implement multiplication is based on an iterative adder-accumulator for the generated partial products, as shown in Figure 2. Such a multiplier is called a serial multiplier.
FIGURE.2. ITERATIVE MULTIPLIER
However, this solution is quite slow, as the final result is only available after n clock cycles, where n is the size of the operands. Serial multipliers are used where area and power are of utmost importance and increased delay can be tolerated. A faster version of the iterative multiplier adds the partial products at once. This is achieved by unfolding the iterative multiplier, yielding a combinational circuit that consists of several partial product generators together with several adders that operate in parallel. This multiplier is called a parallel multiplier and is shown in Figure 3. The product is the result of multiplying the multiplicand by the multiplier. The multiplication operation is performed in two main steps. The first is partial product formation, which consists of ANDing each bit of the multiplier with the multiplicand. Each successive partial product is shifted one bit position to the left relative to the previous partial product.
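The serial (iterative) scheme and the partial product formation described above can be sketched as follows. This is a behavioural Python illustration with hypothetical function names; each loop iteration stands for one clock cycle of the adder-accumulator.

```python
def partial_products(multiplicand, multiplier, n=4):
    """AND each multiplier bit with the multiplicand; row i is shifted left by i bits."""
    rows = []
    for i in range(n):
        bit = (multiplier >> i) & 1
        rows.append((multiplicand if bit else 0) << i)
    return rows

def serial_multiply(multiplicand, multiplier, n=4):
    """Iterative shift-and-add: one partial product is accumulated per clock cycle."""
    acc = 0
    cycles = 0
    for row in partial_products(multiplicand, multiplier, n):
        acc += row          # the adder-accumulator step
        cycles += 1
    return acc, cycles

product, cycles = serial_multiply(13, 11)
```

A parallel multiplier forms the same rows but adds them in one combinational pass instead of n cycles.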
FIGURE. 3. PARALLEL MULTIPLIER
The second step is partial product accumulation, where the partial products are combined to find the result. Most multiplication techniques can be classified as array multipliers or tree multipliers. The different types of multipliers are discussed in detail in the following sections.
IV RESULT AND DISCUSSION
The designs are drawn in Tanner S-Edit 13 and simulated using Tanner T-SPICE 13.0. The FIR filters are designed with multipliers, and the multipliers are internally built from different types of adder units. The results of FIR filters using the modified Wallace Tree multiplier, the Braun multiplier and the Vedic multiplier are discussed. Each multiplier is advantageous in a specific field of application. The Vedic multiplier is suitable for all kinds of multiplication; the direct form has a 48% and the transposed form a 45% reduction in count. Hence it can be concluded that the proposed 4×4-bit Vedic multiplier appears to be highly efficient in terms of speed when compared to conventional multipliers. Reducing the time delay is an essential requirement for many applications, and the Vedic multiplication technique is well suited to this purpose. The idea proposed here may set a path for future research in this direction.
FIGURE.3. SCHEMATIC OF 4X4 BIT BRAUN MULTIPLIER
The structure consists of an array of AND gates and adders arranged in an iterative way. Such multipliers can be called non-additive multipliers.
FIGURE.4. SIMULATION RESULT OF 4X4 BIT BRAUN MULTIPLIER
Internally, two things stand out: the partial products are generated in parallel by the AND gates, and each partial product is added to the sum of the previously produced partial products by a row of adders.
FIGURE.5. SCHEMATIC OF 4X4 BIT VEDIC MULTIPLIER
FIGURE.6. SIMULATION RESULT OF 4X4 BIT VEDIC MULTIPLIER
We have used 4-input adders, each of which adds four bits at a time and produces a two-bit carry and a one-bit sum. First, the partial products are obtained using 2×2 Vedic multipliers; the partial product from the LSB 2×2 multiplier is Q0(3:0).
FIGURE.7.SCHEMATIC OF 4X4 BIT WALLACE TREE MULTIPLIER
Its two LSBs, Q0(1:0), directly form the two LSBs of the output, p(1:0). The remaining bits Q0(3:2) are added to Q1(1:0), Q2(1:0) and the carry; then Q3(1:0) is added to Q1(3:2), Q2(3:2) and the carry; and finally Q3(3:2) is added to the carry. The additions are performed using two different adders, a carry-save adder and a full adder, and the area and delay are obtained.
FIGURE.8. SIMULATION RESULT OF 4X4 BIT WALLACE TREE MULTIPLIER
The generalized comparison of various multipliers is shown in Table No 21. The lowest delay is that of the Wallace multiplier. The least areas are those of the Modified Booth and Vedic multipliers, and the lowest power is that of the Vedic multiplier. However, the modified Braun multiplier is also a power-efficient multiplier.
Braun multiplier: consider the multiplication of two unsigned 4-bit numbers. The multiplicand is A = a3 a2 a1 a0, the multiplier is B = b3 b2 b1 b0, and the product is P = P7 P6 P5 P4 P3 P2 P1 P0. A Braun multiplier is an m × n parallel multiplier. It is also known as a carry-save multiplier.
FIGURE.9.SCHEMATIC OF DIRECT FORM FIR FILTER WITH BRAUN MULTIPLIER
FIGURE.10. SIMULATION RESULT OF DIRECT FORM FIR FILTER WITH BRAUN MULTIPLIER
The architecture of a 4 × 4 Braun multiplier array consists of (n-1) rows of carry-save adders, in which each row has (n-1) full adders; the last row contains a ripple adder for the propagation of the carry.
FIGURE.11. SCHEMATIC OF TRANSPOSED FORM FIR FILTER WITH BRAUN MULTIPLIER
FIGURE.12.SIMULATION RESULT OF TRANSPOSED FORM FIR FILTER WITH BRAUN MULTIPLIER
First, multiply (that is, AND) each bit of one of the arguments by each bit of the other, yielding n^2 results. Depending on the position of the multiplied bits, the wires carry different weights; for example, the wire carrying the result of a4·b3 has weight 128, since 2^4 × 2^3 = 2^7 = 128.
FIGURE.13. SCHEMATIC OF DIRECT FORM FIR FILTER WITH WALLACE TREE MULTIPLIER
Reduce the number of partial products to two by layers of full and half adders. Then group the wires into two numbers and add them with a conventional adder.
FIGURE.14 SIMULATION RESULT OF DIRECT FORM FIR FILTER WITH WALLACE TREE MULTIPLIER
FIGURE.15. SCHEMATIC OF TRANSPOSED FORM FIR FILTER WITH WALLACE TREE MULTIPLIER
FIGURE.16 SIMULATION RESULT OF TRANSPOSED FORM FIR FILTER WITH WALLACE TREE MULTIPLIER
The second step works as follows. As long as there are three or more wires with the same weight, add a further layer. Take any three wires with the same weight and input them into a full adder; the result is an output wire of the same weight and an output wire of the next higher weight for each three input wires. If there are two wires of the same weight left, input them into a half adder. If there is just one wire left, connect it to the next layer.
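The layer-by-layer reduction described above can be modelled in Python by grouping wires into columns of equal weight. This is a minimal behavioural sketch with hypothetical names, not the modified Wallace tree circuit of this work; the final carry-propagate addition is done with ordinary integer addition.

```python
from collections import defaultdict

def full_adder(a, b, c):
    """(3:2) counter: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))

def half_adder(a, b):
    """(2:2) counter: returns (sum, carry)."""
    return a ^ b, a & b

def wallace_multiply(a, b, n=4):
    """Multiply two unsigned n-bit integers; returns (product, reduction_layers)."""
    # Partial products: the wire a_i AND b_j has weight 2**(i + j), stored in column i + j
    cols = defaultdict(list)
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))

    # Reduction: add layers while any column still holds three or more wires
    layers = 0
    while any(len(w) >= 3 for w in cols.values()):
        nxt = defaultdict(list)
        for col, wires in sorted(cols.items()):
            k = 0
            while len(wires) - k >= 3:             # three wires go into a full adder
                s, c = full_adder(wires[k], wires[k + 1], wires[k + 2])
                nxt[col].append(s)
                nxt[col + 1].append(c)
                k += 3
            if len(wires) - k == 2:                # two left over go into a half adder
                s, c = half_adder(wires[k], wires[k + 1])
                nxt[col].append(s)
                nxt[col + 1].append(c)
            elif len(wires) - k == 1:              # a single wire passes to the next layer
                nxt[col].append(wires[k])
        cols = nxt
        layers += 1

    # Final addition: at most two wires per column remain, read as two numbers
    x = sum(w[0] << col for col, w in cols.items() if len(w) >= 1)
    y = sum(w[1] << col for col, w in cols.items() if len(w) >= 2)
    return x + y, layers
```

The number of layers depends only on the operand width, not on the operand values, which is what gives the tree its short critical path.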
FIGURE.17. SCHEMATIC OF DIRECT FORM FIR FILTER WITH VEDIC MULTIPLIER
FIGURE.18. SIMULATION RESULT OF DIRECT FORM FIR FILTER WITH VEDIC MULTIPLIER
For the 4×4-bit multiplication with multiplicand a = a3 a2 a1 a0 and multiplier b = b3 b2 b1 b0, the output lines of the result are s7 s6 s5 s4 s3 s2 s1 s0. Dividing a into a3 a2 and a1 a0, and b into b3 b2 and b1 b0, and applying the fundamentals of Vedic multiplication two bits at a time with 2-bit multiplier blocks gives the structure for 4×4-bit multiplication used in the direct form FIR filter with the Vedic multiplier (Figure 17).
FIGURE.19. SCHEMATIC OF TRANSPOSED FORM FIR FILTER WITH VEDIC MULTIPLIER
FIGURE.20. SIMULATION RESULT OF TRANSPOSED FORM FIR FILTER WITH VEDIC MULTIPLIER
Here we compare the Braun, Vedic and Wallace multipliers on three parameters: delay, power and area. Table 1 shows the comparison of power (in nW) and delay (in ns).
TABLE I. POWER COMPARISON
Type of Multiplier | Power (nW) | Delay (ns)
Braun Multiplier | 1.458 | 7.168
Wallace Multiplier | 1.491 | 13.452
Vedic Multiplier | 1.562 | 12.081
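Since the power-delay product (PDP) is the figure of merit mentioned in the introduction, it can be derived from the values of Table I with a few lines of Python. This is only an arithmetic sketch on the tabulated numbers; the PDP values are not reported in this work, and the dictionary and function names are illustrative assumptions.

```python
# Power (nW) and delay (ns) copied from Table I
table_1 = {
    "Braun":   (1.458, 7.168),
    "Wallace": (1.491, 13.452),
    "Vedic":   (1.562, 12.081),
}

def power_delay_product(power_nw, delay_ns):
    """PDP in attojoules, since 1 nW x 1 ns = 1e-18 J."""
    return power_nw * delay_ns

pdp = {name: power_delay_product(p, d) for name, (p, d) in table_1.items()}
best = min(pdp, key=pdp.get)   # multiplier with the smallest PDP

for name, value in sorted(pdp.items(), key=lambda kv: kv[1]):
    print(f"{name:8s} PDP = {value:.3f} aJ")
```

With these table values the Braun multiplier has the smallest product, because its delay is roughly half that of the other two while the power figures are close.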

AREA ANALYSIS
From the results obtained for the various multipliers, it is observed that for a 4×4 multiplier the Braun multiplier has the smallest area, but as the multiplier and multiplicand move to higher bit widths the 4×4 multiplier provides very little area. The number of bits of the multiplier therefore decides the multiplier selection based on area.
FIGURE.21. POWER COMPARISON CHART

DELAY ANALYSIS

From the results, it is observed that the Wallace multiplier has the highest delay despite having moderate area, while the Braun multiplier has the lowest delay, so the 4×4 Braun multiplier is used in applications where high speed is required. Table 2 lists the power and transistor count.
FIGURE.22. DELAY COMPARISON CHART
TABLE.2 POWER AND TRANSISTOR COUNT COMPARISON
Filter Form | Multiplier Type | Power (nW) | Transistor count
Direct Form | Braun | 1.458 | 1572
Direct Form | Wallace | 1.491 | 1624
Direct Form | Vedic | 1.562 | 2324
Transposed Form | Braun | 2.08 | 1688
Transposed Form | Wallace | 2.93 | 1740
Transposed Form | Vedic | 2.11 | 2447
FIGURE.23. POWER AND TRANSISTOR COUNT COMPARISON CHART
The above graphical representation compares the proposed 4-bit Braun multiplier with the conventional 4-bit multiplier in terms of power dissipation and delay. It can be observed from the above graphs, from Fig. to Table 4, that for the proposed multiplier the power reduction is 41% and the delay reduction is 26% when compared with the conventional multiplier. These comparisons show that low power and high speed are achieved for the 4-bit multiplier using the Vedic and Braun multipliers.
TABLE 3. MULTIPLIERS: DELAY, AREA AND POWER COMPARISONS OF DIRECT FORM & TRANSPOSED FORM OF FIR FILTER
Filter Type | Multiplier Type | Power (nW) | Width (bit) | Delay (ns) | Transistor count
2-Tap Direct Form | Braun | 1.458 | 4 | 7.168 | 1572
2-Tap Direct Form | Wallace | 1.491 | 4 | 13.452 | 1624
2-Tap Direct Form | Vedic | 1.562 | 4 | 12.081 | 2324
2-Tap Transposed Form | Braun | 2.08 | 4 | 6.89 | 1688
2-Tap Transposed Form | Wallace | 2.09 | 4 | 12.8 | 1740
2-Tap Transposed Form | Vedic | 2.11 | 4 | 11.6 | 2447
FIGURE.24. MULTIPLIERS: DELAY, AREA AND POWER COMPARISONS OF DIRECT FORM & TRANSPOSED FORM OF FIR FILTER

CONCLUSION
This paper presents different multiplier algorithms. Design improvements are made every day to existing devices for the best performance and efficiency. FIR filters are designed using Vedic, Wallace and carry array multipliers. The results show that the proposed architecture is more efficient than the existing design. The reduction in count is about 42% in the direct form, whereas in the transposed form the reduction is 30%. The next comparison was made between the non-folded filter and the folded filter. The FIR filter using Vedic multipliers is efficient and can be used in DSP applications and generic processors for faster computations.

REFERENCES

[1] Jin-Fa Lin, Yin-Tsung Hwang, A Novel High-Speed and Energy Efficient 10-Transistor Full Adder Design, Vol. 54, No. 5, May 2007.

[2] Manoharan, K. and Punitha, S., Low Power Delay Product 10T Adder Circuit, 2015.

[3] Punitha, S. and Manoharan, K., Ultra Low PDP Modified 10T Adder, Imperial Journal of Interdisciplinary Research (IJIR), Volume 2, Issue 2, pp. 1504-1508, 2016.

[4] Punitha, S. and Manoharan, K., A Literature Survey on Low PDP Adder Circuits, International Journal of Computer Science and Mobile Computing (IJCSMC), 4(12), pp. 289-298, 2015.

[5] E. Abu-Shama and M. Bayoumi, A new cell for low power adders, in Proc. Int. Midwest Symp. Circuits Syst., 1995, pp. 1014-1017. DSP Journal, Volume 9, Issue 1, June 2009.

[6] T. Kowsalya, Tree Structured Arithmetic circuit by using different CMOS logic styles, ICGST-PDCS, Volume 8, Issue 1, December 2008.

[7] Deepak, Gamier, P.K.Sluzek, Performance Characteristics of Parallel and Pipelined Implementation of FIR Filters in FPGA Platform, in Signals, Circuits and Systems 2007 (ISSCS 2007), International Symposium on, 13-14 July 2007.

[8] N. Zhuang and H. Wu, A new design of the CMOS full adder, IEEE J. Solid-State Circuits, vol. 27, no. 5, pp. 840-844, May 1992.

[9] J. Wang, S. Fang, and W. Feng, New efficient designs for XOR and XNOR functions on the transistor level, IEEE J. Solid-State Circuits, vol. 29, no. 7, pp. 780-786, Jul. 1994.

[10] Reto Zimmermann and Wolfgang Fichtner, Low-Power Logic Styles: CMOS versus Pass-Transistor Logic, IEEE Journal of Solid-State Circuits, Vol. 32, No. 7, April 1997, pp. 1079-1090.

[11] Zhijun Huang, High level optimization techniques for low power Multiplier design, 2003.

[12] M. Kathirvelu, T. Mani gandam, Design of low power, High speed FIR Filter with optimized PDP Adders and Flip-Flops for DSP Applications, European Journal of Scientific Research, ISSN 1450-216X, Vol. 76, No. 2 (2012), pp. 214-225.