Ultra Low Power Finite Impulse Response Filters (FIR)


M. Sumesp, K. Manoharan2, S. Suguna3

ME Scholar1, Assistant Professor2,3,

Department of ECE, SVS College of Engineering, Coimbatore, Tamil Nadu, India

Abstract:- The present work addresses the minimization of power and area. Delay is one of the governing parameters in digital signal processing circuits, and Finite Impulse Response (FIR) filters are an important element of their architecture. An FIR filter basically contains three components, viz. multiplier, adder and delay unit. Multiplication is the main source of area, delay and power consumption in digital circuits, so in the present work the performance of the FIR filter depends on the multiplier. To improve the efficiency of the multiplier unit, parameters such as speed and area need to be optimized. Full adders are designed for low power, these low-power units are used in the proposed multiplier, and the results are analyzed and optimized for better performance. The FIR filters are designed in both the direct form and the transposed form. The low-power filters are simulated and implemented using the Tanner EDA tool.

Index Terms: Finite Impulse Response Filters, Full Adder, Half Adder, Carry save, Braun Multiplier, Vedic Multiplier, Wallace Tree Multiplier.

I INTRODUCTION

Finite Impulse Response (FIR) filters are widely used in Digital Signal Processing (DSP) applications due to their stability and linear-phase property. In today's scenario, low power consumption and small area are the most important parameters in the fabrication of DSP systems and high-performance systems. Many FIR filter designs aimed at low area, high speed or reduced power consumption have been developed. As area increases, the hardware cost of these FIR filters increases. This motivates the design of a low-area FIR filter with moderate speed performance.

Adders are very important in a variety of digital systems [1] [3] [4] [5]. Many fast adders exist, but adding fast with low delay and power is still challenging. In many computers and other kinds of processors, adders are used not only in the arithmetic logic units but also in other parts of the processor, where they calculate addresses, table indices, and similar quantities. Depending on the requirements for area, delay and power consumption, adders such as the Ripple Carry Adder, Carry Look-Ahead Adder and Carry Select Adder are used. The Ripple Carry Adder (RCA) has a compact design, but its computation time is longer [6] [7]. Time-critical applications use the Carry Look-Ahead Adder (CLA) to obtain fast results, at the cost of increased area. The Carry Select Adder provides a compromise between the small area but longer delay of the RCA and the large area but small delay of the CLA.
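The trade-off between the ripple carry and carry select adders can be illustrated with a small bit-level model. This is only a behavioural sketch in Python, not the transistor-level design used in this work; the function names and the block size of 2 are assumptions made for illustration.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first); the carry ripples through every stage."""
    total, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        total.append(s)
    return total, carry

def carry_select_add(a_bits, b_bits, block=2):
    """Each block is added twice (carry-in 0 and 1); the incoming carry selects one result."""
    total, carry = [], 0
    for start in range(0, len(a_bits), block):
        a_blk = a_bits[start:start + block]
        b_blk = b_bits[start:start + block]
        sum0, c0 = ripple_carry_add(a_blk, b_blk, 0)
        sum1, c1 = ripple_carry_add(a_blk, b_blk, 1)
        total += sum1 if carry else sum0
        carry = c1 if carry else c0
    return total, carry

def to_bits(value, width):
    """Integer to bit list, LSB first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Bit list (LSB first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

bits, carry = carry_select_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(bits), carry)  # 11 + 6 = 17: low four bits give 1, carry-out is 1
```

The carry select version duplicates the adder hardware in every block, which is where its extra area comes from, while the selection step shortens the path the carry has to travel.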

    The implementation of an FIR filter requires three basic building blocks: multiplication, addition and signal delay. Multipliers consume the most area in an FIR filter design. As the multiplier is the slowest element in the system, it affects the performance of the FIR filter [8] [9] [10]. To reduce the power consumption of the FIR filter, the work of the adder is minimized by scaling and rounding off the filter coefficients and truncating unnecessary bits. Power, area and speed are evaluated for different types of adders and multipliers, and the FIR filter is designed with an optimized combination of adders and multipliers for low-power, high-speed applications.

    Filters are a very important part of digital signal processing applications. Filters have two uses: signal separation and signal restoration. In general, filtering is described by a simple convolution operation [11] [12]:

    $$y(n) = \sum_{k=0}^{L-1} h(k)\, x(n-k)$$

    where L is the length of the FIR filter, h(n) are the filter impulse response coefficients, x(n) is the input sequence and y(n) is the output of the FIR filter. The above equation can also be expressed in the Z domain as Y(z) = X(z)H(z). We present two high-speed and low-power full-adder cells designed with an alternative internal logic structure and pass-transistor logic styles that lead to a reduced power-delay product (PDP).

    II VARIOUS MULTIPLIERS

    • Braun Multiplier

      The Braun multiplier is one of the parallel multipliers. It is also referred to as the Carry Save Multiplier. Its architecture consists of full adders and AND gates. The AND gates operate in parallel and produce the partial products. Each partial product is added to the sum of the previously produced partial products using adders. In general, an N x N Braun multiplier consists of N^2 AND gates and N(N-1) full adders. The drawback of this multiplier is that the number of components grows roughly fourfold each time the number of bits doubles.
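The growth in component count can be checked directly from the two formulas quoted above. The helper below is a hypothetical illustration (its name is not from the paper); it simply evaluates the AND-gate and full-adder counts for a few operand widths.

```python
def braun_cell_count(n):
    """AND-gate and full-adder counts for an n x n Braun multiplier (n^2 and n(n-1))."""
    return n * n, n * (n - 1)

for n in (4, 8, 16):
    and_gates, full_adders = braun_cell_count(n)
    print(f"{n}x{n}: {and_gates} AND gates, {full_adders} full adders")
```

Doubling the width from 4 to 8 bits takes the AND-gate count from 16 to 64 and the full-adder count from 12 to 56, which is the roughly fourfold growth described above.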

    • Array Multipliers

      Array multipliers can be implemented by directly mapping manual multiplication into hardware. The partial products are accumulated by an array of adder circuits. An n x n array multiplier requires n(n-1) adders and n^2 AND gates. Array multipliers have a long critical path and are very slow. The main advantage of these multipliers is their regular structure, which leads to ease of layout and design.

    • Carry-Save Array Multiplier

      The carry-save array multiplier uses an array of carry-save adders for the accumulation of the partial products. It uses a carry-propagate adder for the generation of the final product. This reduces the critical path delay of the multiplier, since the carry-save adders pass the carry to the next level of adders rather than to the adjacent ones.
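The carry-save idea can be sketched at the word level: three operands are compressed into a sum word and a carry word without propagating any carry, and only the last step uses a carry-propagate addition. The following Python model is an illustrative sketch under that assumption; the function names are hypothetical and the final `+` stands in for the carry-propagate adder.

```python
def carry_save_add(x, y, z):
    """3:2 compression on whole words: x + y + z == sum_word + carry_word."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1  # carries move one position up
    return sum_word, carry_word

def csa_multiply(a, b, n=4):
    """Accumulate the n partial products with carry-save adders, then add once."""
    sum_word, carry_word = 0, 0
    for i in range(n):
        partial = (a << i) if (b >> i) & 1 else 0
        sum_word, carry_word = carry_save_add(sum_word, carry_word, partial)
    return sum_word + carry_word  # single carry-propagate addition at the end

print(csa_multiply(13, 11))  # 13 * 11 = 143
```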

    • Wallace Tree Multiplier

      C. S. Wallace (1964) proposed a fast technique to perform multiplication. A Wallace tree multiplier offers faster performance for large operands. Unlike an array multiplier, the partial product matrix of a tree multiplier is rearranged in a tree-like format, reducing both the critical path and the number of adder cells needed. The Wallace tree consists of numerous levels of such column compression structures until, finally, only two full-width operands remain. These two operands can then be added using a fast carry-propagate adder to obtain the product. Figure 3.8 shows a 4 x 4 Wallace tree multiplier that produces an 8-bit output. What differentiates the Wallace tree multiplier from other column compression multipliers is that in the Wallace tree every possible bit in every column is covered by a (3:2) or (2:2) counter. There are numerous ways to implement a Wallace tree multiplier.

    • Vedic Multiplier

      Vedic mathematics hails from the ancient Indian scriptures called the Vedas, or the source of knowledge. This system of computation covers all forms of mathematics, be it geometry, trigonometry or algebra. Vedic mathematics is part of the four Vedas (books of wisdom). It is part of the Sthapatya-Veda (book on civil engineering and architecture), which is an upaveda (supplement) of the Atharva Veda. It covers several modern mathematical topics including arithmetic, geometry (plane, co-ordinate), trigonometry, quadratic equations, factorization and even calculus. In Vedic mathematics there are 16 sutras (formulae) and 16 upa-sutras (sub-formulae). Among the sutras, three are used for multiplication. Urdhva Tiryakbhyam is a Sanskrit term meaning "vertically and crosswise". The method is a general multiplication formula applicable to all cases of multiplication and can be generalized to any N x N bit multiplication.

      Vedic multiplier for 4×4 bits: divide the bits of each input equally into two parts. Consider a 4×4 bit multiplication with multiplicand A = A3A2A1A0 and multiplier B = B3B2B1B0. The output lines for the multiplication result are S7S6S5S4S3S2S1S0. Divide A and B into two parts, A3A2 and A1A0 for A, and B3B2 and B1B0 for B. Following the fundamentals of Vedic multiplication, two bits are taken at a time and multiplied using a 2-bit multiplier block.
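The 2-bit multiplier block mentioned above can be written out gate by gate using the vertical-and-crosswise pattern. The sketch below is one common way to realise a 2×2 Urdhva Tiryakbhyam block with four AND gates and two half adders; it is given as an assumption for illustration, since the exact gate-level block used in this work is not listed in the text.

```python
def vedic_2x2(a, b):
    """2x2 vertically-and-crosswise multiplier; a and b are 2-bit integers."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1

    s0 = a0 & b0                    # vertical: LSB x LSB
    cross1, cross2 = a1 & b0, a0 & b1
    s1 = cross1 ^ cross2            # crosswise terms added by a half adder
    c1 = cross1 & cross2
    vertical = a1 & b1              # vertical: MSB x MSB
    s2 = vertical ^ c1              # second half adder absorbs the carry
    s3 = vertical & c1

    return s0 | (s1 << 1) | (s2 << 2) | (s3 << 3)

print(vedic_2x2(3, 3))  # 3 * 3 = 9
```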

      III PROPOSED SYSTEM

      As the multiplier is the slowest element in the system, it affects the performance of the FIR filter. A modified Wallace Tree Multiplier is therefore suggested, since it reduces area and is faster than other conventional multipliers. The proposed low area-cost FIR filter uses a modified Wallace Tree Multiplier. In a direct form filter, a new data sample and the corresponding filter coefficient can be applied to the multiplier inputs at each clock cycle. x[n] is the input signal. D flip-flops (D-FFs) are used as the delay elements. The modified Wallace Tree Multiplier block multiplies the input signal by the set of filter coefficients corresponding to the selected filter order and provides the output signal.

      FIGURE.1. MODIFIED WALLACE TREE MULTIPLIER BASED FIR FILTER
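The direct form described above, and the transposed form that is also evaluated in this work, can be compared with a cycle-by-cycle behavioural model. The Python sketch below is not the Tanner schematic; plain integer multiplication stands in for the multiplier block, Python lists stand in for the D-FF delay elements, and the coefficient values are hypothetical.

```python
def fir_direct(h, samples):
    """Direct form: the input passes through a delay line; all taps are multiplied and summed."""
    delay = [0] * len(h)                 # delay[0] holds x[n], delay[k] holds x[n-k]
    out = []
    for x in samples:                    # one iteration per clock cycle
        delay = [x] + delay[:-1]
        out.append(sum(c * d for c, d in zip(h, delay)))
    return out

def fir_transposed(h, samples):
    """Transposed form: x[n] feeds every multiplier; registers sit between the adders."""
    regs = [0] * (len(h) - 1)
    out = []
    for x in samples:
        prods = [c * x for c in h]
        out.append(prods[0] + (regs[0] if regs else 0))
        # Each register stores its tap product plus the previous value of the next register.
        regs = [prods[i + 1] + (regs[i + 1] if i + 1 < len(regs) else 0)
                for i in range(len(regs))]
    return out

h = [1, 2, 3]          # hypothetical coefficients
x = [1, 2, 3, 4]
print(fir_direct(h, x))      # [1, 4, 10, 16]
print(fir_transposed(h, x))  # same output sequence
```

Both forms compute the same output; they differ in where the registers sit, which is why their power, delay and transistor counts can differ in hardware.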

      Multipliers play an important role in today's digital signal processing and various other applications. Essential design targets for a multiplier include high speed, low power consumption and regularity of layout (and hence less area), or a combination of these in one multiplier, making it suitable for various VLSI implementations.

      The straightforward way to implement multiplication is an iterative adder-accumulator for the generated partial products, as shown in Figure 2. Such a multiplier is called a serial multiplier.

      FIGURE.2. ITERATIVE MULTIPLIER

      However, this solution is quite slow, as the final result is only available after n clock cycles, where n is the size of the operands. Serial multipliers are used where area and power are of utmost importance and increased delay can be tolerated. A faster version of the iterative multiplier adds the partial products at once. This is achieved by unfolding the iterative multiplier into a combinational circuit that consists of several partial product generators together with several adders that operate in parallel. This multiplier is called a parallel multiplier and is shown in Figure 3. The product is the result of multiplying the multiplicand by the multiplier. The multiplication operation is performed in two main steps. The first is partial product formation, which consists of AND-ing each bit of the multiplier with the multiplicand. Each successive partial product is shifted one bit position to the left relative to the previous partial product.

      FIGURE. 3. PARALLEL MULTIPLIER

      The second step is partial product accumulation, where the partial products are combined to find the result. Most multiplication techniques can be classified as array multipliers or tree multipliers. The different types of multipliers are discussed in detail in the following sections.
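The two steps of multiplication, and the one-partial-product-per-cycle behaviour of the serial multiplier, can be sketched as follows. This is a behavioural Python model with hypothetical function names; the cycle counter only mirrors the statement that an n-bit serial multiplier needs n clock cycles.

```python
def partial_products(multiplicand, multiplier, n=4):
    """Step 1: AND the multiplicand with each multiplier bit and shift left by the bit position."""
    return [(multiplicand if (multiplier >> i) & 1 else 0) << i for i in range(n)]

def serial_multiply(multiplicand, multiplier, n=4):
    """Step 2, done iteratively: one partial product is accumulated per clock cycle."""
    accumulator, cycles = 0, 0
    for partial in partial_products(multiplicand, multiplier, n):
        accumulator += partial
        cycles += 1
    return accumulator, cycles

product, cycles = serial_multiply(9, 6)
print(product, cycles)  # 54 after 4 cycles
```

A parallel multiplier forms the same list of partial products but sums them in a single combinational pass, trading the n cycles for additional adder hardware.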

      IV RESULT AND DISCUSSION

      The designs are created in the TANNER S-EDIT 13 tool and simulated using the TANNER T-SPICE 13.0 tool. The FIR filters are designed with multipliers, and the multipliers are internally designed using different types of adder units. The results of FIR filters using the modified Wallace Tree multiplier, the Braun multiplier and the Vedic multiplier are discussed. Various multipliers used in VLSI are discussed here; each multiplier is advantageous in a specific field of application. The Vedic multiplier is suitable for all kinds of multiplication. The direct form has a 48% reduction in count and the transposed form has a 45% reduction. Hence it can be concluded that the proposed 4×4 bit Vedic multiplier is highly efficient in terms of speed when compared to conventional multipliers. Reducing the time delay is an essential requirement for many applications, and the Vedic multiplication technique is well suited to this purpose. The idea proposed here may set a path for future research in this direction.

      FIGURE.3. SCHEMATIC OF 4X4 BIT BRAUN MULTIPLIER

      The structure consists of an array of AND gates and adders arranged in an iterative way. Such multipliers can be called non-additive multipliers.

      FIGURE.4. SIMULATION RESULT OF 4X4 BIT BRAUN MULTIPLIER

      In the internal structure, the two main points are that all partial products can be generated in parallel by the AND gates, and that each partial product can be added to the sum of the previously produced partial products by a row of adders.

      FIGURE.5. SCHEMATIC OF 4X4 BIT VEDIC MULTIPLIER

      FIGURE.6. SIMULATION RESULT OF 4X4 BIT VEDIC MULTIPLIER

      We have used 4-input adders, each of which adds 4 bits at a time and gives a two-bit carry and a 1-bit sum. First, the partial products are obtained using 2×2 Vedic multipliers; the partial product obtained from the LSB 2×2 multiplier is Q0(3:0).

      FIGURE.7. SCHEMATIC OF 4X4 BIT WALLACE TREE MULTIPLIER

      Its two LSBs Q0(1:0) are directly equal to the two LSBs of the output, p(1:0). The remaining bits Q0(3:2) are added to Q1(1:0), Q2(1:0) and the carry; then Q3(1:0) is added to Q1(3:2), Q2(3:2) and the carry; and finally Q3(3:2) is added to the carry. The additions are performed using two different adders, the carry save adder and the full adder, and the area and delay are obtained for each.
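The combination of the four 2×2 partial products Q0 to Q3 described above can be followed step by step in a short model. In this Python sketch the 2×2 block is replaced by plain integer multiplication as a stand-in, and the assignment of Q1 and Q2 to the two cross terms is an assumption (the two are interchangeable in the sum); only the way the 2-bit slices and carries are added follows the text.

```python
def mul2(a, b):
    """Stand-in for a 2x2 Vedic multiplier block (returns a 4-bit result)."""
    return (a & 3) * (b & 3)

def vedic_4x4(a, b):
    a_lo, a_hi = a & 3, (a >> 2) & 3
    b_lo, b_hi = b & 3, (b >> 2) & 3
    q0 = mul2(a_lo, b_lo)            # LSB block
    q1 = mul2(a_hi, b_lo)            # cross term (assumed ordering)
    q2 = mul2(a_lo, b_hi)            # cross term (assumed ordering)
    q3 = mul2(a_hi, b_hi)            # MSB block

    p10 = q0 & 3                                   # p(1:0) = Q0(1:0)
    total = (q0 >> 2) + (q1 & 3) + (q2 & 3)        # Q0(3:2) + Q1(1:0) + Q2(1:0)
    p32, carry = total & 3, total >> 2             # 2-bit sum slice, carry of up to 2 bits
    total = (q3 & 3) + (q1 >> 2) + (q2 >> 2) + carry
    p54, carry = total & 3, total >> 2
    p76 = (q3 >> 2) + carry                        # Q3(3:2) + carry

    return p10 | (p32 << 2) | (p54 << 4) | (p76 << 6)

print(vedic_4x4(15, 15))  # 225
```

Because up to four 2-bit values are added in one column, the carry out of a column can be as large as 2, which matches the two-bit carry mentioned for the 4-input adders.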

      FIGURE.8. SIMULATION RESULT OF 4X4 BIT WALLACE TREE MULTIPLIER

      The generalized comparison of various multipliers is shown in Table No 21. The lowest delay is that of the Wallace multiplier. The least areas are those of the modified Booth and Vedic multipliers, and the lowest power is that of the Vedic multiplier. However, the modified Braun is also a power-efficient multiplier.

      Braun multiplier: consider the multiplication of two unsigned 4-bit numbers. The multiplicand is A = a3, a2, a1, a0 and the multiplier is B = b3, b2, b1, b0; the product is then P = P7, P6, P5, P4, P3, P2, P1, P0. A Braun multiplier is an m x n parallel multiplier. It is also known as the carry save multiplier.

      FIGURE.9. SCHEMATIC OF DIRECT FORM FIR FILTER WITH BRAUN MULTIPLIER

      FIGURE.10. SIMULATION RESULT OF DIRECT FORM FIR FILTER WITH BRAUN MULTIPLIER

      The architecture of the n x n Braun multiplier array (here 4 x 4) consists of (n-1) rows of carry save adders, where each row has (n-1) full adders; the last row contains a ripple adder for the propagation of carry.
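The row structure described above can be reproduced in a bit-level model: (n-1) rows of (n-1) full adders that pass their carries downward, followed by one ripple row. The Python sketch below is an illustrative reconstruction of that arrangement, not an export of the schematic; the wiring of sums and carries between rows is an assumption chosen so that bits of equal weight meet in the same adder.

```python
def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def braun_multiply(a, b, n=4):
    A = [(a >> j) & 1 for j in range(n)]
    B = [(b >> i) & 1 for i in range(n)]

    sums = [A[j] & B[0] for j in range(n)]   # first row of partial products
    carries = [0] * (n - 1)
    product_bits = [sums[0]]                 # P0

    # (n-1) carry-save rows, each with (n-1) full adders.
    for i in range(1, n):
        new_sums, new_carries = [], []
        for j in range(n - 1):
            s, c = full_adder(A[j] & B[i], sums[j + 1], carries[j])
            new_sums.append(s)
            new_carries.append(c)            # carry goes down to the next row, not sideways
        new_sums.append(A[n - 1] & B[i])     # top partial product enters unchanged
        sums, carries = new_sums, new_carries
        product_bits.append(sums[0])         # P1 .. P(n-1)

    # Last row: ripple adder resolves the remaining sums and carries.
    carry = 0
    for j in range(n - 1):
        bit, carry = full_adder(sums[j + 1], carries[j], carry)
        product_bits.append(bit)
    product_bits.append(carry)               # most significant product bit

    return sum(bit << k for k, bit in enumerate(product_bits))

print(braun_multiply(13, 11))  # 143
```

The model uses (n-1)(n-1) adders in the carry-save rows plus (n-1) in the ripple row, which adds up to the n(n-1) adders quoted for this multiplier.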

      FIGURE.11. SCHEMATIC OF TRANSPOSED FORM FIR FILTER WITH BRAUN MULTIPLIER

      FIGURE.12. SIMULATION RESULT OF TRANSPOSED FORM FIR FILTER WITH BRAUN MULTIPLIER

      Multiply (that is, AND) each bit of one of the arguments by each bit of the other, yielding n^2 results. Depending on the position of the multiplied bits, the wires carry different weights; for example, the wire carrying the result of a4 b3 has weight 2^(4+3) = 128.

      FIGURE.13. SCHEMATIC OF DIRECT FORM FIR FILTER WITH WALLACE TREE MULTIPLIER

      Reduce the number of partial products to two by layers of full and half adders. Group the wires into two numbers, and add them with a conventional adder.

      FIGURE.14. SIMULATION RESULT OF DIRECT FORM FIR FILTER WITH WALLACE TREE MULTIPLIER

      FIGURE.15. SCHEMATIC OF TRANSPOSED FORM FIR FILTER WITH WALLACE TREE MULTIPLIER

      FIGURE.16. SIMULATION RESULT OF TRANSPOSED FORM FIR FILTER WITH WALLACE TREE MULTIPLIER

      The second step works as follows. As long as there are three or more wires with the same weight, add a following layer. Take any three wires with the same weight and input them into a full adder. The result is an output wire of the same weight and an output wire of a higher weight for each three input wires. If there are two wires of the same weight left, input them into a half adder. If there is just one wire left, connect it to the next layer.
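The Wallace reduction steps can be traced in a small column-based model, where each column holds the wires of one weight. This Python sketch follows the layer rule as stated above (full adders on groups of three, a half adder on two leftover wires, a single wire passed through); the data layout and function name are assumptions for illustration, and the final `+` stands in for the conventional adder.

```python
def wallace_multiply(a, b, n=4):
    # Step 1: AND every bit pair; the wire for a_i * b_j has weight 2^(i+j).
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append((a >> i) & (b >> j) & 1)

    # Step 2: add layers while any weight still has three or more wires.
    while any(len(col) >= 3 for col in columns):
        nxt = [[] for _ in range(len(columns) + 1)]
        for w, col in enumerate(columns):
            k = 0
            while len(col) - k >= 3:                 # full adder on three wires
                x, y, z = col[k:k + 3]
                nxt[w].append(x ^ y ^ z)
                nxt[w + 1].append((x & y) | (x & z) | (y & z))
                k += 3
            rest = col[k:]
            if len(rest) == 2:                       # half adder on two leftover wires
                nxt[w].append(rest[0] ^ rest[1])
                nxt[w + 1].append(rest[0] & rest[1])
            else:                                    # zero or one wire: pass to next layer
                nxt[w].extend(rest)
        columns = nxt

    # Step 3: group the remaining wires into two numbers and add them.
    row0 = sum(col[0] << w for w, col in enumerate(columns) if len(col) >= 1)
    row1 = sum(col[1] << w for w, col in enumerate(columns) if len(col) == 2)
    return row0 + row1

print(wallace_multiply(13, 11))  # 143
```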

      FIGURE.17. SCHEMATIC OF DIRECT FORM FIR FILTER WITH VEDIC MULTIPLIER

      FIGURE.18. SIMULATION RESULT OF DIRECT FORM FIR FILTER WITH VEDIC MULTIPLIER

      Consider a 4×4 bit multiplication with multiplicand a = a3a2a1a0 and multiplier b = b3b2b1b0. The output lines for the multiplication result are s7s6s5s4s3s2s1s0. Divide a and b into two parts, a3a2 and a1a0 for a, and b3b2 and b1b0 for b. Following the fundamentals of Vedic multiplication, taking two bits at a time and using a 2-bit multiplier block, we obtain the structure for 4×4 bit multiplication.

      FIGURE.19. SCHEMATIC OF TRANSPOSED FORM FIR FILTER WITH VEDIC MULTIPLIER

      FIGURE.20. SIMULATION RESULT OF TRANSPOSED FORM FIR FILTER WITH VEDIC MULTIPLIER

      Here we compare the Braun, Vedic and Wallace multipliers on three parameters: time/delay, power and area. Table I shows the comparison of power in nW and delay in ns.

      TABLE I. POWER COMPARISON

      | Type of Multiplier | Power (nW) | Delay (ns) |
      |---|---|---|
      | Braun Multiplier | 1.458 | 7.168 |
      | Wallace Multiplier | 1.491 | 13.452 |
      | Vedic Multiplier | 1.562 | 12.081 |
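Since the power-delay product (PDP) is used as a figure of merit in this work, the Table I values can be combined into a PDP per multiplier. The short Python calculation below is an illustrative sketch that simply multiplies the tabulated power and delay; the PDP figures it prints are derived here and are not reported in the paper. With power in nW and delay in ns, the product is in units of 10^-18 J.

```python
# (power in nW, delay in ns) taken from Table I
table_1 = {
    "Braun": (1.458, 7.168),
    "Wallace": (1.491, 13.452),
    "Vedic": (1.562, 12.081),
}

# nW * ns = 1e-9 W * 1e-9 s = 1e-18 J
pdp = {name: power * delay for name, (power, delay) in table_1.items()}
best = min(pdp, key=pdp.get)

for name, value in pdp.items():
    print(f"{name}: PDP = {value:.3f} x 1e-18 J")
print("lowest PDP:", best)
```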

    • AREA ANALYSIS

      From the results obtained for the various multipliers, it is observed that for the 4×4 case the Braun multiplier has the smallest area, but this may not hold as the operand width of the multiplier and multiplicand increases. The number of bits of the multiplier therefore decides the multiplier selection based on area.

      FIGURE.21. POWER COMPARISON CHART

    • DELAY ANALYSIS

From the results, it is observed that the Wallace multiplier has the highest delay despite having moderate area, while the Braun multiplier has the lowest delay, so the 4×4 Braun multiplier is used in applications where high speed is required. Table 2 compares the power and transistor count of the designs.

FIGURE.22. DELAY COMPARISON CHART

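TABLE 2. POWER AND TRANSISTOR COUNT COMPARISON

| Filter Type | Multiplier Type | Power (nW) | Transistor Count |
|---|---|---|---|
| Direct Form | Braun | 1.458 | 1572 |
| Direct Form | Wallace | 1.491 | 1624 |
| Direct Form | Vedic | 1.562 | 2324 |
| Transposed Form | Braun | 2.08 | 1688 |
| Transposed Form | Wallace | 2.93 | 1740 |
| Transposed Form | Vedic | 2.11 | 2447 |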

FIGURE.23. POWER AND TRANSISTOR COUNT COMPARISON CHART

The above graphical representation compares the proposed 4-bit Braun multiplier with the conventional 4-bit multiplier in terms of power dissipation and delay. It can be observed from the above graphs, from Fig. to Table 4, that for the proposed multiplier the power reduction is 41% and the delay reduction is 26% when compared with the conventional multiplier. These comparisons show that low power and high speed are achieved for the 4-bit multiplier using the Vedic and Braun multipliers.

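TABLE 3. MULTIPLIERS: DELAY, AREA AND POWER COMPARISONS OF DIRECT FORM & TRANSPOSED FORM OF FIR FILTER

| Filter Type | Multiplier Type | Power (nW) | Width (bit) | Delay (ns) | Transistor Count |
|---|---|---|---|---|---|
| 2-Tap Direct Form | Braun | 1.458 | 4 | 7.168 | 1572 |
| 2-Tap Direct Form | Wallace | 1.491 | 4 | 13.452 | 1624 |
| 2-Tap Direct Form | Vedic | 1.562 | 4 | 12.081 | 2324 |
| 2-Tap Transposed Form | Braun | 2.08 | 4 | 6.89 | 1688 |
| 2-Tap Transposed Form | Wallace | 2.09 | 4 | 12.8 | 1740 |
| 2-Tap Transposed Form | Vedic | 2.11 | 4 | 11.6 | 2447 |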

FIGURE.24. MULTIPLIERS: DELAY, AREA AND POWER COMPARISONS OF DIRECT FORM & TRANSPOSED FORM OF FIR FILTER

V CONCLUSION

    This paper presents different multiplier algorithms. Design improvements are made every day to existing devices for the best performance and efficiency. FIR filters are designed using Vedic multipliers, Wallace multipliers and carry array multipliers. The results show that the proposed architecture is more efficient than the existing design. The reduction in count is about 42% in the direct form, whereas in the transposed form the reduction is 30%. A further comparison has been made between the non-folded filter and the folded filter. The FIR filter using Vedic multipliers is efficient and can be used in DSP applications and generic processors for faster computations.

REFERENCES

  1. Jin-Fa Lin, Yin-Tsung Hwang, "A Novel High-Speed and Energy Efficient 10-Transistor Full Adder Design," Vol. 54, No. 5, May 2007.

  2. K. Manoharan and S. Punitha, "Low Power Delay Product 10T Adder Circuit," 2015.

  3. S. Punitha and K. Manoharan, "Ultra Low PDP Modified 10T Adder," Imperial Journal of Interdisciplinary Research (IJIR), Volume 2, Issue 2, pp. 1504-1508, 2016.

  4. S. Punitha and K. Manoharan, "A Literature Survey on Low PDP Adder Circuits," International Journal of Computer Science and Mobile Computing (IJCSMC), 4(12), pp. 289-298, 2015.

  5. E. Abu Shama and M. Bayou, "A new cell for low power Adders," in Proc. Int. Midwest Symp. Circuits Syst., 1995, pp. 1014-1017. DSP Journal, Volume 9, Issue 1, June 2009.

  6. T. Kowsalya, "Tree Structured Arithmetic circuit by using different CMOS logic styles," ICGSTPDCS, Volume 8, Issue 1, December 2008.

  7. Deepak, Gamier, P. K. Sluzek, "Performance Characteristics of Parallel and Pipelined Implementation of FIR Filters in FPGA Platform," in Signals, Circuits and Systems 2007 (ISSCS 2007), International Symposium on, 13-14 July 2007.

  8. N. Zhuang and H. Wu, "A new design of the CMOS full adder," IEEE J. Solid-State Circuits, vol. 27, no. 5, pp. 840-844, May 1992.

  9. J. Wang, S. Fang, and W. Feng, "New efficient designs for XOR and XNOR functions on the transistor level," IEEE J. Solid-State Circuits, vol. 29, no. 7, pp. 780-786, Jul. 1994.

  10. Reto Zimmermann and Wolfgang Fichtner, "Low-Power Logic Styles: CMOS versus Pass-Transistor Logic," IEEE Journal of Solid-State Circuits, Vol. 32, No. 7, April 1997, pp. 1079-1090.

  11. Zhijun Huang, "High level optimization techniques for low power Multiplier design," 2003.

  12. M. Kathirvelu, T. Mani gandam, "Design of low power, High speed FIR Filter with optimized PDP Adders and Flip-Flops for DSP Applications," European Journal of Scientific Research, ISSN 1450-216X, Vol. 76, No. 2 (2012), pp. 214-225.
