Optimal Multi-objective Approach for VLSI Implementation of Digital FIR Filters

DOI : 10.17577/IJERTV3IS21230


Mr. Jitesh R. Shinde

Research Scholar & IEEE member, Electronics Engg. dept., Priyadarshini College of Engg.,

Nagpur, India

Prof. (Dr.) S. S. Salankar

Electronics & Communication Engg. dept., G.H.Raisoni College of Engg.

Nagpur, India

Abstract Filters are at the heart of any Digital Signal Processing (DSP) based system, and multipliers and adders are the basic components of Finite Impulse Response (FIR) filters. The performance of a VLSI implementation of a DSP system is therefore generally determined by the performance of its multipliers and adders. The multiplier is generally the slowest element in the system, and it is generally also the most area-consuming. Optimizing both the speed and the area of the multiplier is a major design issue, because improving speed mostly results in larger area.

In this paper, an optimal multi-objective approach for VLSI implementation of digital FIR filters is suggested, wherein the three main design constraints, viz. area, speed and power, are optimized simultaneously without affecting the functionality of the design.

Keywords FIR filter, direct form FIR filter, transposed form FIR filter, MAC, Multiplier, MCM, SCM.

  1. INTRODUCTION

    Finite impulse response (FIR) filters are of great importance in digital signal processing (DSP) systems, since their linear-phase characteristics and feed-forward implementations make them very useful for building stable high-performance filters.

    In signal processing, an FIR filter is a filter whose impulse response (or response to any finite-length input) is of finite duration, because it settles to zero in finite time. This is in contrast to infinite impulse response (IIR) filters, which may have internal feedback and may continue to respond indefinitely (usually decaying) [1].

    Physically, a discrete system such as an FIR filter is realized either as digital hardware or as software running on digital hardware. The processing of the discrete-time signal by the digital hardware involves mathematical operations such as addition, multiplication and delay.

    The time domain representation of an Nth-order FIR system is

    $$y[n] = \sum_{i=0}^{N} b_i \, x[n-i] \qquad (1.1)$$

    Fig. 1.1: A discrete-time FIR filter of order N. The top part is an N-stage delay line with N + 1 taps

    The impulse response h[n] of a digital FIR filter can be calculated by setting x[n] = δ[n] in the above relation, where δ[n] is the Kronecker delta impulse. The impulse response of an FIR filter then becomes the set of coefficients b_n, as follows:

    $$h[n] = \sum_{i=0}^{N} b_i \, \delta[n-i] = b_n, \quad n = 0, 1, \dots, N \qquad (1.2)$$

    The Z-transform of the impulse response yields the transfer function of the FIR filter, i.e.

    $$H(z) = \mathcal{Z}\{h[n]\} = \sum_{n=-\infty}^{\infty} h[n] \, z^{-n} = \sum_{n=0}^{N} b_n \, z^{-n} \qquad (1.3)$$

    FIR filters are clearly bounded-input bounded-output (BIBO) stable, since the output is a sum of a finite number of finite multiples of the input values, and so can be no greater than $\sum_{i=0}^{N} |b_i|$ times the largest value appearing in the input [2].

  2. WHY FIR FILTERS

    An FIR filter has a number of useful properties which sometimes make it preferable to an infinite impulse response (IIR) filter. FIR filters:

    1. Require no feedback. This means that any rounding errors are not compounded by summed iterations. The same relative error occurs in each calculation. This also makes implementation simpler.

    2. Are inherently stable. Because there is no required feedback, all the poles are located at the origin and thus lie within the unit circle (the required condition for stability in a discrete, linear time-invariant system).

    3. Can easily be designed to be linear phase by making the coefficient sequence symmetric; linear phase, or phase change proportional to frequency, corresponds to equal delay at all frequencies. This property is sometimes desired for phase-sensitive applications, for example data communications, crossover filters, and mastering.
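    As a minimal illustrative sketch (not the authors' implementation), the difference equation (1.1) can be evaluated directly in Python. The coefficient values below are hypothetical and chosen only for the demonstration. Feeding a unit impulse returns the coefficients themselves, as stated by (1.2), and the output magnitude stays within the BIBO bound quoted in the introduction.

```python
def fir_direct(b, x):
    """Evaluate y[n] = sum_{i=0}^{N} b[i] * x[n - i], taking x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i, b_i in enumerate(b):
            if n - i >= 0:
                acc += b_i * x[n - i]
        y.append(acc)
    return y


# Hypothetical coefficients (N = 2), used only for illustration.
b = [1, 2, 3]

# A unit impulse at the input reproduces the coefficients (Eq. 1.2).
impulse_response = fir_direct(b, [1, 0, 0, 0, 0])
print(impulse_response)

# BIBO bound: |y[n]| never exceeds sum(|b_i|) * max(|x[n]|).
x = [4, -7, 2, 5, -1]
y = fir_direct(b, x)
bound = sum(abs(b_i) for b_i in b) * max(abs(v) for v in x)
print(max(abs(v) for v in y) <= bound)
```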

  3. BASIC BUILDING BLOCKS OF FIR FILTERS

    1. MAC Unit

      From Fig. 1.1, it is seen that the critical operations are multiplications and accumulations. Hence, for real-time signal processing, a high-speed, high-throughput Multiplier-Accumulator (MAC) is always key to achieving a high-performance digital signal processing system.

      A conventional MAC unit consists of a multiplier and an accumulator that holds the sum of the previous consecutive products. The function of the MAC unit is given by the following equation:

      $$F = \sum_{i} A_i \, B_i \qquad (3.1)$$

      Fig. 3.1.: Basic structure of MAC
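      A behavioural sketch of the MAC operation of Eq. (3.1) is given below, under the assumption that the operands arrive as two equal-length sequences; the function name `mac` is hypothetical. Each loop iteration corresponds to one multiply followed by one accumulate.

```python
def mac(a_values, b_values):
    """Accumulate the products A_i * B_i, one multiply-accumulate step per pair."""
    acc = 0  # accumulator register, cleared before the first product
    for a_i, b_i in zip(a_values, b_values):
        acc += a_i * b_i
    return acc


# Example: 1*4 + 2*5 + 3*6
print(mac([1, 2, 3], [4, 5, 6]))
```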

    2. ADDER UNIT

    In electronics, an adder is a digital circuit that performs addition of numbers. In modern computers, adders reside in the arithmetic logic unit (ALU), where other operations are also performed. Depending on the area, delay and power consumption requirements, several adder implementations have been proposed. Ripple carry adders (RCAs), which have the most compact design (O(n) area) among all types of adders, are the slowest (O(n) time). Carry select adders (O(√n) time, O(2n) area) lie between RCAs and carry lookahead adders (CLAs) (O(log n) time, O(n log n) area), and thus provide a compromise between the area-efficient RCAs and the shortest-delay CLAs.
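    To make the O(n) delay of the ripple carry adder concrete, the following bit-level sketch chains one full adder per bit, so the carry must propagate through every stage in turn. It is a behavioural model for illustration only; the function names and the 8-bit width in the example are assumptions.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(a, b, width):
    """Add two unsigned integers bit by bit; the carry ripples through all stages."""
    carry = 0
    result = 0
    for i in range(width):
        sum_bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry  # carry is the overflow bit of the final stage


print(ripple_carry_add(13, 7, 8))     # fits in 8 bits
print(ripple_carry_add(200, 100, 8))  # overflows 8 bits, so the carry-out is set
```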

  4. DESIGN ISSUES

    The main goal of a DSP processor design is to enhance the speed of the MAC unit, and at the same time limit the power consumption and number of gates (or area).

    There are three sources of power dissipation in CMOS circuits: switching power P_sw, short-circuit power P_sc, and leakage power P_leakage. P_sw is often the most significant source; therefore, efforts to reduce the power consumed in FIR system realizations focus on reducing P_sw. Since multiplications represent the most complex task in FIR computations, a lot of research has been carried out on reducing the complexity of, or totally eliminating, the multiplications used to compute the product terms in Eq. (1.1) [10].

    FIR filters involve a large number of multiplications in the filter algorithm, which are usually implemented in floating point arithmetic (e.g. the IEEE 754 double-precision binary floating-point format, binary64).

    The floating point number system can accommodate a large range of numbers, so higher accuracy in processing can be achieved with floating point arithmetic. However, the hardware implementation of floating point arithmetic is costlier, and the speed of processing is lower because of the double calculation, i.e. separate calculations for the mantissa and the exponent. In floating point arithmetic, truncation and rounding errors occur for both multiplication and addition, whereas in fixed point arithmetic such errors occur only for multiplication. Addition in fixed point arithmetic can lead to overflow, whereas overflow is a rare phenomenon in floating point arithmetic because of its larger dynamic range. Therefore, floating point arithmetic is preferred for non-real-time applications on general-purpose systems (computers), in which cost and speed are not significant, whereas fixed point arithmetic is preferred for real-time hardware implementations because of its lower hardware cost and higher processing speed [12].

  5. MULTI-OBJECTIVE PROBLEM FORMULATION

    Multi-objective optimization involves minimizing or maximizing multiple objective functions subject to a set of constraints. Example problems include analyzing design tradeoffs, selecting optimal product or process designs, or any other application that needs an optimal solution with tradeoffs between two or more conflicting objectives.

    In VLSI implementation of digital FIR filters, the design constraints which influence the performance of the filter are area, power and delay. The main hurdle is that a design can usually be made area efficient, power efficient or speed efficient, but not all three simultaneously. Optimizing one parameter affects the others.

    So, the objective of this research work is to develop, step by step, an optimal multi-objective approach for VLSI implementation of digital filters wherein all constraints, viz. area, power and time, are optimized simultaneously.

  6. DESIGN APPROACH

    1. Obtain Transposed Form of FIR Filter of figure 1.1.

      Fig. 5.1.: FIR Filter Transposed Form

      Advantages of transposed form are:-

      1. Computationally equivalent to direct form.

      2. Can be obtained by reversing order of final addition followed by retiming. Now, all multiplications share one input.

      3. The direct-form structure has the disadvantage that each adder has to wait for the previous adder to finish before it can compute its result. For high speed hardware such as FPGAs/ASICs, this adder chain forms a long combinational path which limits how fast the filter can be clocked. A solution is to use the transposed direct-form structure instead. With this structure, the delays between the adders can be used for pipelining, and therefore all additions and multiplications can be performed in a fully parallel fashion. This allows real-time handling of data with very high sampling frequencies and also provides a way to optimize the speed of the system.
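      The computational equivalence of the two forms can be checked with a small behavioural model. The sketch below is an assumption-laden illustration, not the authors' HDL: `fir_transposed` keeps one register between each pair of adders, every multiplier sees the same input sample, and each adder only combines one product with one registered value. The coefficients and input samples are hypothetical.

```python
def fir_direct(b, x):
    """Direct form: y[n] = sum_i b[i] * x[n - i], with x[n] = 0 for n < 0."""
    return [
        sum(b_i * x[n - i] for i, b_i in enumerate(b) if n - i >= 0)
        for n in range(len(x))
    ]


def fir_transposed(b, x):
    """Transposed form: all multipliers share the input; registers sit between adders."""
    n_regs = len(b) - 1
    state = [0] * n_regs  # register contents from the previous sample
    y = []
    for sample in x:
        products = [b_i * sample for b_i in b]  # shared-input multiplier block
        y.append(products[0] + (state[0] if n_regs else 0))
        # Each register stores its own product plus the next register's old value.
        state = [
            products[k + 1] + (state[k + 1] if k + 1 < n_regs else 0)
            for k in range(n_regs)
        ]
    return y


b = [2, 0, -1, 5]        # hypothetical coefficients
x = [3, -1, 4, 1, 5, 9]  # hypothetical input samples
print(fir_direct(b, x) == fir_transposed(b, x))
```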

  7. MULTIPLICATION

    Multiplication in digital FIR designs often involves multiplication by constant coefficients, as shown in Fig. 1.1. The shift-and-add loop of a traditional multiplier can be replaced with a set of high-speed wire shifts whose outputs are added in one quick step, while still realizing the binary multiplication of Equation (2.1), in which k_i denotes the i-th bit of the constant k:

    $$k = \sum_{i=0}^{n} 2^{i} \, k_i \qquad (2.1)$$

    This optimization is sometimes referred to as multiplierless design, although the shift-and-add structure created does still implement a multiplier. Single constant multiplication (SCM) is also a term used to describe the optimized constant multipliers. In hardware, the multiplication operation is considered expensive, as it occupies significant area. Hence, constant multiplications are generally realized using only addition, subtraction and shift operations [5].

    The logic for obtaining the shift-and-add structure of an SCM (Fig. 5.2) is to first convert the constant multiplicand into its binary form. For example, the constant (43)10 is converted to (101011)2. Then, to multiply x by (43)10, x is shifted by a set amount for each 1 digit in the binary encoding. The amount of the shift is determined by the weight of that particular bit position. For (43)10, the MSB of the binary encoding is a 1, so x needs to be shifted left by five because the MSB has a weight of 32. The final step is to add all of the shifted values to compute the product.

    Figure 5.2: Example of SCM approach based multiplier design

    The number of 2-input additions necessary to perform the constant multiplication is the number of nonzero digits of the binary representation minus one. The example coefficient (43)10, i.e. (101011)2, requires three adders to form a product because there are four nonzero digits. While this optimization for constant multiplications is useful, it is not optimal.

    The multiplier block of the digital FIR filter in its transposed form (Fig. 5.1), where the multiplication of the filter coefficients with the filter input is realized, has a significant impact on the complexity and performance of the design because a large number of constant multiplications are required. This is generally known as the multiple constant multiplications (MCM) operation. The goal is the minimization of nonzero terms within the discrete coefficients, as each nonzero term corresponds to an additional adder in the hardware implementation. Depending on the target hardware, it may be possible to implement a linear-phase FIR filter using fewer multipliers than the minimum-phase filter by taking advantage of the symmetry, even if the filter length of the linear-phase filter is larger [3,4].

    For the bit-parallel design of the MCM operation, the MCM problem is defined as finding the fewest addition and subtraction operations that realize the MCM, since shifts can be implemented using only wires in hardware.
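    The single-constant shift-and-add construction described in this section can be sketched as follows. This is a software illustration under the assumption of a non-negative integer constant; the function name `scm_shift_add` is hypothetical. It forms one shifted copy of x per nonzero bit and reports the adder count as the number of nonzero bits minus one.

```python
def scm_shift_add(x, constant):
    """Multiply x by a non-negative integer constant using only shifts and adds.

    Returns (product, number_of_two_input_adders).
    """
    # One wire shift per nonzero bit of the constant's binary form.
    shifted_terms = [
        x << bit for bit in range(constant.bit_length()) if (constant >> bit) & 1
    ]
    adders = max(len(shifted_terms) - 1, 0)
    return sum(shifted_terms), adders


# 43 = (101011)2 has four nonzero bits, so three adders are needed.
product, adders = scm_shift_add(7, 43)
print(product, adders)
```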

    Many efficient algorithms [6, 7] have been introduced for the MCM problem. Despite the various methods they use and the different search spaces they explore, the main idea has always been to maximize the sharing of common partial products among the constant multiplications. As an example, consider the constant multiplications 29x and 43x. Observe from Fig. 5.3(a)-(b) that sharing the partial products 3x and 5x reduces the number of operations from 6 to 4. The same partial product sharing approach has been used in our transposed form structure [11]. Thus, when MCM is used instead of SCM, additional savings can be achieved by reusing fundamentals between the constants.

    Fig. 5.3: Shift-adds implementations of 29x and 43x (a) without partial product sharing; (b) with partial product sharing.
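    The 29x and 43x example can be checked numerically. The decomposition below is one possibility that is consistent with the text (shared partial products 3x and 5x, four additions in total); since the figure itself is not reproduced here, the exact wiring is an assumption, and the function names are hypothetical.

```python
def mcm_without_sharing(x):
    """29x and 43x built independently from their binary forms: 3 + 3 = 6 adders."""
    p29 = (x << 4) + (x << 3) + (x << 2) + x  # 16x + 8x + 4x + x
    p43 = (x << 5) + (x << 3) + (x << 1) + x  # 32x + 8x + 2x + x
    return p29, p43, 6


def mcm_with_sharing(x):
    """29x and 43x built from the shared partial products 3x and 5x: 4 adders."""
    x3 = (x << 1) + x      # adder 1: 3x
    x5 = (x << 2) + x      # adder 2: 5x
    p29 = (x3 << 3) + x5   # adder 3: 24x + 5x
    p43 = (x5 << 3) + x3   # adder 4: 40x + 3x
    return p29, p43, 4


for sample in (1, 7, -12):
    assert mcm_without_sharing(sample)[:2] == (29 * sample, 43 * sample)
    assert mcm_with_sharing(sample)[:2] == (29 * sample, 43 * sample)
print(mcm_without_sharing(7)[2], mcm_with_sharing(7)[2])
```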


  8. COMPARISON AND SIMULATION RESULT

    In our work, we designed three LTI filters, viz. filter 1 (direct form), filter 2 (transposed form) and filter 3 (optimized direct form), and then compared their performance with respect to area, dynamic power dissipation and propagation delay.

    Firstly, a simple direct form FIR filter structure was implemented in MATLAB using the FDA (Filter Design & Analysis) tool with the following specifications:-

        • Design Method :- FIR equiripple

        • Response type :- Low pass

        • Filter order:- 17

    The magnitude responses of the direct form FIR filter using floating point arithmetic and fixed point arithmetic were found to be the same (figure 6.2).

    Next, using these filter coefficients, the direct form, transposed form and optimized direct form FIR filter structures were implemented using Active HDL, and their performance with respect to area, timing and dynamic power consumption was analysed using the Xilinx tool at RTL level and the Cadence SOC Encounter tool at layout level.

    Elaborated settings done in the FDA tool are shown in the figure below:-

    Fig.6.1: FIR Equiripple filter specification on FDA tool

    From the FDA tool, the filter coefficients for the direct form FIR filter structure were obtained. These coefficients included negative values and were in floating point format. In order to optimize the resources used (i.e. gates and hence area) at RTL (Register Transfer Level), as well as processing performance, system cost and ease of use, and since the dynamic range of the output is known, the floating point coefficients were converted into fixed point coefficients by multiplying them by 1000 and rounding the result. After that, negative coefficients were converted into positive coefficients by taking the absolute value.
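    The conversion procedure described above (scale by 1000, round, take the absolute value) can be sketched as follows. The coefficient values are hypothetical placeholders, not the values produced by the FDA tool, and the function name `to_fixed_point` is an assumption.

```python
def to_fixed_point(coefficients, scale=1000):
    """Scale each floating point coefficient, round it, and take the absolute value."""
    return [abs(round(c * scale)) for c in coefficients]


# Hypothetical floating point coefficients, for illustration only.
float_coefficients = [-0.0123, 0.0456, 0.2504, 0.0456, -0.0123]
print(to_fixed_point(float_coefficients))
```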

    Fig. 6.2: Magnitude response of FIR filter (a) using floating point arithmetic; (b) using fixed point arithmetic

    Fig.6.3: Direct Form FIR filter using Active HDL

    The multiplier unit of the MAC in the direct form FIR filter is implemented using a generic multiplier. The MCM block of the MAC unit in the transposed form is implemented using the structure shown in figure 5.3(b). As the MCM approach is derived from the SCM approach, the number of adders required to implement the MAC unit of the transposed structure will be high, which affects the area and power consumption of the transposed form FIR filter structure.


    To overcome the problems faced in implementing the MAC unit of the transposed form digital filter structure, we suggested a slight modification to the transposed form FIR filter structure, and the direct form (DF) FIR filter was also redesigned with the same approach. In the optimized structure, the concept of coefficient reuse is used to further reduce the resources used in comparison to the structure shown in figure 5.3(b) and the direct form FIR filter structure. In other words, it was observed that, of the 18 filter coefficients obtained from the FDA tool of MATLAB, five coefficients were repeated two or three times. So we designed only five multiplier units based on the structure shown in figure 5.3(b). The final direct form structure of the FIR filter and the transposed direct form structure are shown in figure 6.3 and figure 6.4 respectively. The optimized DF structure is shown in figure 6.5.
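    The coefficient reuse idea can be illustrated with a short sketch: each distinct coefficient value gets one multiplier, and every tap that uses that value reads the same product. The coefficient list is hypothetical (it is not the 18-tap set from the FDA tool), and `shared_products` is an assumed helper name.

```python
def shared_products(coefficients, x):
    """Compute one product per distinct coefficient and fan it out to all taps.

    Returns (per_tap_products, number_of_multipliers_used).
    """
    distinct = sorted(set(coefficients))
    product_of = {c: c * x for c in distinct}  # one multiplier per distinct value
    return [product_of[c] for c in coefficients], len(distinct)


# Hypothetical symmetric coefficient set: 5 taps but only 3 distinct values.
taps = [12, 46, 250, 46, 12]
products, multipliers = shared_products(taps, 3)
print(products, multipliers)
```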


    Fig.6.4: Transposed of Direct Form FIR filter using Active HDL


    Fig.6.5: Optimized Direct Form FIR filter using Active HDL

    On comparing the direct form and transposed form FIR structures, it was observed that the latch size in the transposed form increases at each stage, and hence the adder size also increases. This may increase the area overhead, and hence the power consumption of the design, and may also add to the latency of the circuit. The same was observed after the implementation of the transposed structure.

    The compilation reports of the direct form (filter 1), transposed form (filter 2) and optimized direct form (MCM with partial product sharing) (filter 3) FIR filters are given below:-

    Table 7.3.1 Compilation Summary of Filter 1, Filter 2 & Filter 3

    | Parameter | Filter 1 | Filter 2 | Filter 3 |
    |---|---|---|---|
    | Technology | gscl45nm | gscl45nm | gscl45nm |
    | Global Operating Voltage | 1.1 V | 1.1 V | 1.1 V |
    | Combinational area | 27431.992899 sq.nm | 28421.277155 sq.nm | 14868.831567 sq.nm |
    | Noncombinational area | 1404.145630 sq.nm | 5616.582520 sq.nm | 5620.336920 sq.nm |
    | Total cell area | 28836.138528 sq.nm | 34037.859675 sq.nm | 20489.168487 sq.nm |
    | Cell Internal Power | 3.5082 mW | 7.7709 mW | 1.4805 mW |
    | Net Switching Power | 2.3820 mW | 5.0423 mW | 864.7971 uW |
    | Total Dynamic Power | 5.8902 mW | 12.8132 mW | 2.3453 mW |
    | Cell Leakage Power | 172.0155 uW | 225.0586 uW | 152.1185 uW |
    | Worst Case Propagation delay (RTL Xilinx report) | 21.101 nsec | 11.129 nsec | 6.619 nsec |
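    As a quick arithmetic check, the relative savings of filter 3 with respect to filter 1 can be recomputed from the tabulated values. The helper below is a sketch; `percent_reduction` is an assumed name, and the inputs are copied from the table (total cell area, total dynamic power and worst case propagation delay of filter 1 and filter 3).

```python
def percent_reduction(baseline, optimized):
    """Relative saving of `optimized` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - optimized) / baseline


# Filter 1 (direct form) versus filter 3 (optimized direct form).
area_saving = percent_reduction(28836.138528, 20489.168487)  # total cell area
power_saving = percent_reduction(5.8902, 2.3453)             # total dynamic power, mW
delay_saving = percent_reduction(21.101, 6.619)              # worst case delay, nsec

print(f"area:  {area_saving:.3f} %")
print(f"power: {power_saving:.3f} %")
print(f"delay: {delay_saving:.3f} %")
```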

  9. CONCLUSION

In this paper, the implementation of low power, speed efficient and area efficient FIR filters using the filter coefficient reuse concept and an MCM technique based on partial product sharing has been considered, wherein multiplication operations are replaced by shift-and-add operations. The experimental results showed that, in the optimized direct form FIR filter structure, area was reduced by 28.946 %, dynamic power consumption by 60.183 % and worst case propagation delay by 68.632 % in comparison to the direct form FIR filter structure. This indicates that our proposed modification of the direct form FIR filter structure, together with the MCM technique based on partial product sharing and the concept of coefficient sharing, leads to an area efficient, low power and high speed digital FIR filter structure for DSP systems.

Future research includes improving the performance of the FIR system by implementing, if possible, the adder unit using fast adders, and a full characterization of each design option at layout level.

REFERENCES

  1. L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1975. ISBN 0-13-914101-4.

  2. A. E. Cetin, O. N. Gerek, and Y. Yardimci, "Equiripple FIR filter design by the FFT algorithm," IEEE Signal Processing Magazine, pp. 60-64, March 1997.

  3. K. Johansson, O. Gustafsson, and L. Wanhammar, "Multiple Constant Multiplication for Digit-Serial Implementation of Low Power FIR Filters," WSEAS Transactions on Circuits and Systems, vol. 5, no. 7, pp. 1001-1008, 2006.

  4. Y. Voronenko and M. Püschel, "Multiplierless Multiple Constant Multiplication," ACM Transactions on Algorithms, vol. 3, no. 2, 2007.

  5. H. Nguyen and A. Chatterjee, "Number-Splitting with Shift-and-Add Decomposition for Power and Hardware Optimization in Linear DSP Synthesis," IEEE Trans. on VLSI, vol. 8, no. 4, pp. 419-424, 2000.

  6. L. Aksoy, C. Lazzari, E. Costa, P. Flores, and J. Monteiro, "Efficient shift-adds design of digit-serial multiple constant multiplications," in Proc. Great Lakes Symp. VLSI, 2011, pp. 61-66.

  7. A. Dempster and M. Macleod, "Use of Minimum-Adder Multiplier Blocks in FIR Digital Filters," IEEE TCAS II, vol. 42, no. 9, pp. 569-577, 1995.

  8. A. Shahein, Q. Zhang, N. Lotze, and Y. Manoli, "A Novel Hybrid Monotonic Local Search Algorithm for FIR Filter Coefficients Optimization," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 59, no. 3, March 2012.

  9. S. Smith, The Scientist and Engineer's Guide to Digital Signal Processing, Second Edition, Chapter 28, pp. 514-519. San Diego, California: California Technical Publishing.

  10. N. Sankarayya, K. Roy, and D. Bhattacharya, "Optimizing Computations in Transposed Direct Form Realization of Floating-Point LTI FIR Systems," in Digest of Technical Papers, 1997 IEEE/ACM International Conference on Computer-Aided Design, 1997.

  11. L. Aksoy, C. Lazzari, E. Costa, P. Flores, and J. Monteiro, "Optimization of Area in Digit-Serial Multiple Constant Multiplications at Gate-Level," in Proc. 2011 IEEE International Symposium on Circuits and Systems (ISCAS), Rio de Janeiro.

  12. A. Nagoor Kani, Digital Signal Processing, Second Edition, Chapter 8, pp. 8.1-8.16. Tata McGraw-Hill.
