Low Power 40-bit SQRT Carry Select Adder

DOI : 10.17577/IJERTV1IS10422


Debarshi Datta1, Partha Mitra2, Avisek Sen3

1,2,3 Assistant Professor, SDET Brainware Group of Institutions

Abstract

Digital processors require a high-speed, low-power Multiplier-Accumulator (MAC) unit, and the adder is the main building block of a DSP processor. Digital adders, however, suffer from carry propagation delay. Carry Select Adders (CSLA) are used in computational units to alleviate this problem. The regular CSLA still leaves scope for reducing power consumption, and a simple gate-level modification is enough to achieve it. This paper proposes a modified 40-bit square-root CSLA (SQRT CSLA) architecture. Both the regular and the modified 40-bit CSLA are designed in TSMC 0.13-µm CMOS process technology. Compared with the regular SQRT CSLA, the proposed design reduces area and power with only a slight increase in delay.

Keywords: CSLA, DSP processor, Low power, MAC Unit, Power Delay Product, VLSI

  1. Introduction

    Owing to the rapid growth of portable electronic devices, low-power arithmetic circuits have become very important in the VLSI industry. The Multiplier-Accumulator (MAC) unit is the main building block of a DSP processor. The Full Adder is part of the MAC unit and can significantly affect the efficiency of the whole system. Reducing the power consumption of the Full Adder circuit is therefore necessary for low-power applications. Carry Select Adders are used in high-speed applications because they reduce the propagation delay.

    The basic operation of the Carry Select Adder (CSLA) is parallel computation. The CSLA generates multiple carries and partial sums [1], and multiplexers (mux) select the final sum and carry.

    The CSLA structure uses multiple pairs of Ripple Carry Adders (RCA) and is therefore not area efficient. In this paper, we propose a modified CSLA architecture.

    The proposed method uses a Binary to Excess-1 Converter (BEC) instead of the RCA with Cin = 1 in the regular CSLA. The BEC logic is attractive because it needs fewer logic gates than an n-bit Full Adder structure, so the modified CSLA architecture has lower area and power consumption [2]-[4]. The details of the BEC logic are discussed in Section III.

    This paper is organized as follows. Section II presents the delay and area evaluation methodology of the basic adder blocks. Section III describes the structure and function of the BEC logic. The SQRT CSLA has been chosen for comparison with the proposed design as it has a more balanced delay and requires lower power [5]-[6]. The delay and area evaluation methodologies of the regular and modified SQRT CSLA are presented in Sections IV and V, respectively. Section VI reviews the simulation results and Section VII concludes this work.

  2. Delay and Area Evaluation Methodology of the Basic Adder Block

An XOR gate built from AND, OR, and Inverter (AOI) gates is shown in Fig. 1. The gates between the dotted lines operate in parallel, and the numeral on each gate indicates the delay contributed by that gate. In this evaluation methodology every gate is assumed to have 1 unit of delay and 1 unit of area. The maximum delay of a logic block is found by adding the gate delays along its longest path. Based on this approach, the CSLA blocks, namely the XOR gate, 2:1 mux, Half Adder (HA), and Full Adder (FA), are evaluated and listed in Table I.

Fig. 1. Delay evaluation of an XOR gate.

TABLE I: DELAY AND AREA COUNT OF THE BASIC BLOCKS OF CSLA

| Adder block | Delay | Area |
|-------------|-------|------|
| XOR         | 3     | 5    |
| 2:1 Mux     | 3     | 4    |
| Half Adder  | 3     | 6    |
| Full Adder  | 6     | 13   |

  3. BEC Logic Gate

The proposed method uses BEC logic. The regular CSLA structure consists of two Ripple Carry Adders (RCA): one with initial carry Cin = 0 and the other with Cin = 1. The BEC replaces the RCA with Cin = 1 in order to reduce the area and power consumption of the regular CSLA. To replace an n-bit RCA, an (n+1)-bit BEC is required. The structure of a 4-bit BEC is shown in Fig. 2, and Table II gives its function table.

Fig. 2. 4-b BEC.

Fig. 3 shows how a 4-bit BEC and an 8:4 mux perform the basic function of the CSLA. One input of the mux is the direct input (B3, B2, B1, and B0) and the other input is the BEC output. The two possible partial results are thus produced in parallel, and the mux selects either the BEC output or the direct input according to the control signal Cin. The Boolean expressions of the 4-bit BEC are listed here (functional symbols: ~ NOT, & AND, ^ XOR).

X0 = ~B0

X1 = B0 ^ B1

X2 = B2 ^ (B0 & B1)

X3 = B3 ^ (B0 & B1 & B2)
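The Boolean expressions of the 4-bit BEC can be checked with a short behavioral sketch. The function name `bec4` and the integer encoding of B[3:0] are illustrative assumptions, not part of the original design files.

```python
def bec4(b: int) -> int:
    """4-bit Binary to Excess-1 Converter, written bit by bit (hypothetical helper)."""
    b0, b1, b2, b3 = ((b >> i) & 1 for i in range(4))
    x0 = b0 ^ 1                 # X0 = ~B0
    x1 = b0 ^ b1                # X1 = B0 ^ B1
    x2 = b2 ^ (b0 & b1)         # X2 = B2 ^ (B0 & B1)
    x3 = b3 ^ (b0 & b1 & b2)    # X3 = B3 ^ (B0 & B1 & B2)
    return (x3 << 3) | (x2 << 2) | (x1 << 1) | x0


# Every 4-bit input should map to input + 1, wrapping from 1111 to 0000.
for value in range(16):
    assert bec4(value) == (value + 1) % 16
```

The loop exercises all sixteen input codes, so it covers the rows that the function table abbreviates.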

Fig. 3. 4-b BEC with 8:4 mux

TABLE II: FUNCTION TABLE OF THE 4-BIT BEC

| B[3:0] | X[3:0] |
|--------|--------|
| 0000   | 0001   |
| 0001   | 0010   |
| 0010   | 0011   |
| ...    | ...    |
| 1110   | 1111   |
| 1111   | 0000   |
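The way a BEC and a mux together replace the second RCA can be illustrated with a behavioral model of one CSLA group. This is a sketch under stated assumptions: the names `bec` and `csla_group` are hypothetical, the RCA with Cin = 0 is modeled with Python integer addition rather than gate by gate, and the BEC is generalized to any width by XOR-ing each bit with the AND of all lower bits.

```python
def bec(b: int, width: int) -> int:
    """Generic BEC: X[i] = B[i] ^ (B[0] & ... & B[i-1]), with X[0] = ~B[0]."""
    out, lower_all_ones = 0, 1
    for i in range(width):
        bit = (b >> i) & 1
        out |= (bit ^ lower_all_ones) << i
        lower_all_ones &= bit
    return out


def csla_group(a: int, b: int, cin: int, n: int):
    """One n-bit group: RCA with Cin = 0, an (n+1)-bit BEC, and a mux driven by cin."""
    partial = a + b                    # {carry, sum} of the RCA with Cin = 0
    plus_one = bec(partial, n + 1)     # BEC supplies the Cin = 1 result
    selected = plus_one if cin else partial   # mux selection
    return selected & ((1 << n) - 1), selected >> n   # (sum, carry out)


# Exhaustive check for a 3-bit group against ordinary addition.
n = 3
for a in range(1 << n):
    for b in range(1 << n):
        for cin in (0, 1):
            s, cout = csla_group(a, b, cin, n)
            assert (cout << n) | s == a + b + cin
```

The check passes because the (n+1)-bit output of an n-bit RCA never exceeds 2^(n+1) - 2, so adding one in the BEC cannot overflow.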

  4. Delay and Area Evaluation Methodology of Regular 16-bit SQRT CSLA

    The structure of the 16-b regular SQRT CSLA is shown in Fig. 4. It has five groups of RCAs of different sizes. The delay and area evaluation of each group is shown in Fig. 5, in which the numerals within [] specify the delay values. The steps leading to the evaluation are as follows.

    Fig. 4. Regular 16-bit SQRT CSLA

    Fig. 5. Delay and area evaluation of regular SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. F is Full Adder.

    1. Group2 [see Fig. 5(a)] has two sets of 2-b RCA. Based on the delay values of Table I, the selection input c1 [time (t) = 7] of the 6:3 mux arrives earlier than s3 [t = 8] and later than s2 [t = 6]. Thus, sum3 [t = 11] is the summation of s3 and mux [t = 3], and sum2 [t = 10] is the summation of c1 and mux.

    2. The delays of group3 to group5 are determined, respectively, as follows:

      {c6, sum[6:4]} = c3 [t = 10] + mux

      {c10, sum[10:7]} = c6 [t = 13] + mux

      {cout, sum[15:11]} = c10 [t = 16] + mux

    3. One set of the 2-b RCA in group2 has 2 FA for Cin = 1 and the other set has 1 FA and 1 HA for Cin = 0. Based on the area counts of Table I, the total number of gates in group2 is calculated as follows:

      Gate count = 57 (FA + HA + Mux)

      FA = 39 (3 * 13)

      HA = 6 (1 * 6)

      Mux = 12 (3 * 4)

    4. Similarly, the maximum delay and area of the other groups of the regular SQRT CSLA are evaluated and listed in Table III.

      TABLE III: DELAY AND AREA COUNT OF REGULAR SQRT CSLA GROUPS

      | Group  | Delay | Gate Count |
      |--------|-------|------------|
      | Group2 | 11    | 57         |
      | Group3 | 13    | 87         |
      | Group4 | 16    | 117        |
      | Group5 | 19    | 147        |
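The gate counts and carry delays of the regular groups can be reproduced with a few lines of arithmetic. The sketch below assumes that an n-bit regular group contains 2n - 1 Full Adders, 1 Half Adder, and n + 1 mux cells; this generalizes the group2 breakdown given in step 3 and is an inference, although it agrees with every row of Table III. The group widths are read from the sum ranges in step 2, and all names are illustrative.

```python
# Unit area and delay values taken from Table I.
AREA = {"FA": 13, "HA": 6, "MUX": 4}
MUX_DELAY = 3


def regular_group_area(n: int) -> int:
    """Gate count of an n-bit regular group (assumed structure, see lead-in)."""
    full_adders = 2 * n - 1   # n for the Cin = 1 RCA, n - 1 for the Cin = 0 RCA
    half_adders = 1           # first stage of the Cin = 0 RCA
    mux_cells = n + 1         # n sum bits plus the carry out
    return (full_adders * AREA["FA"]
            + half_adders * AREA["HA"]
            + mux_cells * AREA["MUX"])


widths = {"Group2": 2, "Group3": 3, "Group4": 4, "Group5": 5}
areas = {group: regular_group_area(n) for group, n in widths.items()}

# Carry chain of step 2: each later group adds one mux delay to the incoming carry.
carry_time = 10               # c3 [t = 10] leaving group2
carry_times = []
for _ in ("Group3", "Group4", "Group5"):
    carry_time += MUX_DELAY
    carry_times.append(carry_time)

print(areas)        # {'Group2': 57, 'Group3': 87, 'Group4': 117, 'Group5': 147}
print(carry_times)  # [13, 16, 19]
```

The group2 delay of 11 in Table III is not produced by the carry chain; it comes from sum3, which waits for s3 [t = 8] plus one mux delay as described in step 1.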

  5. Delay and Area Evaluation Methodology of Modified 16-bit SQRT CSLA

    The modified 16-bit SQRT CSLA is shown in Fig. 6. Each RCA with Cin = 1 is replaced by BEC logic. The evaluation steps are as follows:

    Fig. 6. Modified 16-b SQRT CSLA. The parallel RCA with Cin = 1 is replaced with BEC.

    Fig. 7. Delay and area evaluation of modified SQRT CSLA: (a) group2, (b) group3, (c) group4, and (d) group5. H is Half Adder.

    1. Group2 [see Fig. 7(a)] has one 2-b RCA, which has 1 FA and 1 HA for Cin = 0. A 3-b BEC is used in place of the other 2-b RCA with Cin = 1; the 3-b BEC adds one to the output of the 2-b RCA. Based on the delay values of Table I, the selection input c1 [time (t) = 7] of the 6:3 mux arrives earlier than s3 [t = 9] and c3 [t = 10] and later than s2 [t = 4]. Thus, sum3 depends on s3 and the mux, and the final c3 (output of the mux) depends on the partial c3 (input to the mux) and the mux. The sum2 depends on c1 and the mux.

    2. The area count of group2 is calculated as follows:

      Gate count = 43 (FA + HA + Mux + BEC)

      FA = 13 (1 * 13)

      HA = 6 (1 * 6)

      AND = 1

      NOT = 1

      XOR = 10 (2 * 5)

      Mux = 12 (3 * 4)

    3. The maximum delay and the area of each group of the modified SQRT CSLA are evaluated and listed in Table IV.
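The group2 area count of step 2, and the overall difference between the regular and modified designs, can be checked with simple arithmetic. In this sketch the per-group delay and gate-count pairs are copied from Tables III and IV; the dictionary names are illustrative, and no general formula is assumed for the larger modified groups.

```python
# Unit areas: XOR, mux, HA, and FA from Table I; AND and NOT count as 1 gate each.
AREA = {"FA": 13, "HA": 6, "AND": 1, "NOT": 1, "XOR": 5, "MUX": 4}

# Component counts of modified group2 as itemized in step 2.
group2_parts = {"FA": 1, "HA": 1, "AND": 1, "NOT": 1, "XOR": 2, "MUX": 3}
group2_area = sum(AREA[name] * count for name, count in group2_parts.items())

# (delay, gate count) per group, copied from Table III and Table IV.
regular = {"Group2": (11, 57), "Group3": (13, 87),
           "Group4": (16, 117), "Group5": (19, 147)}
modified = {"Group2": (13, 43), "Group3": (16, 61),
            "Group4": (19, 84), "Group5": (22, 107)}

gates_saved = (sum(area for _, area in regular.values())
               - sum(area for _, area in modified.values()))
extra_delay = (sum(delay for delay, _ in modified.values())
               - sum(delay for delay, _ in regular.values()))

print(group2_area)   # 43
print(gates_saved)   # 113
print(extra_delay)   # 11
```

Summing the tabulated values gives 408 gates for the regular groups and 295 for the modified groups, which is where the saving of 113 gates comes from.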

    TABLE IV: DELAY AND AREA COUNT OF MODIFIED SQRT CSLA

    | Group  | Delay | Gate Count |
    |--------|-------|------------|
    | Group2 | 13    | 43         |
    | Group3 | 16    | 61         |
    | Group4 | 19    | 84         |
    | Group5 | 22    | 107        |

Comparing Tables III and IV, it is clear that the proposed modified SQRT CSLA saves 113 gates compared with the regular SQRT CSLA, at the cost of only 11 additional units of gate delay.

  6. Simulation Results

The proposed 40-bit SQRT CSLA has been developed using TSMC 0.13-µm CMOS process technology.

TABLE V: COMPARISON OF THE REGULAR AND MODIFIED 40-BIT SQRT CSLA ARCHITECTURE

| Type of Adder | Supply Voltage (V) | Delay (ns) | Switching Power (µW) | Power-Delay Product (10^-15 J) |
|---------------|--------------------|------------|----------------------|--------------------------------|
| Regular CSLA  | 1.5                | 5.986      | 1283.7               | 7684.2                         |
| Modified CSLA | 1.5                | 6.316      | 1057.5               | 6488.8                         |

Table V shows that the proposed architecture consumes less power at the cost of a slight increase in propagation delay. The modified 40-bit SQRT CSLA improves the Power-Delay Product (PDP) by 15.6%. The adder circuit is operated at 125 MHz with a supply voltage of 1.5 V.

  7. Conclusion

In this paper, a modified 40-bit SQRT CSLA has been proposed for the data path circuit (MAC unit) of low-power DSP applications. Table V shows that the modified CSLA reduces the power-delay product (PDP) compared with the regular CSLA, with a slight increase in delay. The modified 40-bit SQRT CSLA architecture can therefore be used in low-power, high-speed DSP processors.

Acknowledgment

The authors would like to thank Advanced VLSI Design Lab, IIT Kharagpur for their co-operation and support.

References

  1. O. J. Bedrij, "Carry-select adder," IRE Trans. Electron. Comput., pp. 340-344, 1962.

  2. B. Ramkumar, H. M. Kittur, and P. M. Kannan, "ASIC implementation of modified faster carry save adder," Eur. J. Sci. Res., vol. 42, no. 1, pp. 53-58, 2010.

  3. T. Y. Chang and M. J. Hsiao, "Carry-select adder using single ripple carry adder," Electron. Lett., vol. 34, no. 22, pp. 2101-2103, Oct. 1998.

  4. Y. Kim and L.-S. Kim, "64-bit carry-select adder with reduced area," Electron. Lett., vol. 37, no. 10, pp. 614-615, May 2001.

  5. J. M. Rabaey, Digital Integrated Circuits: A Design Perspective. Upper Saddle River, NJ: Prentice-Hall, 2001.

  6. Cadence, "Encounter user guide," Version 6.2.4, March 2008.
