 Open Access
 Total Downloads : 1271
 Authors : R.Chaitanyakumar, P.Sunitha
 Paper ID : IJERTV1IS8636
 Volume & Issue : Volume 01, Issue 08 (October 2012)
 Published (First Online): 29102012
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
A Novel Approach For Designing A Low Power Parallel Prefix Adders
R.Chaitanyakumar M Tech student,
Pragati Engineering College, Surampalem (A.P, IND).
Abstract
This paper presents an implementation of prefix adders and compare with the Ripple carry adder (RCA) and Carry skip adder (CSA) in terms of power and speed. The binary adder is the critical element in most digital circuit designs including digital signal processors (DSP) and microprocessor data path units. As such, extensive research continues to be focused on improving the power delay performance of the adder. In VLSI implementations, parallelprefix adders are known to have the best performance.
Binary adders are one of the most essential logic elements within a digital system. In addition, binary adders are also helpful in units other than Arithmetic Logic Units (ALU), such as multipliers, dividers and memory addressing. Therefore, binary addition is essential that any improvement in binary addition can result in a performance boost for any computing system and, hence, help improve the performance of the entire system. In this project XilinxISE tool is used for simulation, logical verification, and further synthesizing.

Introduction
To humans, decimal numbers are easy to comprehend and implement for performing arithmetic. However, in digital systems, such as a microprocessor, DSP (Digital Signal Processor) or ASIC (ApplicationSpecific Integrated Circuit), binary numbers are more pragmatic for a given computation. This occurs because binary values are optimally efficient at representing many values.
Binary adders are one of the most essential logic elements within a digital system. In addition, binary adders are also helpful in units other than Arithmetic Logic Units (ALU), such as multipliers, dividers and memory addressing. Therefore, binary addition is essential that any improvement in binary addition can
P.Sunitha
Assistant Professor, Dept.of ECE Pragati Engineering College, Surampalem (A.P, IND).
Result in a performance boost for any computing system and, hence, help improve the performance of the entire system.
The major problem for binary addition is the carry chain. As the width of the input operand increases, the length of the carry chain increases. Figure 1.1 demonstrates an example of an 8 bit binary add operation and how the carry chain is affected. This example shows that the worst case occurs when the carry travels the longest possible path, from the least significant bit (LSB) to the most significant bit (MSB). In order to improve the performance of carrypropagate adders, it is possible to accelerate the carry chain, but not eliminate it. Consequently, most digital designers often resort to building faster adders when optimizing computer architecture, because they tend to set the critical path for most computations.
Figure 1. Binary Adder Example
The binary adder is the critical element in most digital circuit designs including digital signal processors (DSP) and microprocessor data path units. As such, extensive research continues to be focused on improving the power delay performance of the adder. In VLSI implementations, parallelprefix adders are known to have the best performance. Reconfigurable logic such as Field Programmable Gate Arrays (FPGAs) has been gaining in popularity in recent years because it offers improved performance in terms of speed and power over DSPbased and microprocessor based solutions for many practical designs involving mobile DSP and telecommunications applications and a significant reduction in development time and cost over Application Specific Integrated Circuit (ASIC) designs.
The power advantage is especially important with the growing popularity of mobile and portable electronics, which make extensive use of DSP functions. However, because of the structure of the
configurable logic and routing resources in FPGAs, parallelprefix adders will have a different performance than VLSI implementations. In particular, most modern FPGAs employ a fastcarry chain which optimizes the carry path for the simple Ripple Carry Adder (RCA). In this paper, the practical issues involved in designing and implementing treebased adders on FPGAs are described. Several treebased adder structures are implemented and characterized on a FPGA and compared with the Ripple Carry Adder (RCA) and the Carry Skip Adder (CSA). Finally, some conclusions and suggestions for improving FPGA designs to enable better treebased adder performance are given.

Research Contributions
The implementations that have been developed in this dissertation help to improve the design of parallel prefix adders and their associated computing architectures. This has the potential of impacting many application specific and general purpose computer architectures. Consequently, this work can impact the designs of many computing systems, as well as impacting many areas of engineers and science. In this paper, the practical issues involved in designing and implementing treebased adders on FPGAs are described. Several treebased adder structures are implemented and characterized on a FPGA and compared with the Ripple Carry Adder (RCA) and the Carry Skip Adder (CSA). Finally, some conclusions and suggestions for improving FPGA designs to enable better treebased adder performance are given

ParallelPrefix Structures
To resolve the delay of carrylook ahead adders, the scheme of multilevellook ahead adders or parallel prefix adders can be employed. The idea is to compute small group of intermediate prefixes and then find large group prefixes, until all the carry bits are computed. These adders have tree structures within a carry computing stage similar to the carry propagate adder. However, the other two stages for these adders are called precomputation and postcomputation stages. In precomputation stage, each bit computes its carry generate/propagate and a temporary sum. In the prefix stage, the group carry generate/propagate signals are computed to form the carry chain and provide the carry in for the adder below.
Gi:k = Gi:j + Pi:j . Gj1:k Pi:k = Pi:j . Pj1:k
In the postcomputation stage, the sum and carry out are finally produced. The carryout can be omitted if only a sum needs to be produced.
si = ti ^ Gi:1
cout = gn1 + pn1 _ Gn2:1
where Gi:1 = ci with the assumption g1 = cin.
All parallelprefix structures can be implemented with the equations above; however, Equation can be interpreted in various ways, which leads to different types of parallelprefix trees.
Parallelprefix adders, also known as carrytree adders, precalculate the propagate and generate signals. These signals are variously combined using the fundamental carry operator (fco) cells.
(gL, pL) (gR, pR) = (gL + pLgR, pL pR)
Due to associative law of the fco, these operators can be combined in different ways to form various adder structures. For, suppose the 4bit carrylook ahead generator is given by.
c4 = (g4, p4) [(g3, p3) [(g2, p2) (g1, p1)] ]
A simple rearrangement of the order of operations allows parallel operation, resulting in a more efficient tree structure for this 4bit example
c4 = [(g4, p4) (g3, p3)] [(g2, p2 ) (g1, p1)]
It is readily apparent that a key advantage of the tree structured adder is that the critical path due to the carry delay is on the order of log2N for an Nbit wide adder. The arrangement of the prefix network gives rise to various families of adders. For this study, the focus is onthe KoggeStone adder, known for having minimal logic depth and fanout (see Figure 3.1). Here we designate BC as the black cell which generates the ordered pair in equation; the gray cell (GC) generates the left signal only. The interconnect area is known to be high, but for an FPGA with large routing overhead to begin with, this is not as important as in a VLSI implementation. The regularity of the KoggeStone prefix network has built in redundancy which has implications for faulttolerant designs.

KoggeStone Prefix Tree
KoggeStone prefix tree is among the type of prefix trees that use the fewest logic levels. A 16bit example is shown in Figure 3.1. In fact, KoggeStone is a member of Knowles prefix tree. The 16bit prefix tree can be viewed as Knowles [1, 1, 1, 1]. The numbers in the brackets represent the maximum branch fanout at each logic level. The maximum fanout is 2 in all logic levels for all width KoggeStone prefix trees.
The key of building a prefix tree is how to implement Equation according to the specific features of that type of prefix tree and apply the rules described in the previous section. Gray cells are inserted similar to black cells except that the gray cells final output carry outs instead of intermediate G/P group. The
reason of starting with KoggeStone prefix tree is that it is the easiest to build in terms of using a program concept. The example in Figure 3.1 is 16bit (a power of 2) prefix tree. It is not difficult to extend the structure to any width if the basics are strictly followed.
Figure 3.1: 16bit KoggeStone Prefix Tree
For the KoggeStone prefix tree, at the logic level 1, the inputs span is 1 bit, example group (4:3) take the inputs at bit 4 and bit 3). Group (4:3) will be taken as inputs and combined with group (6:5) to generate group (6:3) at logic level 2. Group (6:3) will be taken as inputs and combined with group (10:7) to generate group (10:3) at logic level 3, and so on so forth.
Table I: Subset of (g, p) Relations Used for Testing

Sparse KoggeStone adder
The sparse KoggeStone adder, shown in Figure 3.2.This hybrid design completes the summation process with a 4 bit RCA allowing the carry prefix network to be simplified.

Spanning Tree CarryLook ahead Adder
Another carrytree adder is the spanning tree carrylook
ahead (CLA) adder is also examined. Like the sparse KoggeStone adder, this design terminates with a four bit Ripple Carry Adder (RCA). As the FPGA uses a fast carrychain for the Ripple Carry Adder (RCA), it is interesting to compare the performance of this adder with the sparse KoggeStone and regular KoggeStone adder.
Figure 3.3: 16bit Spanning Tree Carry Look ahead Adder.


Experimental Results
The Parallel Prefix adders of koggiestone adder, sparse koggiestone and spanning tree carry look ahead adders are simulated and synthesized verilog using XilinxISE tools, and we found that, these parallel prefix adders are consuming less power and having high speed functionality as compared with the Ripple carry adder (RCA) and Carry Skip adder (CSA).The Experimental results of corresponding adders are given below.
Figure 4.1 Design summary of KoggeStone Adder
The design summary of Koggestone adders total number of 4input LUTs, logic Utilization are given in
4.1 figure.
Figure 4.2 Design summary of Sparse KoggeStone adder.
The design summary of Sparse Koggestone adders total number of 4input LUTs, logic Utilization are given in 4.2 figure.
Figure 4.3 Design summary of Spanning Tree Carry look ahead Adder.
The design summary of Spanning tree adders total number of 4input LUTs, logic Utilization are given in

figure.
Figure 4.4 Design summary of Carry Skip Adder. The design summary of Carry Skip adders total number of 4input LUTs, logic Utilization are given in

figure.
Figure 4.5 Design summary of Ripple carry adder.
The design summary of Ripple Carry adders total
number of 4input LUTs, logic Utilization are given in

figure.
The Experimental waveforms of Parallel Prefix adders are given below

KoggeStone Adder Waveform.
Figure 4.6 Waveform of KoggeStone adder

Spanning Tree Adder wave form.
Figure 4.7 Waveform of Spanning Tree adder

Sparse KoggeStone Adder
Figure 4.6 Waveform of Sparse KoggeStone adder


Conclusions
In this paper we have developed an efficient Parallel Prefix Adders to achieve less power consumption and better speed performance when compared to Ripple Carry Adder and Carry Skip Adder.
The Experimental Power Consumption Results are given below.
Ripple Carry Adder consumes 25mw. Carry Skip Adder consumes 27mw. KoggeStone Adder consumes 20mw.
Sparse KoggeStone Adder consumes 17mw. Spanning Tree Adder consumes 16mw.
Based on these results, the Parallel Prefix Adders consume less power when compared to Ripple Carry
Adder and Carry Skip Adder.
References

N. H. E. Weste and D. Harris, CMOS VLSI Design,
th
4 edition, PearsonAddisonWesley, 2011.

R. P. Brent and H. T. Kung, A regular layout for parallel adders, IEEE Trans. Comput., vol. C 31, pp.

D. Harris, A Taxonomy of Parallel Prefix Networks, in Proc. 37th Asilomar Conf. Signals Systems and Computers, pp. 22137, 2003.

P. M. Kogge and H. S. Stone, A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations, IEEE Trans. on Computers, Vol. C22, No 8, August 1973.

P. Ndai, S. Lu, D. Somesekhar, and K. Roy, Fine Grained Redundancy in Adders, Int. Symp. On Quality Electronic Design, pp. 317321, March 2007.

T. Lynch and E. E. Swartzlander, A Spanning Tree Carry Lookahead Adder, IEEE Trans. on Computers, vol. 41, no. 8, pp. 931939, Aug. 1992.

D. Gizopoulos, M . Psarakis, A . Paschalis, a n d Y . Zorian, Easily Testable Cellular Carry Lookahead Adders, Journal of Electronic Testing: Theory and Applications 19, 285298, 2003.

S. Xing and W. W. H. Yu, FPGA Adders: Performance Evaluation and Optimal Design, IEEE Design & Test of Computers, vol. 15, no. 1, pp. 2429, Jan. 1998.

M. BevÃ¡ and P. tukjunger, FixedPoint Arithmetic in FPGA, Acta Polytechnica, vol. 45, no. 2, pp. 6772, 2005

K. Vitoroulis and A. J. AlKhalili, Performance of Parallel Prefix Adders Implemented with FPGA technology, IEEE Northeast Workshop on Circuits and Systems, pp. 498501, Aug. 2007.