 Open Access
 Authors : Sitanshu Satpathy, Suhail Ayub, Manan Soni, Marimuthu R
 Paper ID : IJERTV3IS100583
 Volume & Issue : Volume 03, Issue 10 (October 2014)
 Published (First Online): 21-10-2014
 ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Energy Efficient Implementation of Modified 4:2 Compressor in High Speed Multipliers
Sitanshu Satpathy, Suhail Ayub, Manan Soni, Marimuthu R (Asst. Prof. Senior)
Electronics and Instrumentation, Vellore Institute of Technology, Vellore, India
Abstract: This paper examines the use of 4:2 compressors in 4×4 and 8×8 Wallace tree multipliers. Using compressors in a multiplier improves its efficiency and reduces the processing time. A modified 4:2 compressor design is discussed and its performance is compared with that of the conventional 4:2 compressor. The modified 4:2 compressor uses a combination of XOR-XNOR and MUX* gates. The compressors and the multipliers built from them are evaluated at transistor level. The comparison shows that using compressors makes multiplication faster and more energy efficient than the traditional adder-only approach. In 4 bit multipliers, the modified 4:2 compressor reduced the delay by 38.2% and the propagation delay product by 34.7%. Similarly, in 8 bit multipliers the modified 4:2 compressor reduced the delay by 42% and the propagation delay product by 47%.
Keywords: adders, compressor, propagation delay, power consumption, Wallace tree multiplier

INTRODUCTION
Multiplication is an integral part of many processing activities and has become one of the most indispensable operations in modern processing equipment [1,2,3,4,5,6,7,8,9]. Because multiplication is a complicated process, it is time consuming and has high energy requirements [1,10,11,12,13]. Any improvement in the multiplication process therefore improves the speed of the entire processing activity and reduces both the power consumption and the leakage power [1,14]. Technology scaling is one traditional method that has been used to reduce energy requirements, but it has the drawback that it cannot sustain a constant power density.
The multiplication process can be divided into three stages: generation of the partial products, reduction of the partial products and the final addition [1,2,3,4,5,6,7,8,15,16,17,18]. The first stage uses AND gates to generate the partial products, which are then reduced in the second stage. The reduction stage is where most of the design changes are made, and it is where compressors are used to improve performance.
The Wallace tree structure uses a combination of full adders and half adders in the partial product reduction stage [1,5,6,15,17,18]. This structure is modified using the compressor architecture, which minimizes the number of components. The use of compressors has shown a considerable improvement in the multiplication process as a whole [1,3,4,5,6,7,8,9,19,20,21]. Using a final adder for the final addition stage improves the energy efficiency of the multiplier.

WALLACE TREE MULTIPLIER
The Wallace tree method is used to design an efficient multiplication structure using half adders and full adders [1,5,6,14,16,17]. It gives an efficient hardware implementation of a digital circuit that multiplies two numbers. It reduces the partial products using two levels of half adders and full adders. When designing a Wallace tree, note that any given half adder or full adder may only take in wires of the same weight. In this paper we use 4 bit and 8 bit Wallace tree multipliers.
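The reduction described above can be sketched behaviourally. The following Python model is only an illustration under stated assumptions, not the circuit simulated in this paper: the function names (`partial_products`, `reduce_once`, `wallace_multiply`) are hypothetical, the grouping of bits into adders is one simple choice among several, and the final addition stage is modelled with ordinary integer addition.

```python
def half_adder(a, b):
    """Return (sum, carry) for two bits of equal weight."""
    return a ^ b, a & b


def full_adder(a, b, c):
    """Return (sum, carry) for three bits of equal weight."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def partial_products(x, y, n):
    """AND every bit of x with every bit of y.

    columns[w] collects the partial-product bits of weight 2**w.
    """
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))
    return columns


def reduce_once(columns):
    """One reduction level: adders only combine bits of the same weight."""
    out = [[] for _ in range(len(columns) + 1)]
    for w, bits in enumerate(columns):
        k = 0
        while len(bits) - k >= 3:              # full adder: 3 bits -> 2 bits
            s, c = full_adder(*bits[k:k + 3])
            out[w].append(s)
            out[w + 1].append(c)               # carry moves to the next weight
            k += 3
        if len(bits) - k == 2:                 # half adder: 2 bits -> 2 bits
            s, c = half_adder(bits[k], bits[k + 1])
            out[w].append(s)
            out[w + 1].append(c)
            k += 2
        out[w].extend(bits[k:])                # a single leftover bit passes through
    return out


def wallace_multiply(x, y, n):
    """Multiply two n-bit numbers by column reduction and a final addition."""
    columns = partial_products(x, y, n)
    while max(len(bits) for bits in columns) > 2:
        columns = reduce_once(columns)
    # Final addition stage, modelled here with plain integer arithmetic.
    return sum(sum(bits) << w for w, bits in enumerate(columns))
```

Calling `wallace_multiply(x, y, 4)` for every pair of 4 bit operands reproduces `x * y`, because each adder preserves the weighted sum of the bits it consumes.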

4:2 COMPRESSOR
The 4:2 compressor design reduces the complexity of the Wallace tree structure for multiplication [1]. A 4:2 compressor has four inputs and two outputs, the sum and the carry, in addition to a carry in (cin) and a carry out (cout) [5,6,8,9,17,20]. The cin of a compressor is the cout of the previous compressor, and its cout becomes the cin of the next compressor. The block structure of a 4:2 compressor is shown in Fig. 1.
Fig.1 4:2 compressor
The standard governing equation for a 4:2 compressor with inputs a0 to a3 is:
a0 + a1 + a2 + a3 + cin = sum + 2*(carry + cout)
The 4:2 compressor design has been implemented using the following methods:
1. the conventional method using two full adders, as shown in Fig. 2;
2. XOR-XNOR gates and a modified MUX, i.e. MUX*, as shown in Figs. 3, 4 and 5 [9].
Of these methods, the use of two full adders is considered the standard design for a 4:2 compressor.
Fig. 2 shows the conventional 4:2 compressor using two full adders. The first full adder has three inputs, a0, a1 and a2, and its sum becomes one of the three inputs of the second full adder, with cin and a3 being the other two. The outputs of the second full adder become the sum and carry outputs of the 4:2 compressor, and the carry of the first full adder becomes the cout of the compressor.
Fig.2 Conventional 4:2 compressor
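The two-full-adder arrangement can be checked against the governing equation with a short behavioural model. This is a minimal sketch, assuming the four data inputs are named a0 to a3; the function names are hypothetical and the model says nothing about delay or power.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_42_conventional(a0, a1, a2, a3, cin):
    """4:2 compressor built from two cascaded full adders.

    Returns (sum, carry, cout).
    """
    s1, cout = full_adder(a0, a1, a2)       # first adder: cout does not depend on cin
    total, carry = full_adder(s1, a3, cin)  # second adder produces sum and carry
    return total, carry, cout


def satisfies_governing_equation():
    """Check a0+a1+a2+a3+cin == sum + 2*(carry + cout) for all 32 input patterns."""
    for a0, a1, a2, a3, cin in product((0, 1), repeat=5):
        total, carry, cout = compressor_42_conventional(a0, a1, a2, a3, cin)
        if a0 + a1 + a2 + a3 + cin != total + 2 * (carry + cout):
            return False
    return True
```

Because cout is produced by the first full adder alone, it never depends on cin, so a row of chained compressors does not ripple a carry along its whole length.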
Fig. 3 Modified MUX
Fig. 4 XOR-XNOR gate
Figs. 3 and 4 show the designs of the MUX* and XOR-XNOR gates. MUX* has one more output than a normal MUX, namely the complement of the usual MUX output. The XOR-XNOR gate produces the results of the XOR and XNOR operations in a single block, which reduces the delay. The larger number of transistors trades lower delay for higher area and power consumption [9].
Fig. 5 shows the modified compressor design using the proposed XOR-XNOR and MUX* method.
The modified design in [9] had 72 transistors. It was further optimized by removing unnecessary NOT gates, which reduced the number of transistors to 66 and further reduced the delay.
Fig. 5 Modified 4:2 compressor
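The outputs of the modified architecture are given by the following equations [9]:
$$\text{sum} = \left[(a_0 \oplus a_1)\,\overline{(a_2 \oplus a_3)} + \overline{(a_0 \oplus a_1)}\,(a_2 \oplus a_3)\right]\overline{c_{in}} + \left[(a_0 \oplus a_1)\,(a_2 \oplus a_3) + \overline{(a_0 \oplus a_1)}\;\overline{(a_2 \oplus a_3)}\right] c_{in}$$
$$c_{out} = (a_0 \oplus a_1)\,a_2 + \overline{(a_0 \oplus a_1)}\,a_0$$
$$\text{carry} = (a_0 \oplus a_1 \oplus a_2 \oplus a_3)\,c_{in} + \overline{(a_0 \oplus a_1 \oplus a_2 \oplus a_3)}\,a_3$$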
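The output equations can be exercised with a behavioural model built from XOR-XNOR and MUX* blocks. The sketch below rests on assumptions: the equations are read with XOR between the inputs and with complemented select terms, and the wiring of the blocks is one plausible arrangement, not necessarily the exact gate-level netlist of Fig. 5. The names `xor_xnor`, `mux_star` and `compressor_42_modified` are hypothetical.

```python
from itertools import product


def xor_xnor(a, b):
    """XOR-XNOR block: returns (a XOR b, a XNOR b) together."""
    x = a ^ b
    return x, x ^ 1


def mux_star(d0, d1, sel):
    """MUX* block: returns the selected input and its complement."""
    out = d1 if sel else d0
    return out, out ^ 1


def compressor_42_modified(a0, a1, a2, a3, cin):
    """Assumed XOR-XNOR / MUX* realisation; returns (sum, carry, cout)."""
    x01, _ = xor_xnor(a0, a1)
    x23, xn23 = xor_xnor(a2, a3)
    # a0^a1^a2^a3: pass x23 when x01 is 0, its complement when x01 is 1.
    x0123, xn0123 = mux_star(x23, xn23, x01)
    # sum = x0123 XOR cin, again realised as a selection.
    total, _ = mux_star(x0123, xn0123, cin)
    # cout = (a0^a1).a2 + NOT(a0^a1).a0
    cout, _ = mux_star(a0, a2, x01)
    # carry = (a0^a1^a2^a3).cin + NOT(a0^a1^a2^a3).a3
    carry, _ = mux_star(a3, cin, x0123)
    return total, carry, cout


def modified_is_consistent():
    """Check all 32 input patterns against the governing equation."""
    for a0, a1, a2, a3, cin in product((0, 1), repeat=5):
        total, carry, cout = compressor_42_modified(a0, a1, a2, a3, cin)
        if a0 + a1 + a2 + a3 + cin != total + 2 * (carry + cout):
            return False
    return True
```

Under this reading the three outputs satisfy a0 + a1 + a2 + a3 + cin = sum + 2*(carry + cout) for every input pattern, so the modified design is functionally interchangeable with the two-full-adder version.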
Fig. 6 shows the register transfer level (RTL) schematic of the 4:2 compressor, generated with the Xilinx ISE RTL schematic viewer.
Fig. 6 RTL schematic of 4:2 compressor
Fig. 7 Transistor level schematic of a full adder


IMPLEMENTATION OF COMPRESSOR IN MULTIPLIERS
First, the eight input bits (four from each operand) are fed to AND gates, which form sixteen partial products arranged in a tree-like structure, as shown in Fig. 8. These partial products are then fed to half adders, full adders and compressors for reduction [22].
Fig. 8 Reduction of partial products
Conventional 4 bit Wallace tree multipliers use only adders, which can take a maximum of three inputs. A 4:2 compressor takes four inputs at a time and therefore considerably reduces the number of adders required.
The same methodology is used to reduce the partial products of 8 bit Wallace tree multipliers, with the cout of each compressor becoming the cin of the next compressor.
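The chaining of cout to cin can be illustrated with a row of behavioural 4:2 compressors that reduces four partial-product rows to a sum row and a carry row. This is a simplified sketch, not the reduction tree of Fig. 8: it places one compressor on every bit position, whereas the multipliers in this paper combine compressors with half adders and full adders. The helper names and the 8 bit row width are assumptions for illustration.

```python
def compressor_42(a0, a1, a2, a3, cin):
    """Behavioural 4:2 compressor: returns (sum, carry, cout)."""
    s1 = a0 ^ a1 ^ a2
    cout = (a0 & a1) | (a1 & a2) | (a0 & a2)
    total = s1 ^ a3 ^ cin
    carry = (s1 & a3) | (a3 & cin) | (s1 & cin)
    return total, carry, cout


def compress_rows(rows, width):
    """Reduce four rows to (sum_row, carry_row, top_cout) with chained compressors."""
    sum_row, carry_row, cin = 0, 0, 0
    for w in range(width):
        bits = [(row >> w) & 1 for row in rows]
        total, carry, cout = compressor_42(*bits, cin)
        sum_row |= total << w
        carry_row |= carry << (w + 1)   # carry has twice the weight of sum
        cin = cout                      # cout feeds the next compressor's cin
    return sum_row, carry_row, cin << width


def multiply_4x4(x, y):
    """4x4 multiply: AND-gate partial products, compressor row, final addition."""
    # Row j is x shifted by j when bit j of y is set (the AND-gate stage).
    rows = [(x << j) if (y >> j) & 1 else 0 for j in range(4)]
    sum_row, carry_row, top = compress_rows(rows, 8)
    # Final addition stage, modelled with integer addition.
    return sum_row + carry_row + top
```

For example, `compress_rows([13, 7, 9, 15], 4)` returns three values whose total is 44, the sum of the four rows, which is the property the final adder relies on.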

SIMULATION AND RESULTS
The functionality of the compressors was verified at gate level in Xilinx ISE Design Suite 12.3 using a Verilog module and a Verilog test bench. In addition, transistor level analysis was done in Cadence Virtuoso, from which we determined the propagation delay, the propagation delay product (PDP) and the power consumption. The propagation delay product is the product of the propagation delay and the power consumed.
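As a worked check of this definition, the sketch below multiplies the delay and power values reported for the two compressors in Table 1. It assumes the delay is in nanoseconds, the power in milliwatts, and that the PDP column of Table 1 is scaled by 10^-11; the function name `pdp` is hypothetical.

```python
def pdp(delay_ns, power_mw):
    """Product of propagation delay and power, in SI units (s * W)."""
    return (delay_ns * 1e-9) * (power_mw * 1e-3)


# Delay (ns) and power (mW) as reported in Table 1.
conventional_pdp = pdp(0.205, 4.04)
modified_pdp = pdp(0.008, 6.7)

# Express both in the 10^-11 scale assumed for the Table 1 PDP column.
conventional_scaled = round(conventional_pdp / 1e-11, 2)
modified_scaled = round(modified_pdp / 1e-11, 3)
```

With these inputs the scaled values come out as 0.08 and 0.005, which agrees with the PDP entries listed for the conventional and modified compressors.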
Table 1 shows the performance comparison of the modified and the conventional 4:2 compressors. Table 2 shows the propagation delay of the various multipliers simulated using data flow Verilog modeling in Xilinx ISE Design Suite 12.3. Transistor level analysis of the 4 bit and 8 bit Wallace tree multipliers, with and without compressors, was done in Cadence Virtuoso. The transistors had a length of 250 nm and a width of 1.5 µm, and the supply voltage (Vdd) was 5 V. Table 3 shows the results of the transistor level analysis.
Table 1 Performance comparison between the conventional and modified 4:2 compressor.
| 4:2 compressor | Propagation delay (ns) | Power (mW) | PDP (propagation delay product) (10^-11 joule-sec) |
| --- | --- | --- | --- |
| Conventional | 0.205 | 4.04 | 0.08 |
| Modified | 0.008 | 6.7 | 0.005 |
| S.no | Multiplier | Delay (ns) |
| --- | --- | --- |
| 1 | 4×4 Wallace tree | 14.306 |
| 2 | 4×4 using 2 full adders | 13.987 |
| 4 | 4×4 using modified MUX | 13.563 |
| 5 | 8×8 Wallace tree | 30.479 |
| 6 | 8×8 using conventional 4:2 | 23.648 |
| 7 | 8×8 using modified 4:2 | 22.976 |
Table 2 Propagation delay found using Xilinx ISE Design Suite 12.3 for various multipliers
| Wallace tree multiplier type | Propagation delay (ns) | Power (mW) | PDP (propagation delay product) (10^-11 joule-sec) |
| --- | --- | --- | --- |
| 4×4 | 0.4128 | 27.654 | 1.141 |
| 4×4 using conventional 4:2 compressor | 0.3531 | 22.029 | 0.778 |
| 4×4 using modified 4:2 compressor | 0.255 | 25.400 | 0.745 |
| 8×8 | 0.5546 | 159.5 | 8.845 |
| 8×8 using conventional 4:2 compressor | 0.4451 | 123.57 | 5.500 |
| 8×8 using modified 4:2 compressor | 0.321 | 141 | 4.713 |
Table 3 Cadence Virtuoso results
The transistor level analysis in Table 3 shows that, for 4 bit multipliers, the modified 4:2 compressor reduces the delay by 38.2%, an improvement over the conventional 4:2 compressor, which reduces the delay by about 14.5% (from the Table 3 delays of 0.4128 ns and 0.3531 ns). The propagation delay product decreases by 34.7% with the modified design, compared with 31.8% with the conventional one.
Similarly, in the 8 bit multiplier the modified 4:2 compressor reduces the delay by 42% and the propagation delay product by 47%, whereas the conventional 4:2 compressor reduces the delay and the propagation delay product by 20% and 38% respectively.
These results show that the modified 4:2 compressor design performs better than the conventional design.
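The percentages quoted above follow directly from the Table 3 entries. The sketch below recomputes them from the reported delay and PDP values; the dictionary layout and the function names are illustrative assumptions, and the numbers are copied from Table 3 without modification.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# (propagation delay in ns, PDP in units of 10^-11) as listed in Table 3.
TABLE3 = {
    "4x4": {"plain": (0.4128, 1.141),
            "conventional": (0.3531, 0.778),
            "modified": (0.255, 0.745)},
    "8x8": {"plain": (0.5546, 8.845),
            "conventional": (0.4451, 5.500),
            "modified": (0.321, 4.713)},
}


def reductions(size, variant):
    """Return (delay reduction %, PDP reduction %) versus the plain Wallace tree."""
    base_delay, base_pdp = TABLE3[size]["plain"]
    delay, pdp = TABLE3[size][variant]
    return (round(percent_reduction(base_delay, delay), 1),
            round(percent_reduction(base_pdp, pdp), 1))
```

For instance, `reductions("4x4", "modified")` gives (38.2, 34.7) and `reductions("8x8", "conventional")` gives (19.7, 37.8), matching the rounded figures in the text.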
CONCLUSION
Our analysis of the circuits shows that modifying the Wallace tree multiplier with compressors gives a shorter critical path and therefore a higher processing speed. Using 4:2 compressors reduces the number of adders required, which reduces the complexity of the circuit and minimizes the time delay. The proposed modified 4:2 compressor design reduces the propagation delay further and thus makes the Wallace tree multiplier faster. Although the modified compressor consumes somewhat more power than the conventional 4:2 compressor, the resulting decrease in the propagation delay product offsets this increase. Together, these results make the new compressor an efficient option for high speed multiplier designs.
Graph 1a Performance comparison between the conventional and the modified 4:2 compressor on the basis of propagation delay (ns), power consumed (mW) and propagation delay product (10^-13 joule-sec).
Graph 1b Variation in the propagation delay (ns) in the multipliers used.
Graph 1c Variation in the power consumed (mW) by the multipliers used.
Graph 1d Variation in the propagation delay product (joule-sec) of the multipliers used.
REFERENCES

[1] N. Sureka, Ms. R. Porselvi, & Ms. K. Kumuthapriya (2013). An efficient high speed Wallace tree multiplier. In the proceedings of International Conference on Information Communication and Embedded Systems (ICICES).
[2] Shen-Fu Hsiao, Ming-Roun Jiang, & Jia-Sien Yeh (1998). Design of high-speed low-power 3-2 counter and 4-2 compressor for fast multipliers. Electronics Letters, Vol. 34, No. 4.
[3] Ohsang Kwon, Kevin Nowka, & Earl E. Swartzlander, Jr. (2002). A 16-Bit by 16-Bit MAC Design Using Fast 5:3 Compressor Cells. Journal of VLSI Signal Processing 31, 77-89.
[4] Weinan Ma, & Shuguo Li (2008). A new high compression compressor for large multiplier. In the proceedings of Solid-State and Integrated-Circuit Technology, 2008. ICSICT 2008.
[5] Baran D., Aktan M., & Oklobdzija, V.G. (2010). Energy Efficient Implementation of Parallel CMOS Multipliers with Improved Compressors. In the proceedings of International Symposium on Low-Power Electronics and Design (ISLPED), ACM/IEEE.
[6] N. Ravi, T. Jayachandra Prasad, T. Subba Rao, & M. Umamahesh (2010). Performance Evaluation of High Speed Compressors for High Speed Multipliers Using 90nm Technology. Recent Advances in Space Technology Services and Climate Change (RSTSCC).
[7] R. Marimuthu, S. Balamurugan, Bala Krishna Tirumala, & P.S. Mallick (2012). FPGA Implementation of High Speed Multiplier using Higher Order Compressors. 2012 International Conference on Radar, Communication and Computing (ICRCC).
[8] Abdoreza Pishvaie, Ghassem Jaberipur, & Ali Jahanian (2013). Redesigned CMOS (4; 2) compressor for fast binary multipliers. Canadian Journal of Electrical and Computer Engineering, Volume 36, Issue 3.
[9] Veeramachaneni S, Krishna M K, Avinash L, Puppala S R, & M.B. Srinivas (2007). Novel Architectures for High-Speed and Low-Power 3-2, 4-2 and 5-2 Compressors. In the proceedings of 20th International Conference on VLSI Design held jointly with 6th International Conference on Embedded Systems.
[10] Chong, K.S., Gwee, B.H., & Chang, J. S. (2005). A micropower low-voltage multiplier with reduced spurious switching. IEEE Transactions on VLSI Systems, 13, 255-265.
[11] Mosch, P., van Oerle, G., Menzl, S., Rougnon-Glasson, N., van Nieuwenhove, K., & Wezelenburg, M. (2000). A 660-µW 50-Mops 1-V DSP for a hearing aid chip set. IEEE Journal of Solid-State Circuits, 35(11), 1705-1712.
[12] Alioto, M., & Palumbo, G. (2002). Analysis and comparison on full adder block in submicron technology. IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[13] Flavio Carbognani, Felix Buergin, Norbert Felber, Hubert Kaeslin, & Wolfgang Fichtner (2008). A low-power transmission-gate-based 16-bit multiplier for digital hearing aids. Analog Integr Circ Sig Process, 56: 5-12.
[14] Wenxin Wang, Shawki Areib, & Mohab Anis (2007). Modeling leakage power reduction in VLSI as optimization problems. Optim Eng (2007) 8: 129-162.
[15] C. S. Wallace, "A Suggestion for a Fast Multiplier," IEEE Trans. Electron. Comput., vol. EC-13, 1964, pp. 14-17.
[16] Shiann-Rong Kuang, Jiun-Ping Wang, & Cang-Yuan Guo (2009). Modified Booth Multipliers With a Regular Partial Product Array. IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 56, no. 5.
[17] Kelly Liew Suet Swee, & Lo Hai Hiung (2012). Performance Comparison Review of 32-Bit Multiplier Designs. In the proceedings of 4th International Conference on Intelligent and Advanced Systems (ICIAS2012).
[18] T. Arunachalam, & S. Kirubaveni (2013). Analysis of High Speed Multipliers. In the proceedings of International conference on Communication and Signal Processing, April 3-5, 2013, India.
[19] Chip-Hong Chang, Jiangmin Gu, & Mingyan Zhang (2004). Ultra Low-Voltage Low-Power CMOS 4-2 and 5-2 Compressors for Fast Arithmetic Circuits. IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 51, no. 10.
[20] Peiman Aliparast, Ziaddin Daie Koozehkanani, Abdolhamid Moallemi Khiavi, Ghader Karimian, & Hossein Balazadeh Bahar (2011). A very high-speed CMOS 4-2 compressor using fully differential current-mode circuit techniques. Analog Integr Circ Sig Process, 66: 235-243.
[21] Shuli Gao, Al-Khalili, D., & Chabini, N. (2009). Implementation of large size multipliers using ternary adders and higher order compressors. International Conference on Microelectronics (ICM).
[22] H. El-Razouk, & Z. Abid (2006). Area and Power Efficient Array and Tree Multipliers. IEEE CCECE/CCGEI, Ottawa.