Glitch-Optimized Hybrid Booth Multiplier With Reduced Delay Asymmetry

doi:https://doi.org/10.5281/zenodo.18073171

Volume 14, Issue 12 (December 2025)


Open Access
Authors : Dr. L. Thulasimani, G. Praveena, S. Balaganesh, Mugaleti Saikrithik, S.Premchand
Paper ID : IJERTV14IS120349
Volume & Issue : Volume 14, Issue 12 , December – 2025
DOI : 10.17577/IJERTV14IS120349
Published (First Online): 23-12-2025
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License


Dr. L. Thulasimani, G. Praveena, S. Balaganesh, Mugaleti Saikrithik, S.Premchand

Electronics and Communication Engineering, PSG College of Technology, Coimbatore, Tamil Nadu.

Abstract– Multipliers are vital components of digital circuits, ranging from embedded system-on-chip (SoC) cores to GPU-based accelerators and hearing aids. With the increasing demand for battery-operated devices, power efficiency is a primary design goal. However, increased capacitive loads due to complex combinational blocks and spurious transitions due to unbalanced signal convergence raise the power consumption of the multiplier. In the Booth multiplier proposed by Ranasinghe and Gerez [1], a 6T XOR-XNOR cell is used in the encoder-decoder circuit and in the adder tree. This cell does not produce the XOR and XNOR outputs simultaneously, and delay asymmetry occurs when all inputs undergo a 0 -> 1 transition. The subsequent stages of the adder tree are therefore susceptible to larger deviations in delay due to the lack of synchronization, which increases the spurious activity and could ultimately produce erroneous outputs. To rectify this issue, the 6T XOR-XNOR cell, which requires an external inverter, is replaced with a cross-coupled 10T XOR-XNOR cell (capable of internal inversion) that has reduced delay asymmetry and a lower power-delay product. This in turn reduces the glitches that occur in the circuit. In the partial product reduction stage, an interconnect scheme is proposed that reduces the sum-carry asymmetry of the Wallace reduction tree proposed in [1]. Both improvements are implemented in the proposed Booth multiplier.

Index Terms– Radix-4 Modified Booth Encoding, Glitches, Spurious activity, Optimized Wallace adder tree

Introduction

Nowadays, portable electronic devices such as cellular phones, laptops and notebooks are used in various applications. To obtain the best performance from these electronic systems, they are optimized for small size, high speed and energy efficiency. These systems consist largely of arithmetic circuits, and the multiplier is a fundamental component of all of them. The arithmetic circuits used in the data paths consume about one-third of the power in microcontrollers and digital signal processors. Enhancing the performance of the multiplier therefore improves the performance of the whole system considerably. Previously proposed high-performance multipliers suffer from increased load capacitance and glitchy outputs, caused by their complex combinational circuits and the unbalanced arrival times of their inputs, which makes the multiplier a leading source of power dissipation. This spurious switching activity accounts for up to 40% of the overall power consumption of the multiplier if the input signals are left unsynchronized. The goal is to reduce the spurious activity of the Radix-4 Booth multiplier and to improve the synchronization between sum and carry in the adder tree, thereby reducing the power dissipation without any trade-off in delay.

Various researchers have presented multiplier and XOR-XNOR cell designs that aim to provide optimum performance. S. Venkatachalam et al. [2] presented a study on approximate Booth multipliers. Three approximate Booth multipliers (ABM-M1, ABM-M2 and ABM-M3) are proposed, in which approximation is applied to the radix-4 modified Booth algorithm. An exact radix-4 partial product generator requires all three signals neg_i, two_i and zero_i to generate the partial product. For ABM-M1 and ABM-M2, an approximate partial product generator is designed using two of the three signals, namely neg_i and two_i. In ABM-M3, the partial product generator uses only zero_i. Compared with other approximate Booth multipliers, the proposed multipliers exhibit better accuracy and a better area-power product, with ABM-M2 having the lowest area-power product. However, all three designs have an associated error.

H. Waris et al. [3] have proposed two approximate radix-8 Booth multipliers (AxBM1 and AxBM2). The approximation is implemented using a 6-input lookup table (LUT). In AxE1, the ±3C multiplicands are approximated to ±4C. In AxE2, +3C is approximated to 4C and -3C is approximated to 2C. AxBM2 shows 49% and 26% improvement in delay and PDP compared to previous approximate Booth designs. Though its hardware metrics are similar to those of an exact Booth multiplier, it offers an error-energy trade-off.

Y. J. Chang et al. [4] have presented a radix-4 Booth multiplier with a pre-encoded mechanism to reduce the power consumption of the Booth multiplier. The design reduces power by disabling the Booth encoders and decoders when they are not required. The 0C case is pre-processed by the pre-encoder so that the encoder and decoder only need to process the remaining cases (±1C and ±2C). Compared to the ABM-M1 design [2], the delay reduction of this multiplier is 18.7%. However, the pre-encoder circuitry increases the overall parasitic capacitance and area overhead.

S.R. Kuang et al. [5] have proposed radix-4 Booth multipliers with an improved partial product array. The array is generated with fewer partial product rows (4 instead of 5) by introducing additional signals, such as tau and epsilon, generated from the multiplier and multiplicand. The neg bit is removed, thereby reducing the number of rows in the array. The proposed 8-bit design achieves 18.5%, 4.6% and 12.0% reductions in power, delay and area compared with multipliers based on a conventional partial product array.

K.-S. Chong et al. [6] have proposed a low-energy 16-bit Booth leapfrog array multiplier using dynamic adders. Spurious switching increases when input signals arrive at the adders at different times. The array multiplier requires less area but suffers from glitches due to this lack of synchronization. The leapfrog structure improves synchronization by making the sum skip one row, since it is produced later than the carry. A dynamic adder has also been proposed to synchronize the inputs to the adder. Compared with array multiplier designs and static leapfrog designs, the proposed multiplier reduces the energy-delay product (EDP) by up to 34%. It also has a 33% smaller area than the dynamic array multiplier.

H. Naseri and S. Timarchi [7] have proposed a high-speed, low-power full adder by exploring different XOR and XNOR gates. These XOR/XNOR cells have full swing and produce the XOR and XNOR outputs simultaneously. Using the best cell, six different hybrid full-adder (FA) circuits are implemented, and transistor sizing is used to optimize their PDP further. The proposed 12T XOR-XNOR cell offers 16.2% to 85% improvement in PDP. The proposed full adder using transmission gates and a 2-1 MUX improves PDP and EDP by up to 23.4% and 43.5% respectively. Furthermore, this cell has better stability over a voltage range of 0.65-1.5 V in comparison with other FA cells.

K.-S. Chong et al. [8] have proposed a 16-bit multiplier for power-critical applications such as hearing aids. The micropower consumption is achieved by reducing the spurious switching in the reduction tree stage of the multiplier. In this design, latches are used to synchronize the inputs to the adders in a predetermined sequential arrangement. Though the proposed 16-bit design is 20% slower, it dissipates approximately 32% less power and has 20% better EDP than conventional 16×16-bit multipliers. The 32-bit design implemented in a similar manner dissipates 53% less power and has 39% better EDP than a conventional general multiplier, in spite of the reduction in speed.

Proposed Radix-4 Booth Multiplier

A Booth multiplier based on Radix-4 encoding can perform high-speed signed multiplication. The multiplier can be split into the Booth Encoder, which generates the Xi and 2Xi signals; the Partial Product Generator (PPG), which comprises the Decoder and other intermediary signal blocks essential for signed multiplication; the Partial Product Reduction tree, which adds the partial products in a particular fashion; and a final parallel adder. The Radix-4 encoding scheme halves the number of partial products to be added compared with the conventional Radix-2 encoding scheme, but the circuit suffers from delay asymmetry, which in turn causes spurious activity in the circuit. Unwanted transitions result when input signals change state and take different times to arrive at a common point in the circuit. This can happen when two or more paths of different lengths converge on the same point. It can also be due to the delays of the gates that produce the required signals. These glitches increase with the number of stages in the circuit. They adversely affect the output and can thereby produce a wrong product, which is undesirable.

In the proposed 8-bit Radix-4 Booth Multiplier shown in Fig 1, 4 Booth encoders are used. Both the multiplier and the multiplicand are 8-bit signed numbers with an operand range from -128 to +127. The encoders take their inputs in a one-bit overlapping pattern. The PPG block consists of intermediate signals and 32 partial product terms generated by 32 decoders.

The partial products and intermediate signals required for signed multiplication are generated. This is explained in Section D. The decoders receive inputs from the encoders, the multiplicand and the multiplier. As a result, 40 bits are available, which have to be added to get the product.
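The one-bit overlapping grouping used by the four encoders can be illustrated at the arithmetic level. The following is a minimal behavioural sketch, not the transistor-level circuit of this paper: the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the digit set {-2, -1, 0, +1, +2} is the standard radix-4 modified Booth recoding, assumed here to correspond to the encoder outputs.

```python
def booth_radix4_digits(y, bits=8):
    """Radix-4 Booth digits (least significant first) of a signed `bits`-wide multiplier y."""
    assert bits % 2 == 0 and -(1 << (bits - 1)) <= y < (1 << (bits - 1))
    pattern = y & ((1 << bits) - 1)   # two's-complement bit pattern of y
    padded = pattern << 1             # append the implicit bit y[-1] = 0
    digits = []
    for i in range(bits // 2):
        # Each group holds y[2i+1], y[2i], y[2i-1]; neighbouring groups share one bit.
        group = (padded >> (2 * i)) & 0b111
        hi, mid, lo = (group >> 2) & 1, (group >> 1) & 1, group & 1
        digits.append(-2 * hi + mid + lo)
    return digits


def booth_multiply(x, y, bits=8):
    """Sum the digit-weighted multiples of x; one partial product per Booth digit."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_radix4_digits(y, bits)))
```

For an 8-bit multiplier this yields four digits, which matches the four encoders and four partial product rows described above; a radix-2 scheme would need eight.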

Fig. 1. Block Diagram of Proposed ALU

The remainder of this section is organized as follows: Section A discusses the 10T XOR-XNOR cell by Jyoti Kandpal et al., Section B discusses the modified Booth encoder, decoder and full adder, Section C discusses the Wallace reduction tree, Section D discusses the intermediary signals needed for signed multiplication and Section E discusses the final integration of the multiplier.
1. Cross-Coupled Hybrid XOR-XNOR Cell
  
  A 10T XOR-XNOR cell has been proposed by Jyoti Kandpal et al. using complementary pass-transistor logic (CPL) and cross coupling. This cell has internal inversion capability and can produce both outputs simultaneously. It uses two pMOS (P1 and P2) and three nMOS (N3, N4, and N5) transistors on the XOR output side and two nMOS (N1 and N2) and three pMOS (P3, P4, and P5) transistors on the XNOR output side. The circuit schematic is shown in Fig 2.
  - At XOR side: P1 and P2 are connected in parallel as pass transistors, N4 and N5 act as restorer transistors to provide a full-swing output and N3 acts as a feedback transistor.
  - At XNOR side: N1 and N2 are connected in parallel as pass transistors, P4 and P5 act as restorers to provide a full-swing output and P3 acts as a feedback transistor.

  This circuit provides full-swing XOR and XNOR outputs simultaneously using the feedback circuitry, without the need for an external inverter.
    
    Fig. 2. Cross-Coupled 10T XOR-XNOR Cell
2. Glitch Optimized Booth Encoder, Decoder and Full Adder
  
  The cross-coupled XOR-XNOR cell from Section A has been used in the Booth encoder-decoder (BED) circuits in order to reduce the delay asymmetry, thereby reducing glitches and improving the PDP.

  Fig 3 shows the proposed Booth encoder circuit, which does not need any external inverter in the Xi path, as opposed to the additional inverter used by Ranasinghe and Gerez [1]. The improvement in performance is discussed in the Results and Analysis section.

  Fig. 3. Proposed Internal-inversion based Booth Encoder

  When this glitch-optimized encoder is used with the decoder circuit shown in Fig 4, the convergence of signals is improved, so glitches are reduced and hence the PDP is reduced.

  Fig. 4. Modified Booth Decoder Circuit

  The general structure of a hybrid full adder with three modules is shown in Fig 5. The modules are interconnected to produce the Sum and Carry.

  Fig. 5. Hybrid Full adder

  Module I: The XOR and XNOR outputs are produced simultaneously from the A and B inputs.

  Module II: Uses the XOR output of Module I and the third input, CIN, to produce SUM.

  Module III: Uses the XNOR output of Module I and the third input, CIN, to produce CARRY.

  The proposed hybrid full adder is designed using the glitch-optimized 10T XOR-XNOR cell for Module I and transmission gates for Modules II and III. It requires 28 transistors. Building Modules II and III with transmission-gate (TG) based logic allows faster operation of the circuit.
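The three-module structure can be checked at the logic level. The sketch below is a behavioural model only, with hypothetical function names; in particular, realising Module III as a 2:1 multiplexer that selects between A and CIN under control of the XNOR signal is an assumption (a common choice in hybrid full adders), not a detail confirmed by the text above.

```python
from itertools import product


def module1(a, b):
    """Module I: produce XOR and XNOR of A and B together."""
    xor_ab = a ^ b
    return xor_ab, xor_ab ^ 1


def module2_sum(xor_ab, cin):
    """Module II: SUM from the XOR output and CIN."""
    return xor_ab ^ cin


def module3_carry(xnor_ab, a, cin):
    """Module III (assumed 2:1 mux): if A == B the carry is A, otherwise it is CIN."""
    return a if xnor_ab else cin


def hybrid_full_adder(a, b, cin):
    xor_ab, xnor_ab = module1(a, b)
    return module2_sum(xor_ab, cin), module3_carry(xnor_ab, a, cin)


# Exhaustive check against integer addition for all eight input combinations.
for a, b, cin in product((0, 1), repeat=3):
    s, c = hybrid_full_adder(a, b, cin)
    assert s + 2 * c == a + b + cin
```

Note that A and B pass through Module I before reaching Modules II and III, whereas CIN enters Modules II and III directly; this difference in path length is what the interconnect scheme of the reduction tree exploits.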
3. Wallace Reduction Tree
  
  The Wallace reduction tree uses full adders and half adders to group the partial product terms in stages. The partial product array of 4 rows is reduced to 2 rows in two consecutive stages. Between the stages, product bits 0-2 are obtained as outputs. The stage-wise reduction of the partial product terms is indicated by the dot diagram in Fig 6. The 40 bits are reduced to 34 bits, which are further reduced to 2 rows of 12 bits each that are added with a parallel adder.

  The leapfrog adder tree improves the sum-carry synchronisation by making the sum output skip one stage. A similar interconnect scheme can be adopted in the Wallace tree reduction for better delay. In the full adder, Module I generates an intermediate result which, along with the third input, Cin, produces the outputs. The Carry, which is produced faster than the Sum in the previous stage, is therefore used first (in Module I) to produce a result that is then combined with the Sum (given to Modules II and III). In this way, the time spent waiting for the Sum of the previous stage to arrive at Module I is reduced. This scheme requires no additional circuitry and improves the synchronisation and PDP significantly.
  
  Fig. 6. Dot diagram of Wallace reduction tree
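The benefit of using the carry before the sum can be seen with a simple arrival-time model. This is a sketch under stated assumptions: the delays `d_m1` and `d_m23` and the arrival times are hypothetical values in arbitrary units, chosen only to illustrate the ordering, and are not measurements from this work.

```python
def fa_output_time(t_a, t_b, t_cin, d_m1=2.0, d_m23=1.0):
    """Output arrival time of a hybrid full adder (arbitrary units).

    A and B must pass through Module I (delay d_m1) before Modules II/III
    (delay d_m23); Cin enters Modules II/III directly.
    """
    return max(max(t_a, t_b) + d_m1, t_cin) + d_m23


# Hypothetical arrival times: the carry of the previous stage is ready before its sum.
t_carry, t_sum = 3.0, 4.0
t_other = 0.0  # a third operand that is available early

sum_first = fa_output_time(t_sum, t_other, t_carry)    # late sum drives Module I
carry_first = fa_output_time(t_carry, t_other, t_sum)  # early carry drives Module I

assert carry_first < sum_first
```

With these assumed numbers, routing the early carry into Module I lets the Module I delay overlap with the wait for the sum, so the stage finishes one unit earlier without any extra circuitry.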
4. Sign-Extension Bits and Other Intermediary Signals
The Booth multiplier makes use of various intermediary signals for sign extension and carry generation. Fig 7 shows the partial product array of the proposed Booth multiplier. It can be seen that, along with the partial products Pij, there are various other extension bits.

These sign extension bits are derived from the expressions shown in Fig 8.

Fig. 7. Partial Product Array

Fig. 8. Expressions for intermediary signals

Results And Analysis

Simulation and result analysis are conducted for different XOR-XNOR cells, full adders, Booth encoder-decoder circuits and partial product reduction trees. All the circuits are compared in terms of transistor count, power, delay, delay asymmetry and power-delay product. The Booth multiplier has been designed and tested for all critical conditions.

Simulation Result

Test Bench Circuit

To validate the circuit, a test bench consisting of buffers and load capacitances is used along with the Device Under Test (DUT). Test bench validation is important because the DUT will always be used in combination with other devices to build a larger system, and test benches based on static inverters are a good generalization of any operating scenario to be considered. The test bench circuit shown in Fig 9 is used for the performance analysis of the full adder and BED circuits. Because of the input buffers, the signals experience some degradation. Buffers are added at the output too. In addition to these output buffers, the DUT has a capacitive load connected to it. The capacitive load is equivalent to a fan-out of four CMOS inverters (FO4).

Fig. 9. Test Bench used for verification

Fig. 10. Voltage response of (i) Encoder and (ii) Full adder under Test bench

The test bench is exercised over a voltage range of 0.6-1.2 V with the full adder and decoder as the DUT. Input received from a small load is detected by the circuit and fanned out to a larger load. The glitches observed at the output buffer were the same as those seen under non-loaded conditions. From the analysis in Fig 10 (i) and Fig 10 (ii), it can be inferred that both circuits have good driving capability.

XOR-XNOR CELLS

XOR-XNOR cells of different transistor counts have been implemented and analysed in terms of delay asymmetry, power, propagation delay and hence the power-delay product (PDP). There are 2 critical paths: XOR and XNOR. For the calculation of PDP, the worst-case delay of the XOR and XNOR outputs is taken.

TABLE V

PERFORMANCE COMPARISON OF DIFFERENT XOR-XNOR CELLS

Architecture	Power (µW)	Delay (ns)	Delay Std. Dev., XOR (ns)	Delay Std. Dev., XNOR (ns)	PDP (fJ)
Transmission gates [1]	47.392	0.1122	0.0260	0.0935	5.319
CPL logic and feedback restorer transistors	52.021	0.1388	0.0590	0.0219	7.223
Double Pass Transistor Logic	46.079	0.1271	0.2090	0.2751	5.858
Cross-coupled, CPL based	81.913	0.1400	0.3393	0.0720	11.47
Full swing logic	37.7	0.1385	0.02856	0.07375	5.222
Cross-coupled, pass transistor logic	64.172	0.1040	0.04769	0.02177	6.678
Cross-coupled, Feedback loop	59.665	0.1607	0.07950	0.02527	9.588
Cross-coupled, CPL based (10T)	26.684	0.1682	0.01490	0.06431	4.489

The reduction in delay asymmetry reduces the spurious activity in the overall circuit, and this reduction in glitches lowers the overall power consumption. For a complex multiplier such as the Modified Booth multiplier, reducing glitches can greatly reduce the power consumption and also reduces the possibility of erroneous outputs. From Table V, it is seen that the cross-coupled, CPL based 10T cell offers better delay asymmetry than the design proposed in [1]. It also provides a 29.22% improvement in power over the lowest-power alternative (the full swing logic cell) and a 15.61% improvement in PDP over the transmission-gate design of [1].
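The PDP column and the quoted improvements can be recomputed from the tabulated power and delay values. The sketch below copies the numbers from the XOR-XNOR comparison table; the helper names are hypothetical, and treating the last table row as the 10T cell is an assumption based on the text identifying the 10T cell as the lowest-power design.

```python
# (architecture, power in µW, worst-case delay in ns), in table row order
XOR_XNOR_CELLS = [
    ("Transmission gates [1]", 47.392, 0.1122),
    ("CPL logic and feedback restorer transistors", 52.021, 0.1388),
    ("Double pass transistor logic", 46.079, 0.1271),
    ("Cross-coupled, CPL based (first row)", 81.913, 0.1400),
    ("Full swing logic", 37.7, 0.1385),
    ("Cross-coupled, pass transistor logic", 64.172, 0.1040),
    ("Cross-coupled, feedback loop", 59.665, 0.1607),
    ("Cross-coupled, CPL based (last row, assumed 10T)", 26.684, 0.1682),
]


def pdp_fj(power_uw, delay_ns):
    """Power-delay product: µW x ns = fJ."""
    return power_uw * delay_ns


def improvement_pct(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (reference - proposed) / reference


pdp = {name: pdp_fj(p, d) for name, p, d in XOR_XNOR_CELLS}
power = {name: p for name, p, _ in XOR_XNOR_CELLS}

best = "Cross-coupled, CPL based (last row, assumed 10T)"
lowest_other_power = min(p for name, p in power.items() if name != best)

power_gain = improvement_pct(lowest_other_power, power[best])
pdp_gain_vs_ref1 = improvement_pct(pdp["Transmission gates [1]"], pdp[best])
```

Because µW multiplied by ns gives fJ directly, the recomputed products agree with the PDP column to within rounding, which also confirms that the power column is in microwatts.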

Modified Booth Encoder and Decoder

Encoder and decoder cells of different transistor counts have been implemented and analysed in terms of power, propagation delay and hence the power-delay product (PDP).

From Table III, the BED18 proposed by Ranasinghe and Gerez in [1] offers at least 12.5% and 11.45% improvement in terms of delay and PDP over the other designs. The power consumption and the delay asymmetry can be further reduced with the help of low-power XOR/XNOR cells.

TABLE III

PERFORMANCE COMPARISON OF BOOTH CIRCUITS OF DIFFERENT ARCHITECTURES

Booth Circuit	Power (µW)	Delay (ns)	PDP (fJ)
BED24	145.1576	0.29157	42.3243
BED20	222.235	0.22792	50.65091
BED22	253.388	0.28083	71.00691
BED18 (Erroneous)	217.081	0.25298	54.91835
BED18	177.589	0.20258	37.9759

Full Adder Cells

Full adders of different transistor counts have been implemented and analysed with respect to power and propagation delay. There are 2 critical paths: SUM and CARRY. For the calculation of PDP, the worst-case delay of the sum and carry outputs is taken. Power analysis of the full adders under certain glitchy scenarios has also been carried out.

From Table IV, it is clear that RFL22 consumes the most power due to the many glitches in its output. CMOS28 consumes the least power; however, due to its high node capacitance, its efficiency reduces in cascaded stages.

HFA26 has very high delay asymmetry despite its faster operation. The full adder PBFA26 proposed in [1] by Ranasinghe and Gerez outperforms most of the other adders by at least 10.26% in propagation delay.

TABLE IV

PERFORMANCE COMPARISON OF DIFFERENT FULL ADDERS

Full Adder	Power (µW)	Propagation Delay (ns)	PDP (fJ)
RFL22	105.003	0.15088	15.7588
TFA22	75.235	0.21345	16.058
BFA22	76.8341	0.21322	16.3825
HFA26	64.935	0.21182	13.7542
CMOS28	57.237	0.16501	9.44465
PBFA26	83.186	0.1921	15.97

a. Glitch Analysis of Full Adder

The circuit is tested under the same scenarios as in the previous full adder analysis and is compared with those designs.

Fig. 11. Glitch analysis of Full adders

As shown in Fig 11, PBFA28 has better power efficiency in the above scenarios than the other adders. In cases 1 and 2, the power consumed is less than the peak power consumption, and hence the circuit is not affected by glitches in those cases. The proposed full adder therefore achieves the intended reduction in glitches through reduced delay asymmetry, with no loss in performance.

Adder Trees

Leapfrog reduction helps synchronize sum and carry better, thereby reducing delay. Though it requires more half adders, the reduced number of full adders lowers the power consumption. Array and leapfrog based trees have a greater number of stages, which increases their overall power consumption relative to the Wallace tree. The interconnect scheme used in the Wallace tree (carry before sum) improves the delay further. From the data in Table VI, the modification of the interconnect structure in the reduction tree stage improves the speed and PDP by 5.61% and 5.27% respectively.

TABLE VI

PERFORMANCE COMPARISON OF WALLACE, ARRAY AND LEAPFROG REDUCTION TREES

Reduction Tree	Interconnect Scheme	Delay	PDP
Wallace Reduction Tree	Without	13.221	191.942
Wallace Reduction Tree	With	12.480	181.8211

Final Multiplier

The final Booth multiplier, in which the PPG block is connected to the Wallace reduction tree, is shown in Fig 12.

Fig. 12. Cadence implementation of the proposed Booth multiplier

Fig. 13. Transient Analysis of Multiplier

The transient response of the proposed multiplier, tested for all corner cases, is given in Fig 13. The maximum power and delay readings for the same are shown in Table VII. Delay is calculated for the longest path, from B1 to P14.

TABLE VII

TRANSIENT ANALYSIS OF MULTIPLIER

From Table VII, it is seen that the final multiplier consumes a power of 2.8656 mW with a power-delay product of 2.1912 pJ.
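Since the PDP is the product of power and delay, the critical-path delay implied by these two figures can be recovered by division. This is only a consistency check on the quoted numbers; the variable names are illustrative.

```python
power_w = 2.8656e-3   # reported power of the final multiplier, in watts (2.8656 mW)
pdp_j = 2.1912e-12    # reported power-delay product, in joules (2.1912 pJ)

# PDP = power x delay, so delay = PDP / power
delay_ns = pdp_j / power_w * 1e9
```

The result is a delay of roughly 0.76 ns for the longest path.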

PROCESS CORNER ANALYSIS

Booth-Encoder Decoder

Process corner analysis takes multiple dimensions into account and puts a box around what will come out of the foundry. A two-letter designation is used to describe the different corners: FF (fast fast), SF (slow fast), SS (slow slow), FS (fast slow) and NN (nominal nominal). Over the 5 corner cases, the mean delay and power are found to be 205.278 ps and 114.23 µW respectively. These values are close to the typical values. Hence, the proposed BED22 has adequate design margin. The standard deviations of the delay for BED18 and the proposed BED22 are 27.069 ps and 28.144 ps respectively. The deviation among the five process corners is comparatively lesser for the proposed circuit. For BED22, the NN case gives a PDP of 24.0771 fJ. The highest difference is between SS and FF, which is 9.596 fJ (27.4335 - 17.837). The uneven corners give higher delay as well as power because of imbalanced switching of the transistors. The performance of the circuit depends on both the nMOS and pMOS transistors.

Proposed Full Adder

Over the 5 corner cases, the mean delay and power are 170.36 ps and 83.933 µW. These values are close to the typical values, as seen from Table 4.39. The standard deviations of the delay for PBFA26 and the proposed PBFA28 are 24.664 ps and 22.542 ps respectively. The asymmetry among the five process corners is comparatively lesser for the proposed circuit. When this full adder is used in the reduction tree, the asymmetry among all the full adders is reduced. For PBFA28, the NN case gives a PDP of 13.4041 fJ. The highest difference is between SS and FF, which is 5.6593 fJ (17.1283 - 11.469). Performance at the uneven corners (FS and SF), which could give unbalanced switching, is also improved.

Conclusion

A hybrid Booth multiplier was designed using a hybrid full adder and a Wallace reduction tree. The XOR-XNOR cell acts as the base circuit for the construction of the adders, encoders and decoders. Different architectures of XOR-XNOR cells were compared, and a cross-coupled 10T XOR-XNOR cell outperformed all the other cells in terms of PDP; the issues pertaining to asymmetry were also solved in this architecture. Compared with the existing BED18, the Booth encoder of the proposed BED22 offers a 7.716% and 71.87% reduction in power and PDP respectively. The Booth decoder provides a 24.947% and 24.41% reduction in power and PDP respectively. The hybrid full adder with the 10T XOR-XNOR cell offers 4.11%, 12.57% and 16.265% improvement in power, delay and PDP over the design in [1]. In the glitch analysis, the proposed PBFA28 showed improvements in three out of four scenarios. When tested for the 5 process corner cases, the deviation among the corners is comparatively lesser for the proposed circuits. In the reduction stage, an interconnect scheme is designed to synchronize Sum and Carry. This scheme improved the overall delay of the Booth multiplier because the waiting time for the sum is reduced. For the final addition, a ripple carry adder is used, which has a better PDP than the carry look-ahead adder. The final multiplier circuit was integrated and its PDP was analysed.

References

[1] A. C. Ranasinghe and S. H. Gerez, "Glitch-Optimized Circuit Blocks for Low-Power High-Performance Booth Multipliers," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 9, pp. 2028-2041, Sept. 2020, DOI: 10.1109/TVLSI.2020.3009239.
[2] S. Venkatachalam, E. Adams, H. J. Lee and S. Ko, "Design and Analysis of Area and Power Efficient Approximate Booth Multipliers," IEEE Transactions on Computers, vol. 68, no. 11, pp. 1697-1703, Nov. 2019.
[3] H. Waris, C. Wang, W. Liu and F. Lombardi, "AxBMs: Approximate Radix-8 Booth Multipliers for High-Performance FPGA-Based Accelerators," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 68, no. 5, pp. 1566-1570, May 2021.
[4] Y. J. Chang, Y. C. Cheng, S. C. Liao and C. H. Hsiao, "A Low Power Radix-4 Booth Multiplier With Pre-Encoded Mechanism," IEEE Access, vol. 8, pp. 114842-114853, 2020, DOI: 10.1109/ACCESS.2020.3003684.
[5] S. R. Kuang, J. P. Wang and C. Y. Guo, "Modified booth multipliers with a regular partial product array," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 5, pp. 404-408, May 2009.
[6] K. S. Chong, B. H. Gwee and J. S. Chang, "Low energy 16-bit Booth leapfrog array multiplier using dynamic adders," IET Circuits, Devices and Systems, vol. 1, no. 2, pp. 170-174, 2007.
[7] H. Naseri and S. Timarchi, "Low-Power and Fast Full Adder by Exploring New XOR and XNOR Gates," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 26, no. 8, pp. 1481-1493, 2018.
[8] K. S. Chong, B. H. Gwee and J. S. Chang, "A micropower low-voltage multiplier with reduced spurious switching," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 2, pp. 255-265, 2005.
[9] G. Goto et al., "A 4.1-ns compact 54×54-b multiplier utilizing sign-select Booth encoders," IEEE J. Solid-State Circuits, vol. 32, no. 11, pp. 1676-1682, Nov. 1997.
[10] W.-C. Yeh and C.-W. Jen, "High-speed booth encoded parallel multiplier design," IEEE Trans. Comput., vol. 49, no. 7, pp. 692-701, Jul. 2000.
[11] Z. Huang and M. D. Ercegovac, "High-performance low-power left-to-right array multiplier design," IEEE Trans. Comput., vol. 54, no. 3, pp. 272-283, Mar. 2005.

Dr. L. Thulasimani is currently an Assistant Professor in the Department of Electronics and Communication Engineering, PSG College of Technology. She completed her BE in ECE at Coimbatore Institute of Technology, Coimbatore in 1998 and her ME in Applied Electronics at Coimbatore Institute of Technology, Coimbatore in 2001. She received her PhD from Anna University, Chennai in 2012. Dr. L. Thulasimani is a Member of IEEE. She is also a prominent member of MISTE and MCSI. She has over 20 publications, of which 8 are in international journals and the others in international and national conferences. Her research areas include wireless communication, wireless security, RF systems and cognitive radio.

G.Praveena (18L137), Department of Electronics and Communication Engineering, PSG College of Technology. Her areas of interest include VLSI, EDA, FPGA.

S.Balaganesh (19L402), Department of Electronics and Communication Engineering, PSG College of Technology. His areas of interest include VLSI, EDA, FPGA. He is also a member of IETE.

Sai Krithik Mulagaletti (18L143), Department of Electronics and Communication Engineering, PSG College of Technology. His areas of interest include VLSI, EDA, FPGA.

S.R.Premchand (19L401), Department of Electronics and Communication Engineering, PSG College of Technology. His areas of interest include VLSI, EDA, FPGA.

