 Open Access
 Total Downloads : 100
 Authors : Sureshbabu J
 Paper ID : IJERTV8IS040262
 Volume & Issue : Volume 08, Issue 04 (April – 2019)
 Published (First Online): 20042019
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
A Reduced Complexity Wallace Multiplier using a Novel Power Efficient Reversible Peres Logic Gate Adders
Sureshbabu J
Research Scholar (Part time) Anna University Chennai, India
Abstract Multipliers play a vital rule in most of the digital systems currently implemented. Multiplier is a major hardware block in all the current microprocessor, Digital image processors etc., Whit advancement in the VLSI technology which strives to reduce the die size and extend the life of the power source new methodologies are introduced. These multipliers should have the ability to occupy less area handle bulk data width with less processing time and to consume less power in all. However reduction in speed will in turn increases the area of the die. We had in this paper opted for a design which shall be optimum for both the factors. To supress the Delay, Area and also the power dissipation, with an increased fan out capabilities, reversible Peres logic gates are employed to implement the Full adders that are implemented in the Reduced Complexity Wallace Multipliers. The existing Reduced Complexity Wallace multipliers are implemented with only Full adders and half adders, only (8%), are used in the reduction to merely reduce the stages of reduction. As the FAs are higher we had proposed to a methodology for the reduction of the Delay and the consumption of the power by implementing the Full Adders by Peres logic gates. The paper concentrates on the PP reduction stages and the Carry Propagate adder is used as the final adder, similar to that used in RC Wallace Multiplier. This method has a reduction of power consumption by 89%, propagation delay by 91.67% and the area by 4.29% than the existing RC Wallace Multiplier.
Keywords High speed multiplier, Peres full adder, Peres logic gates, Reduced complexity Reversible logic gates, Wallace Multiplier.

INTRODUCTION
Multipliers are one of the most significant mechanisms implemented in several applications. Multipliers have a crucial role in the very high speed image processing fast Fourier transform (FFT), digital signal processing (DSP) etc. An FFT requires a large number of Multiplications.Dedicated MAC and ALU architectures unit (Arithmetic and Logic Unit) are needed for the carrying out these algorithms. These signal processing applications consume substantial amount of energy and demand greater computational capabilities also. As the scale of integrations grows, it insists for further sophisticated processing systems to be implemented on a VLSI chip, which consumes higher energy mean while demanding for a superior computational capacity. The steady growth of operating frequency and processing capability per chip deliveries more heat in the IC chip which in turn consumes more energy. It
also requires an efficient cooling technique. The next constrain is to extend the life of the power sources in the current portable devices. Addition and multiplication are the essential operations in most of the devices. Many researches are done for the design of high efficient multipliers, which consumes less power and area and in the meantime operate at high speed. Wallace multiplier falls under such an ultra high speed multiplier [2].
The complete progression of multiplication is carried out in three steps: The first is formation of partial product (PP), rearranging the partial products generated into asset easy for processing and the final a stage ripple carry adders. Generally N2 partial products are to be summed for an N bit to N bit multiplication in the Wallace multiplier. The initial process is to group the entire partial product rows into sub groups of 3 rows each. In the standard Wallace reduction method CSA are implemented which utilizes the Full adders and Half Adders to condense the entire N row of PP matrix to a two row matrix. This stage is added through an m bit CPA to generate the final multiplication result. The FAs and HAs used in the CSA are feed with PPs of the adjacent three rows and the carry from the previous set of adder is not feed to this FAs and HAs. Thus it reduces the three rows in to two rows of carry and sum with a higher bit level. These carry and sum are again processed in the next stage with the next row from the PPs. The columns with three input are added by a FA and the columns with two PPs in the column of the sub grouped rows are processed by a HA in the standard Wallace multiplier. The number of Half adders utilzed in the Wallace multiplier are reduced to almost 83% in the modified Wallace method.
A modified Wallace method is realized for the partial product reduction [1]. The modified Wallace reduction method splits the partial product matrix into a sub group of three rows and full adders are used for each group of three bits in a column. When the particular column has only one or two bits, it is merely passed over to the next stage. Half adders are implemented in this method only to reduce the number of stages and maintain that of conventional Wallace; mostly utilisations of half adders are done at the final stages only. The 16 bit modified Wallace multiplier utilizes 201 Full adders and merely 9 Half adders. From this it could be understood that the whole operation of the PP reduction is entirely depended on the Full adders and any design in the FA which
could reduce the area and delay will have a key factor in the multipliers operation. In this paper we had designed a Full adder using a Reversible logic gates, which consumes very less area and even the dissipation could be made to zero [3], instead of the conventional Full adders. We utilise a Full adder configured by Peres logic gates.
Fig. 1. Modified Wallace Reduction

REVERSIBLE LOGIC GATES
The performance of the system is limited with the large amount of heat that is released in the chips with high processing speed and data length and the research are done to overcome these restrictions. The practical limitations for the performance improvement of the high performance chip are the excess heat released by the system. The reversible computing methodologies will be able address the solutions for this heat dissipation in a chip. It conserves the information by the process of noncomputing the input bits rather than
throwing them away. The energy efficiency could also be attained by the reversible computing techniques. This would make the design of the circuit elements to reduce their sizes to the maximum extent which makes it to be least complex in area. This technique makes it appropriate for the compact portable devices with less heat dissipation. The developments in the VLSI technologies for a more compact die area will certainly need a high design cost which shall be dominated by the efficiency in power and heat dissipation
Reversible logic design is significantly different from the conventional Gates. Most of the conventional gates are irreversible in nature. Reversible logic gates are devices with equal input and output terminals with a mapping of oneto one. The reversible gates are unique as the values of the inputs could be regenerated from the outputs. These features demand for equal number of input and output as it supports for recovery of the inputs. Direct fan out is allowed in the synthesis of the reversible gates however the fan out of the reversible gates are achieved through additional gates. The major constrain in these type of gates are the garbage outputs, it refers to the unused outputs generated in a reversible logic circuit. This could not be avoided completely as these are needed to achieve the reversibility of the gates and the constant inputs, which is maintained at either logi 0or 1 in order to synthesis the logical function. While design of these gates the number of the garbage and the constant output should be as minimum as possible which avoids the unneeded output or inputs. The major challenges faced in current circuit design lies in reducing the number of gates required, delay and the quantum cost. The reversible gates quantum cost is the cost of the primitive gate. It is premeditated by considering the primitive reversible logic gates to be realized. The reversible functions are only utilized by these gates. But most of the Boolean functions are not reversible, so these functions are to be transformed to reversible one by introducing additional inputs which are maintained at high or low called the constant inputs.

Peres logic gate
Peres gate is a 3Ã—3 reversible circuit with a onetoone mapping. The quantum cost of Peres gate is 4 in the traditionalbased design [6]. The corresponding Quantum logic symbol of Peres is shown in fig. 3 with cell inputs
{A,B,C}, and cell outputs {P,Q, R}. It is achieved by employing arranged and interacted QCA cells to perform logical AND and XOR functions [29]. It is clear in this design that it is built using two XOR structures in addition to single AND structure, whose output is connected directly into one of the inputs of the second XOR structure. As can be deduced from this implementation, the output Q is assessed by the reverse value of input B when input A = 1 and otherwise, Q =

On the other hand, the output R is evaluated by the reverse value of input C when result of AB = 1 and otherwise, R = C. The logical operation of proposed QCA cellbased Peres gate is shown in Table 1. It is worth mentioning that the proposed gate could be implemented in a single layer without using any rotated cell or translated cell [8].
P=A. (1)
Q=AB (2)
R = AB C (3)
requires merely one clock cycle for its execution. Since the quantum cost of the FA is less it is widely suitable to be implemented in the present CMOS technologies. This adder has merely one constant input kept at logic 0 and two garbage outputs.
(4)
(5)

P

Q

R

Fig. 2. Quantum logic symbol of Peres
TABLE I. TRUTH TABLE OF THE PERES LOGIC GATE.
Inputs
Outputs
A
B
C
P
Q
R
0
0
0
0
0
0
0
0
1
0
0
1
0
1
0
0
1
0
0
1
1
0
1
1
1
0
0
1
1
0
1
0
1
1
1
1
1
1
0
1
0
1
1
1
1
1
0
0
B. Peres logic Full Adder
An adder is a logic circuit which adds two or more variables to provide the sum and consecutively the carries depending on the inputs taken into consideration. Adders are an unavoidable part of logic circuits as they are vital in most applications they are not only used for addition but also to multiply, calculate the addresses, increment operations, table indices, etc. Most common adders operate on binary numbers although they can be constructed for BCD, excess 3 formats, etc.
Design of an effective adder with less delay, area and least power dissipation will certainly reduce the overall performance of many circuits. Here two 3 x3 Peres logic gates are used to implement a Full Adder. This implementation of FA with the reversible gates is efficient in term of the gate count, hardware complexity, garbage output and constant input in comparable to the other reversible gates that could be designed. The Quantum cost of these type of adders are 8, since two Peres gate are needed for the adders to be formulated. The Peres Full adder is a 4×4 gate which has four inputs and outputs. The hardware complexity is low and
Fig. 3. Full Adders using Two Peres Gate
Before proceeding to the reduction process the maximum number of stages in the reduction could be derived. In the reduction of the modified Wallace multiplier shown here without the half adder in the first stage, the second matrix would have an excess stage to that of the conventional Wallace. The stages requiring HAs are arrived by the method, has said previously it requires HAs in few intermediate stages and in the final stage.


PROPOSED WALLACE REDUCTION
The method projected in this paper reduces the time delay of the multiplier with high fan out capabilities. As the number of bits is higher in the multiplier, curbing of delay and the area are to be restricted to the minimum level to make the circuit portable and efficient enough to operate for a longer duration with the available power source. We had designed the FAs used in the Wallace tree reduction with the dual Peres logic gate. The multiplicand and multiplier are multiplied with a conventional AND operation to produce the partial product matrix. The matrix thus generated would have 16 rows with 16 elements in each row. To make the reduction procedure of the matrix simple and efficient the same methodology adopted in the modified method of Wallace reduction is used in this approach also. Rearranging the rows are done in such a manner that all the elements in the column are pushed upward to the first row, thus the entire matrix now has maximum elements in the first row and gradually reduced elements in the adjacent rows and the final row has one element. Now the PPs are processed with the FAs and HAs. The rows are split up into levels of three rows each and added with FAs were there are three PPs in the column and those columns with one and two PPs are transferred to the next stage. The half adders are used rarely in the initial stages or in intermediate stages to ensure the number of stages is not exceeded than the
conventional one. Mostly the half adders are used only in the final stage of reduction.
Prior to working through the reduction procedure, the maximum number of rows, ri, in ith stage of the reduction can be derived. The number of rows in the succeeding stages of the Wallace multiplier is given by
r i+1 = 2[ ri /3] + ri mod 3 (6)
Where, y mod z denotes the smallest nonnegative remainder of y/z. For a 16bit by 16bit multiplier, the stage heights given by (4) are: r0 = N = 16, r1 = 11, r2 = 8, r3 = 6, r4 = 4, r5 = 3, r6 =2 . Therefore, seven stages are required for the reduction including the final CLA adder. Thus it requires 6 stages of reduction. As perceived in the reduction of the 16 bit modified Wallace multiplier shown in the Figure.1, one Half adder is required in fourth stage and two in the fifth stage to maintain the rows in the corresponding stages as concluded by the above equation
All FAs in the PP reduction are executed using the Peres logic gate. The configuration of these Peres logic gates are simple and the output from them are feed to the next stages with ease since the fan out capabilities of the reversible gates are high than the conventional Adders. The additional Garbage outputs are left out. The constant input needed is also logic low which further reduces the dissipation of power since it could be logically grounded. Since the Full adders are implemented as one logic were the inter linking of the Peres logic is internally done. The CI are inter connected from the precding gate so the device has one CI only and the GO are from each Peres logic is untouched.
The final addition of the multiplier, were it has two rows of data after the reduction process are completed in the reduced modified Wallace multiplier is done by Carry propagate adder or a Carry look ahead adder. Carry Propagate adder ( CPA) performs addition by propagating a carry from each bit to higher bit position. These adders dont occupy a considerable area of chip and it also consumes very less power. It adds two n bit operands and an optional carryin (cin) by performing carry propagation. This could be realized by a combinational circuit using m bit ripple carry adders ( RCA ). Here it uses a 2N2 bit adder. The computational time grows linearly with the operand word length N. The speed up of the operation of CPA would require replacement by some faster adder structures.

RESULT ANALYSIS
The second phase reductions of multipliers for various sizes were analyzed using the modified Wallace approaches with Conventional Adders and the Peres logic Adders. Designs were made for each size with generators programmed in VHDL for the XILINX 14.2 ISE and the output were implemented in Spartan6 FPGA family
TABLE II. COMPLEXITY OF THE WALLACE MULTIPLIERS.
INPUT SIZE (N)
8
16
32
STAGES
4
6
8
WALLACE
FULL ADDER
38
200
906
HALF ADDER
15
52
156
TOTAL GATES
402
2008
8778
MODIFIED WALLACE
FULL ADDER
39
201
907
HALF ADDER
3
9
23
TOTAL GATES
363
1845
8263
TABLE III. PARAMETER ANALYSIS
Power in mw
Delay in nsec
LUTS
Modified Wallace
1628.134
37.416
420
Modified Wallace with Peres gate Adders
135.678
3.996
402
Fig. 4. Power comparison
Fig. 5. Delay comparison
Fig. 6. LUTs utilized
For the analysis done the no of full adders used was
0.2 percentages higher than the conventional Wallace and the Half adders were almost 83% lower than the conventional Wallace. The complexity and the power dissipation were compared between the modified Wallace with that of the conventional adders and the Adder implemented through the reverse logic Peres gates. It was established that the Complexity of the Peres logic implemented Wallace was merely 4.28% lesser than with the conventional adders. The power dissipation was 3.996 ns, which is almost 89.32% lesser. The delay was reduced by 91.67%. The same is tabulated in table 2 and the corresponding charts are displayed in the figures.

CONCLUSION
This paper presents a modification to the Modified Wallace multiplier to be implemented with the adders designed with Peres Logic gates. The modified Wallace with the Peres logic Adders are compared with the Wallace multiplier with conventional Adders. Here both the modified and the proposed multiplier has the same number of HA and FAs. The analysis is done on the time delay, area and the power dissipation. Based on the result analysis the proposed multiplier with Peres logic gate has very low delay as the addition of three bits are done in two clock cycles. The simulation result shows that the overall delay of the multiplier is reduced by 89.32% and the speed is increased by 91.67%. It is found that the Peres logic Adders needs one Constant input which is wired to logic zero so no major loss or additional heat is generated for this additional input added to the circuit. Similarly the Adder has two Garbage outputs which are due to the additional internal operation of the adder; as such this additional Garbage output which is not utilized for the adder has no critical impact on the complexity of the multiplier. The complexity of this proposed method has very meager reduction in the area which is merely 4.28%. It is found that the adder has a quantum cost of 8 which makes it to be easily implementable in the latest CMOS designs with very low power dissipation and heat generation. It is significant that the modified Wallace multipliers third phase could be designed with a hybrid adder of CLA and RCA. This could reduce the
complexity and maintains the delay of the multiplier than the conventional RCA or CLA adders as the final adders.
REFERENCE

Ron S. Waters, and Earl E. Swartzlander, ( 2010 ) A Reduced Complexity Wallace Multiplier Reduction, IEEE Transaction on computers, Vol. 59, no.8, pp 11341137.

Wallace.C.S., (1964 ), A Suggestion for a Fast Multiplier, IEEE Trans. Electronic Computers, Vol. 13, no. 1, pp. 1417.

Sudhir Dakey (2015) Design And Analysis Of A Full Adder using Various Reversible Gates , International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 02, Issue 08

H.R.Bhagyalakshmi.(2010), An Improved Design Of A Multiplier Using Reversible Logic Gates, International Journal of Engineering Science and Technology Vol. 2(8), 2010, 38383845

Shahebaj Khan, Sandeep Kakde, Yogesh Suryawanshi.(2013) , VLSI Implementation of Reduced Complexity Wallace Multiplier Using Energy Efficient CMOS Full Adder, IEEE International Conference on Computational Intelligence and Computing Research

Abutaleb, M. M. (2018). Robust and efficient QCA cellbased nanostructures of elementary reversible logic gates. The Journal of Supercomputing, 74(11), 62586274.

Moharrami, E., &Navimipour, N. J. (2018). Designing nanoscale counter using reversible gate based on quantumdot cellular automata. International Journal of Theoretical Physics, 57(4), 1060 1081.

Montaser, R., Younes, A., & AbdelAty, M. (2019). New Design of Reversible Full Adder/Subtractor using R gate. International Journal of Theoretical Physics, 58(1), 167183.

V.G.Oklobdzija, D Villeger, S.S Liu, A method for speed optimized partial product reduction and generation of fast parallel multipliers using an algorithmic approach, IEEE Transactions on Computers,
Vol.45, pp.294 306, 1996

Padma Devi, Ashima Girdher and Balwinder Singh, Improved Carry Select Adder with Reduced Area and Low Power Consumption, International Journal of Computer Application, Vol 3.No.4, June 2010 .

Deepa Sinha, Tripti Sharma, k.G.Sharma, Prof.B.P.Singh, Design and Analysis of low Power 1bit Full Adder Cell,IEEE, 2011.

Shahzad Asif; Yinan Kong (2015), Analysis of different architectures of counter based Wallace multipliers, Tenth International Conference on Computer Engineering & Systems (ICCES) Pages: 139 144

Liangyu Qian; Chenghua Wang; Weiqiang Liu; Fabrizio Lombardi; Jie Han(2016), Design and evaluation of an approximate Wallace Booth multiplier, IEEE International Symposium on Circuits and Systems (ISCAS), Pages: 1974 – 1977

Sandeep Kakde; Shahebaj Khan; Pravin Dakhole; Shailendra Badwaik (2015), Design of area and power aware reduced Complexity Wallace Treemultiplier, International Conference on Pervasive Computing (ICPC) Pages: 1 6

Damarla Paradhasaradhi; M. Prashanthi; N. Vivek (2014 ), Modified wallace tree multiplier using efficient square root carry select adder, International Conference on Green Computing Communication and Electrical Engineering, Pages: 1 5