# Transistor Level Design And Analysis Of Vedic Algorithm Based Low Power MAC

DOI : 10.17577/IJERTV2IS2550

K. Ramakrishnan¹, T. Ravi², V. Kannan³

¹ M.Tech VLSI Design, Sathyabama University, Jeppiaar Nagar, Rajiv Gandhi Salai, Chennai 119.

² Assistant Professor, Sathyabama University, Jeppiaar Nagar, Rajiv Gandhi Salai, Chennai 119.

³ Principal, Jeppiaar Institute of Technology, Kunnam, Tamilnadu, India.

Abstract: Multiplier and accumulator (MAC) units find extensive application in digital signal processing and advanced processors. As processor speed increases, the speed of the datapath elements (multipliers, adders, subtractors) must also increase to sustain high-speed operation. This requirement drives the development of high-speed, low-power MAC units. In this paper we analyze a MAC unit based on the "Urdhva Tiryagbhyam" sutra. The analysis is carried out at the transistor level in HSPICE, using a 130 nm MOSFET model, and compares the transistor count and power consumption of the candidate designs.

Key Words: Multipliers, Vedic multipliers, Multipliers and accumulators, Transistor analysis.

1. INTRODUCTION

Multiplication is one of the most important arithmetic instructions of a processor. It takes two operands, the multiplicand and the multiplier, and produces their product. The general method multiplies the multiplicand by each digit of the multiplier to generate partial products, and then adds the partial products to obtain the result. In binary multiplication, adding the partial products generates both sum and carry bits. Processors multiply in binary, which is somewhat more involved than decimal multiplication. Earlier generations of processors provide an arithmetic instruction called MUL to multiply two numbers. The speed and word width of processors increase with the number of transistors on a die, and this in turn calls for a high-speed, low-power multiplication algorithm. The ancient Vedic sutras are best known for simplifying multiplication; using such techniques we can improve the speed and efficiency of the multiplier and thereby obtain an efficient MAC unit for future generations of processors.

In this paper, we perform a detailed transistor-level analysis of a MAC unit. Section 2 analyzes the XOR gate. Section 3 analyzes the Vedic multiplier. Section 4 analyzes the 8-bit MAC. Section 5 presents the results and discussion, followed by the conclusion. Section 6 lists the references used in this work.

2. PERFORMANCE ANALYSIS OF XOR GATE

The multiplication algorithm requires frequent addition of two or more binary numbers. Binary addition can be performed using XOR gates: the basic adder consists of an XOR gate and an AND gate, which produce the sum and the carry respectively. Our first analysis aims to reduce the number of transistors required to perform the SUM operation in an adder circuit. A typical XOR gate requires 22 transistors. In our analysis we have also used a 12T XOR gate and a 6T XOR gate to perform the SUM operation.

The 22T, 12T and 6T XOR gates differ in the number of transistors used in the design and in the power each design consumes to perform a single XOR operation. The truth table of the XOR gate is shown in Table 1 [12].

| A | B | A XOR B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

Table 1 XOR gate truth table
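The role of the XOR gate in the basic adder described above can be illustrated with a minimal behavioural sketch. The function name `half_adder` is hypothetical and is used only for illustration; the model works at the logic level and says nothing about the transistor-level implementations compared in this paper.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Logic-level half adder: sum from an XOR gate, carry from an AND gate."""
    sum_bit = a ^ b    # XOR gate produces the sum
    carry_bit = a & b  # AND gate produces the carry
    return sum_bit, carry_bit


if __name__ == "__main__":
    # Enumerate all input combinations; the sum column follows Table 1.
    for a in (0, 1):
        for b in (0, 1):
            s, c = half_adder(a, b)
            print(f"A={a} B={b} sum={s} carry={c}")
```

Because every sum bit in the multiplier passes through an XOR gate, the transistor count and power of the XOR cell scale directly into the adder and multiplier totals.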

22T CONVENTIONAL XOR GATE

The conventional XOR gate is built from two AND gates, two inverters and one OR gate, for a total of 22 transistors (2 × 6 + 6 + 2 × 2 = 22). The number of transistors required for each gate is given below.

1. AND Gate – 6 T (3 N-Transistor & 3 P-Transistor)

2. OR Gate – 6 T (3 N-Transistor & 3 P-Transistor)

3. Inverter – 2 T (1 N-Transistor & 1 P-Transistor)

The figure below depicts the transistor-level schematic of the 22T XOR gate.

Figure 1 22T XOR gate

12 TRANSISTOR XOR GATE

The figure below depicts the transistor-level schematic of the 12T XOR gate.

Figure 2 12T XOR gate

6 TRANSISTOR XOR GATE

The figure below depicts the transistor-level circuit diagram of the 6T XOR gate. This circuit is based on a transmission gate architecture. Transistors M5 and M6 form a transmission gate that provides valid logic levels at the output (Y).

Figure 3 6T XOR gate

We have analyzed three XOR designs for constructing the multiplier unit. The multiplier design is selected on the basis of low average power consumption and a reduced number of transistors.

3. VEDIC MULTIPLIER USING URDHVA TIRYAGBHYAM SUTRA

In this paper we present a transistor-level analysis of a Vedic multiplier based on the Urdhva Tiryagbhyam sutra. The Urdhva Tiryagbhyam method, also called the Vertical and Crosswise multiplication method, can be applied to binary multiplication. The figure below depicts the line diagram of the Urdhva Tiryagbhyam multiplication method.

Figure 4 LINE diagram of Urdhva Tiryagbhyam Sutra Multiplication [16]

In the figure above, each 0 represents a binary digit, which can be either logic 1 or logic 0. This algorithm has been used to design a 4-bit Urdhva Tiryagbhyam multiplier, whose block diagram is shown in Figure 5.
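The vertical and crosswise scheme can be sketched in software as a column-by-column sum of crosswise bit products. This is a minimal behavioural sketch under stated assumptions: the function name `urdhva_multiply` and the `width` parameter are hypothetical, and the model ignores the gate-level adder structure used in the actual design.

```python
def urdhva_multiply(a: int, b: int, width: int = 4) -> int:
    """Multiply two unsigned `width`-bit numbers column by column.

    Column k collects every crosswise product a[i] & b[j] with i + j == k,
    adds the carry from the previous column, keeps the low bit as result
    bit k and passes the rest on as the carry into column k + 1.
    """
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in `width` bits")

    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    result = 0
    carry = 0
    for col in range(2 * width - 1):
        column_sum = carry
        for i in range(width):
            j = col - i
            if 0 <= j < width:
                column_sum += a_bits[i] & b_bits[j]
        result |= (column_sum & 1) << col
        carry = column_sum >> 1

    # The remaining carry becomes the most significant result bit.
    result |= carry << (2 * width - 1)
    return result


if __name__ == "__main__":
    print(urdhva_multiply(0b1011, 0b1101))  # 11 * 13
```

All crosswise products of a column are independent of each other, which is what allows the hardware to generate them in parallel before the column additions.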

DESIGN OF 4-BIT VEDIC MULTIPLIER

In the diagram below, the 4-bit adder was designed using the K-map Logic Minimizer software, which was used to optimize the adder design in terms of XOR gates. The goal is to use the optimized XOR gate so as to reduce the transistor count and the power consumption. The adder takes four 1-bit binary inputs (A, B, C and D) and produces three outputs: sum, carry0 (C0) and carry1 (C1).

The XOR-optimized expressions for the 4-bit adder are as follows [5].

• Sum = A xor B xor C xor D

• C0 = B(A xor C) + D (A xor B) + C (A xor D)

• C1 = ABCD
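The three expressions above can be checked exhaustively against ordinary integer addition, reading `+` as logical OR and juxtaposition as logical AND. The sketch below is only a logic-level check; the function names `four_input_adder` and `check_all_inputs` are hypothetical.

```python
from itertools import product


def four_input_adder(a: int, b: int, c: int, d: int) -> tuple[int, int, int]:
    """Evaluate the XOR-optimized expressions for four 1-bit inputs."""
    sum_bit = a ^ b ^ c ^ d
    c0 = (b & (a ^ c)) | (d & (a ^ b)) | (c & (a ^ d))
    c1 = a & b & c & d
    return sum_bit, c0, c1


def check_all_inputs() -> bool:
    """Return True if A + B + C + D == Sum + 2*C0 + 4*C1 for all 16 inputs."""
    for a, b, c, d in product((0, 1), repeat=4):
        sum_bit, c0, c1 = four_input_adder(a, b, c, d)
        if a + b + c + d != sum_bit + 2 * c0 + 4 * c1:
            return False
    return True


if __name__ == "__main__":
    print("expressions consistent:", check_all_inputs())
```

Sum is the parity of the inputs, C0 is set when two or three inputs are high, and C1 is set only when all four are high, so the three outputs together encode totals from 0 to 4.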

Figure 5 Block diagram of 4 bit Urdhva Tiryagbhyam Multiplier [16]

For two 4-bit binary inputs, the final result is R7, R6, R5, R4, R3, R2, R1, R0, where R0 is the LSB and R7 is the MSB of the multiplier output. Conventional half adders and full adders were used, but built with the optimized XOR gate.

DESIGN OF 8-BIT VEDIC MULTIPLIER

An 8-bit multiplier has been designed using four 4-bit multipliers. The architecture consists of the 4-bit multipliers and a ripple carry adder, which adds the partial outputs of the 4-bit multipliers. Each operand is 8 bits wide and the result is a 16-bit binary number. The 4-bit multiplier used in our design is based on the Urdhva Tiryagbhyam sutra. The figure below depicts the architecture of the 8-bit Vedic multiplier.

Figure 6 Block Diagram of 8 Bit Urdhva Tiryagbhyam Multiplier
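The way four 4-bit products combine into a 16-bit result can be sketched as follows. This is a behavioural sketch, not the adder arrangement of Figure 6: `mul4` is a hypothetical stand-in for the 4-bit Vedic multiplier block, and the shifts and additions stand in for the ripple carry adder stage.

```python
def mul4(a: int, b: int) -> int:
    """Stand-in for a 4-bit x 4-bit multiplier block (8-bit result)."""
    assert 0 <= a < 16 and 0 <= b < 16
    return a * b


def mul8(a: int, b: int) -> int:
    """Build an 8-bit x 8-bit product from four 4-bit partial products."""
    assert 0 <= a < 256 and 0 <= b < 256
    a_lo, a_hi = a & 0xF, a >> 4
    b_lo, b_hi = b & 0xF, b >> 4

    low_low = mul4(a_lo, b_lo)      # weight 2^0
    low_high = mul4(a_lo, b_hi)     # weight 2^4
    high_low = mul4(a_hi, b_lo)     # weight 2^4
    high_high = mul4(a_hi, b_hi)    # weight 2^8

    # Align the partial products by weight and add them.
    return low_low + ((low_high + high_low) << 4) + (high_high << 8)


if __name__ == "__main__":
    print(mul8(200, 123))
```

The decomposition follows from writing each operand as `hi * 16 + lo` and expanding the product, so the same 4-bit block is reused four times.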

4. 8 BIT MAC USING URDHVA TIRYAGBHYAM SUTRA VEDIC MULTIPLIER

The MAC is a datapath element that multiplies two binary numbers and accumulates the result. MAC units are widely used in DSP processors as signal processing elements. In our design, an 8-bit MAC unit is built from the Vedic multiplier with each of the different XOR gate designs. The MAC unit consists of an 8-bit multiplier, a ripple carry adder and a PIPO shift register. The ripple carry adder adds the present multiplier output to the previously generated MAC output. The PIPO register feeds the MAC output back as the second operand of the adder for the next cycle of operation. Initially the PIPO register output is forced to logic 0 by asserting the CLR line of the D flip-flops. The time for which the CLR signal must be held depends on the speed of the multiplier. The multiplier delay and the PIPO delay are calculated to ensure correct operation of the MAC unit: the clock of the PIPO register has to be in sync with the multiplier output, so that the PIPO output and the multiplier output arrive at the adder at the same time and are added without error. The figure below shows the architecture of the 8-bit Vedic MAC unit.

Figure 7 Block Diagram of MAC
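The multiply, add and register feedback loop described above can be modelled behaviourally as follows. This is only a sketch under stated assumptions: the class name `Mac8` is hypothetical, and the 16-bit accumulator width (`acc_bits`) is an assumption made for illustration, since the register width is not stated in the text. Timing, clocking and the CLR hold time are not modelled.

```python
class Mac8:
    """Behavioural model of an 8-bit multiply-accumulate unit."""

    def __init__(self, acc_bits: int = 16) -> None:
        # acc_bits is an assumed register width, not taken from the paper.
        self.acc_bits = acc_bits
        self.acc = 0  # register contents after CLR

    def clear(self) -> None:
        """Model asserting CLR: the register output becomes logic 0."""
        self.acc = 0

    def step(self, a: int, b: int) -> int:
        """One cycle: multiply, add the previous output, store the result."""
        product = (a & 0xFF) * (b & 0xFF)      # 8-bit multiplier
        total = self.acc + product             # adder
        self.acc = total & ((1 << self.acc_bits) - 1)  # register (wraps)
        return self.acc


if __name__ == "__main__":
    mac = Mac8()
    for a, b in [(3, 4), (5, 6), (7, 8)]:
        print(mac.step(a, b))
```

In this model an overflow simply wraps around; a hardware design would need to choose the accumulator width or add guard bits according to the number of accumulations expected.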

5. RESULTS AND DISCUSSION

A detailed analysis was carried out in HSPICE to determine the power consumption of the three XOR gate designs. The simulations used a 130 nm CMOS technology. The rationale for selecting the 12T and 6T XOR designs is to reduce the XOR gate area by about one half, so that the overall area is reduced considerably.

| S# | No. of transistors used in XOR gate | Power consumption (W) |
|---|---|---|
| 1 | 22T | 6.538e-09 |
| 2 | 12T | 8.319e-10 |
| 3 | 6T | 7.222e-08 |

Table 2 XOR gate power comparison

The table above shows a significant power reduction for the 12T design relative to the 22T design. The average power of the 6T design, however, is higher than that of both the 12T and 22T designs. Hence a designer must weigh the tradeoff between the number of transistors and the power consumption.

The table below lists the power consumption of the 4-bit Vedic multiplier based on the Urdhva Tiryagbhyam sutra.

| Function | 12T XOR gate 4-bit Vedic multiplier | 6T XOR gate 4-bit Vedic multiplier |
|---|---|---|
| Power consumption (W) | 5.225e-08 | 3.561e-06 |
| No. of transistors | 744 | 546 |

Table 3 4-bit Vedic multiplier power comparison

The table below lists the power consumption of the 8-bit Vedic multiplier based on the Urdhva Tiryagbhyam sutra. The 12T XOR design shows a considerable reduction in power compared with the 6T XOR design.

| Function | 22T XOR gate 8-bit Vedic multiplier | 12T XOR gate 8-bit Vedic multiplier | 6T XOR gate 8-bit Vedic multiplier |
|---|---|---|---|
| Power consumption (W) | 6.369e-07 | 2.983e-07 | 2.012e-05 |
| No. of transistors | 6280 | 4320 | 3144 |

Table 4 8-bit Vedic multiplier power comparison

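The table below compares the power consumption and the number of transistors used in each MAC unit. Relative to the 22T design, the 12T design reduces both the power consumption and the transistor count (and hence the area) of the MAC unit, while the 6T design uses the fewest transistors but consumes the most power.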

| Function | 22T XOR gate 8-bit MAC | 12T XOR gate 8-bit MAC | 6T XOR gate 8-bit MAC |
|---|---|---|---|
| Power consumption (W) | 6.584e-06 | 5.882e-06 | 7.600e-05 |
| No. of transistors | 9220 | 6660 | 5124 |

Table 5 8-bit MAC power comparison
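To make the tradeoff in the MAC comparison concrete, the relative change of each design against the 22T baseline can be computed directly from the tabulated values. The sketch below only post-processes the reported numbers; the dictionary layout and the helper name `relative_change` are illustrative choices, not part of the original analysis.

```python
# Reported 8-bit MAC results: (power in watts, number of transistors).
MAC_RESULTS = {
    "22T": (6.584e-06, 9220),
    "12T": (5.882e-06, 6660),
    "6T": (7.600e-05, 5124),
}


def relative_change(new: float, base: float) -> float:
    """Percentage change of `new` with respect to `base` (negative = saving)."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    base_power, base_count = MAC_RESULTS["22T"]
    for name in ("12T", "6T"):
        power, count = MAC_RESULTS[name]
        print(
            f"{name}: power {relative_change(power, base_power):+.1f}%, "
            f"transistors {relative_change(count, base_count):+.1f}%"
        )
```

With these figures the 12T MAC uses roughly 11% less power and 28% fewer transistors than the 22T MAC, whereas the 6T MAC uses about 44% fewer transistors but more than ten times the power.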

TRANSIENT ANALYSIS

22T- XOR GATE SIMULATION WAVEFORM

Figure 8 – 22T XOR gate simulation result

12T- XOR GATE SIMULATION WAVEFORM

Figure 9 – 12T XOR gate simulation result

6T- XOR GATE SIMULATION WAVEFORM

Figure 10 – 6T XOR gate simulation result

6T XOR GATE 4 – BIT VEDIC MULTIPLIER WAVEFORM

Figure 11 – 6T XOR gate 4-BIT Vedic Multiplier simulation result

In the figure above, terminals 23, 22, 21, 20 and 33, 32, 31, 30 are the two 4-bit binary inputs of the multiplier, and terminals 41, 61, 63, 65, 67, 69, 71 and 72 correspond to R0, R1, R2, R3, R4, R5, R6 and R7 respectively. The same applies to the 12T 4-bit Vedic multiplier (see Figure 5).

12T XOR GATE 4 – BIT VEDIC MULTIPLIER WAVEFORM

Figure 12 12T XOR gate 4-BIT Vedic Multiplier simulation result

6T XOR GATE 8 – BIT MAC WAVEFORM

Figure 13 6T XOR gate 8-Bit MAC Simulation Result

12T XOR GATE 8- BIT MAC WAVEFORM

Figure 14 12T XOR gate 8-Bit MAC Simulation Result

(Partial output shown for higher clarity)

CONCLUSION

Our analysis shows that there is a tradeoff between power and area: the design with the smallest area (6T) has the highest power dissipation. Both the 12T and 6T XOR designs significantly reduce the number of transistors compared with the conventional 22T design. The 12T design also lowers the power consumption relative to the conventional design, whereas the 6T design, despite having the fewest transistors, consumes the most power. The 12-transistor XOR gate design therefore proves to be the most efficient, combining lower power with a smaller area than the conventional design. Future work includes further optimization to achieve higher speed, lower delay and smaller area. With continued improvements in semiconductor technology, still smaller area and higher speed can be achieved for the optimized architecture. All of our analysis was carried out in HSPICE using a 130 nm technology.

Furthermore, the analyses reported in the references listed below are based on VHDL/Verilog simulation, whereas we have performed a transistor-level HSPICE simulation to determine the exact number of transistors required and the power consumption of the MAC unit.

6. REFERENCES

1. Sumit Vaidya and Deepak Dandekar, Delay-Power Performance Comparison of Multipliers in VLSI Circuit Design, International Journal of Computer Networks & Communications (IJCNC), Vol. 2, No. 4, July 2010.

2. Manoranjan Pradhan, Rutuparna Panda, Sushanta Kumar Sahu, MAC Implementation using Vedic Multiplication Algorithm, International Journal of Computer Applications, Volume 21, No. 7, May 2011.

3. G. Ganesh Kumar, V. Charishma, Design of High Speed Vedic Multiplier using Vedic Mathematics Techniques, International Journal of Scientific and Research Publications, Volume 2, Issue 3, March 2012, ISSN 2250-3153.

4. Sree Nivas A and Kayalvizhi N, Implementation of Power Efficient Vedic Multiplier, International Journal of Computer Applications, Volume 43, No. 16, April 2012.

5. Krishnaveni D and Umarani T.G, VLSI Implementation of Vedic Multiplier with Reduced Delay, International Journal of Advanced Technology & Engineering Research (IJATER), ISSN No: 2250-3536, Volume 2, Issue 4, July 2012.

6. Arun K Patro & Kunal N Dekater, A Transistor Level Analysis for an 8-Bit Vedic Multiplier, International Journal of Electronics Signals and Systems, ISSN: 2231-5969, Vol-1, Issue-3, 2012.

7. M. Ramalatha, K. D. Dayalan, P. Dharani, and S. D Priya, High Speed Energy Efficient ALU Design using Vedic Multiplication Technique, Lebanon, pp. 600-603, July 2009.

8. P. Mehta and D. Gawali, Conventional versus Vedic Mathematical Method for Hardware Implementation of a Multiplier, International Conf. on Advances in Computing, Control, and Telecommunication Technologies, Trivandrum, Kerala, India, pp. 640-642, 2009.

9. Jagadguru Swami Sri Bharati Krisna Tirthaji Maharaja, Vedic Mathematics, Motilal Banarsidass Publishers Pvt. Ltd, Delhi, 2009.

10. Pouya Asadi and Keivan Navi, A New Low Power 32×32-bit Multiplier, World Applied Sciences Journal (4):341:347, 2007, IDOSI Publication.

11. C. Senthilpari, Ajay Kumar Singh and K. Diwadkar, Low Power and High Speed 8×8 bit Multiplier Using Non-clocked Pass Transistor Logic, 1-4244-1355-9/07, 2007, IEEE.

12. Morris Mano, Computer System Architecture, pp. 346-347, 3rd edition, PHI, 1993.

13. Jong Duk Lee, Yong Jin Yoony, Kyong Hwa Leez and Byung-Gook Park, Application of Dynamic Pass Transistor Logic to 8-Bit Multiplier, Journal of the Korean Physical Society, Vol. 38, No. 3, pp. 220-223, March 2001.

14. Kihak Shin, Ik Kyun Oh, Sang Min, Beom Seom Ryu, Kie Young Lee and Tae Won Cho, A Multi-Level Approach to Low Power MAC Design, IEEE Trans. VLSI Systems, vol. 48, pp. 361-763, 1999.

15. Hamdi Belgacem, Khedhiri Chiraz, and Tourki Rached, Pass Transistor Based Self-Checking Full Adder, International Journal of Computer Theory and Engineering, Vol. 3, No. 5, October 2011.

16. Harpreet Singh Dhillon and Abhijit Mitra, A Reduced-Bit Multiplication Algorithm for Digital Arithmetic, International Journal of Computational and Mathematical Sciences 2:2, 2008.