- Open Access
- Total Downloads : 888
- Authors : M.Rakesh , D.Pitchaiah
- Paper ID : IJERTV1IS8051
- Volume & Issue : Volume 01, Issue 08 (October 2012)
- Published (First Online): 29-10-2012
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Glitch Reduction in Low-Power Low-Frequency Multipliers
M.RAKESH#1, D.PITCHAIAH *2
#1 M.Tech .,DECS,Department of ECE ,Dr.S.G.I.E.T,Markapur, Prakasam Dt, AP, India
*2 Associate Professor, Department of ECE , Dr.S.G.I.E.T,Markapur, Prakasam Dt, AP, India
Abstract- Various 16-bit multiplier architectures are compared in terms of dissipated energy, propagation delay, energy-delay product (EDP), and area occupation, in view of low-power low-voltage signal processing for low-frequency applications. A novel practical approach has been set up to investigate and graphically represent the mechanisms of glitch generation and propagation. It is found that spurious activity is a major cause of energy dissipation in multipliers. Measurements point out that, because of its shorter full-adder chains, the Wallace multiplier dissipates less energy than other traditional array multipliers (8.2 µW/MHz versus [...] µW/MHz for 0.18 µm CMOS technology at 0.75 V). The benefits of transistor sizing are also evaluated (a Wallace multiplier built from minimum-size transistors dissipates 6.2 µW/MHz). By combining transmission gates with static CMOS in a Wallace architecture, a new approach is proposed to improve the energy efficiency further (4.7 µW/MHz), beyond recently published low-power architectures. The innovation consists in suppressing glitches via resistance-capacitance low-pass filtering, while preserving unaltered driving capabilities. The reduced number of Vdd-to-ground paths also contributes to a significant decrease of static consumption.
Index Terms- Arithmetic, glitch, low frequency, low power, multiplier, switching activity, transmission gate.
-
INTRODUCTION
Most digital signal processor (DSP) systems incorporate a multiplication unit to implement algorithms such as convolution and filtering. In many DSP algorithms, the multiplier lies in the critical path and ultimately determines the performance of the algorithm. However, the demand for high-performance portable systems incorporating multimedia capabilities has elevated low-power design to the forefront of design requirements, in order to maintain reliability and provide longer hours of operation. Multipliers are on the critical path of many computational applications; examples are real-time digital signal processing, floating-point applications, and computers. Designing low-power fast multipliers has been of great theoretical and practical interest to computer scientists and engineers. Several algorithms and VLSI implementations have been proposed and practically used. The high-speed multiplication algorithm considered here postpones the carry propagation to the last stage, where two 2(n-1)-bit numbers are added using a fast carry-look-ahead adder (CLA).
Depending on the application, one of the parameters such as speed, power consumption, or area may take priority. Based on this criterion, the designer chooses the multiplier architecture.
Different Low Power Techniques:
-
Power Gating
Power gating is effective for reducing leakage power. It is the technique wherein circuit blocks that are not in use are temporarily turned off to reduce the overall leakage power of the chip. This temporary shutdown state is also called "low power mode" or "inactive mode". When the circuit blocks are required for operation once again, they are returned to "active mode". The two modes are switched at the appropriate time and in a suitable manner to maximize power savings while minimizing the impact on performance.
-
Multiple Threshold CMOS (MTCMOS) Circuits
MTCMOS logic is an effective standby leakage control technique, but it is difficult to implement since sleep transistor sizing is highly dependent on the discharge pattern within the circuit block. Dual-Vt domino logic has been shown to avoid the sizing difficulties and the inherent performance penalty associated with MTCMOS. High-Vt cells are used where leakage has to be prevented, whereas low-Vt cells are employed where speed is of concern. Both kinds of cell are used in the MTCMOS technique. In the active mode of operation the high-Vt transistors are turned on, and the logic gates consisting of low-Vt transistors can operate with low switching power dissipation and small propagation delay. In standby mode the high-Vt transistors are turned off, which cuts the leakage path.
Fig 1.1 MTCMOS technique
-
Multi Threshold (MVT) Voltage Technique
Multiple threshold voltage techniques use both low-Vt and high-Vt cells: lower-threshold gates are used on the critical path, while higher-threshold gates are used off the critical path. This methodology improves performance without an increase in power. The flip side of this technique is that multi-Vt cells increase fabrication complexity.
-
Multi Vdd (Voltage)
Dynamic power is proportional to the square of the supply voltage. Hence reducing the supply voltage significantly improves the power performance. At the same time, gate delay increases because of the decreased supply voltage. A high voltage can therefore be applied to the timing-critical path while the rest of the chip runs at a lower voltage, so that overall system performance is maintained. Blocks having different supply voltages can be integrated in an SoC.
Multiple Voltage ASIC/SoC Design: Classification
(a). Static Voltage Scaling (SVS)
(b). Multi-level Voltage Scaling (MVS)
(c). Dynamic Voltage and Frequency Scaling (DVFS)
(d). Adaptive Voltage Scaling (AVS)
-
-
Low Power Design Of Techniques
Types of Multipliers:
Multipliers are categorized relative to their applications, architecture and the way the partial products are produced and summed up. Based on all these, a designer might find following types of multipliers.
Array Multiplier
An array multiplier is an efficient layout of a combinational multiplier. It accepts all bits simultaneously. The longest product calculation delay in it depends on the speed of the adders. An n-bit multiplier requires n(n-1) full adders and n² AND gates. Array multipliers can be decomposed into two parts: the first part is dedicated to the generation of partial products, and the second one collects and adds them. The collection of the partial products is made using a regular array.
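The cell counts quoted above can be checked with a small behavioural model. The following is only a sketch under stated assumptions: the partial-product rows are accumulated one at a time, each with a ripple row of n adder cells, and the names `full_adder` and `array_multiply` are illustrative, not taken from the paper. In hardware, the cells whose third input is a constant 0 can be simplified to half adders.

```python
def full_adder(a, b, c):
    """One adder cell: returns (sum, carry_out) for three input bits."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def array_multiply(x, y, n):
    """Multiply two unsigned n-bit numbers row by row.

    Returns (product, and_gate_count, adder_cell_count).
    """
    xb = [(x >> j) & 1 for j in range(n)]
    yb = [(y >> i) & 1 for i in range(n)]
    and_gates = adder_cells = 0

    # Row 0 needs only AND gates: its lowest bit is already a product bit.
    row = [xb[j] & yb[0] for j in range(n)]
    and_gates += n
    product_bits = [row[0]]
    upper = row[1:] + [0]  # running sum above the bits that are already final

    for i in range(1, n):
        row = [xb[j] & yb[i] for j in range(n)]  # partial product i
        and_gates += n
        carry, summed = 0, []
        for j in range(n):  # one ripple row of n adder cells
            s, carry = full_adder(upper[j], row[j], carry)
            adder_cells += 1
            summed.append(s)
        product_bits.append(summed[0])
        upper = summed[1:] + [carry]

    product_bits += upper
    product = sum(bit << k for k, bit in enumerate(product_bits))
    return product, and_gates, adder_cells


if __name__ == "__main__":
    print(array_multiply(13, 11, 4))
```

For a 4x4 multiplier this organisation uses 16 AND gates and 12 adder cells, which agrees with the n² and n(n-1) figures given in the text.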
Serial/parallel multiplier
In a serial/parallel multiplier, the multiplicand x arrives bit-serially while the multiplier a is applied in a bit-parallel format. A common approach used in such multipliers is to generate a row or diagonal of bit-products in each time slot and perform the additions concurrently. Suppose the data is positive, x > 0. Using carry-save adders, the shift-and-add algorithm can be applied as shown in Figure 2.1. Since x is processed bit-serially and the coefficient a is processed bit-parallel, this type of multiplier is called a serial/parallel multiplier.
Figure 2.1: Serial/parallel multiplier
Transposed serial/parallel multiplier
This is an alternative form of serial/parallel multiplier, which adds the bit products column-wise as shown in Figure 2.2. The disadvantage of this multiplier is its long sum-propagation path.
Figure 2.2: Transposed serial/parallel multiplier
This disadvantage can be alleviated by pipelining, at a cost of two D flip-flops per stage. This multiplier structure can be modified into a serial/serial multiplier or squarer, where both the multiplier and the multiplicand arrive bit-serially.
-
Low Power Design Of 1-Bit Full Adder
-
Most often, the full adder is part of the critical path that determines the overall performance of a system. The 1-bit full adder is one of the most critical components of a processor and determines its throughput. In this project a new 1-bit 10-transistor full adder is proposed which consumes less power than the standard implementations of the full adder cell. The proposed adder is tested and compared with high-transistor-count adders and existing 10-transistor adders under the same conditions. The addition of two bits A and B with a carry-in C yields a SUM and a CARRY bit. The logic equations for this relation are
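$$\mathrm{SUM} = (A \oplus B)\cdot\overline{C} + \overline{(A \oplus B)}\cdot C \tag{1}$$
$$\mathrm{CARRY} = (A \oplus B)\cdot C + \overline{(A \oplus B)}\cdot A \tag{2}$$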
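Equations (1) and (2) can be checked exhaustively against ordinary binary addition. The sketch below assumes that the two product terms in each equation are selected by (A XOR B) and its complement (A XNOR B), which are the internally generated signals the adder is described as using; the function name `full_adder_mux` is illustrative.

```python
from itertools import product


def full_adder_mux(a, b, c):
    """MUX-style full adder: A XOR B selects between two candidate outputs."""
    x = a ^ b        # internally generated A XOR B
    xn = 1 - x       # its complement, A XNOR B
    s = (x & (1 - c)) | (xn & c)    # Eq. (1): C' when A != B, otherwise C
    carry = (x & c) | (xn & a)      # Eq. (2): C when A != B, otherwise A
    return s, carry


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        s, carry = full_adder_mux(a, b, c)
        print(a, b, c, "->", "sum", s, "carry", carry)
```

All eight input combinations satisfy A + B + C = SUM + 2·CARRY, so the pair of equations does describe a full adder.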
Proposed 1-bit full adder
The circuit diagram of the new 1-bit full adder is shown in figure 4.1 and its layout in figure 4.2. The proposed adder implements equations (1) and (2) using complementary CMOS and MUX-based design logic with only 10 transistors. The adder is useful in larger circuits such as multipliers despite the threshold-loss problem. The number of direct connections from VDD to ground is reduced in the new design to minimize the power consumption due to short-circuit current. Also, the generation of SUM from CARRY, as done in the CMOS adder, is avoided. The adder uses the internally generated signals (A XOR B) and (A XNOR B) to control the output transistor gates. The same ratio (W/L = 5/2) is used for all the designs, and our design is compared on the same platform in 70 nm technology in MICROWIND. The SUM and CARRY signals are generated separately after the generation of (A XOR B) so as to reduce the delay. The SUM and CARRY waveforms are shown in figure 4.2. In the design, the second CMOS inverter in the critical path of SUM generation helps in reducing the threshold loss. The new adder works well at frequencies up to 2 GHz with low supply voltages in 70 nm CMOS technology. Performance analysis of all the adder designs is carried out in 70 nm CMOS technology in MICROWIND. The performance is studied at a power supply voltage of 0.7 V and a frequency of 50 MHz.
IV IMPLEMENTATION OF LOW POWER MULTIPLIERS
-
INTRODUCTION
Arithmetic circuits, like adders and multipliers, are essential components in the design of communication circuits in ASICs. Recently, an overwhelming interest has been seen in the problem of designing digital systems for communication and digital signal processing with low power at no performance penalty. Designing low-power high-speed arithmetic circuits requires a combination of techniques at four levels: algorithm, architecture, circuit and system. This work presents an ASIC implementation of a multiplication algorithm which is suitable for high-performance and low-power applications.
In the past, multiplication was implemented generally with a sequence of addition, subtraction and shift operations. Recently, many multiplication algorithms have been invented and developed, each having pros and cons in different fields.
The multiplier is a fairly large block of a computing system. The amount of circuitry involved is proportional to the square of its resolution, i.e. a multiplier of size n bits has O(n²) gates. For multiplication algorithms performed in DSP applications, latency and throughput are the two major constraints from the delay perspective. Latency is the real delay of computing a function: a measure of how long after the inputs to a device are stable the final result is available on the outputs. Throughput is the measure of how many multiplications can be performed in a given period of time. That is why, if one also aims to minimize power consumption, it is of great interest to identify the delay optimization techniques that can be applied.
Digital multiplication is a series of bit shifts and bit additions, in which two numbers, the multiplicand and the multiplier, are combined into the result. Considering the bit representations of the multiplicand X = Xn-1 ... X1 X0 and the multiplier Y = Yn-1 ... Y1 Y0, up to n shifted copies of the multiplicand are added to form the product in unsigned multiplication. The entire process consists of three steps:
- partial product generation,
- partial product reduction, and
- final addition.
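The three steps can be sketched in a few lines for unsigned operands. This is an illustrative model only: the reduction step uses word-level carry-save (3:2) addition applied repeatedly until two rows remain, which is one possible reduction scheme and not necessarily the one used in the designs compared here, and all function names are hypothetical.

```python
def partial_products(x, y, n):
    """Step 1: one shifted copy of the multiplicand per set multiplier bit."""
    return [(x << i) if (y >> i) & 1 else 0 for i in range(n)]


def carry_save_add(a, b, c):
    """3:2 reduction of three rows into a sum row and a carry row."""
    sum_row = a ^ b ^ c
    carry_row = ((a & b) | (a & c) | (b & c)) << 1
    return sum_row, carry_row


def reduce_rows(rows):
    """Step 2: apply carry-save addition until at most two rows are left."""
    rows = list(rows)
    while len(rows) > 2:
        a, b, c = rows[:3]
        rows = rows[3:] + list(carry_save_add(a, b, c))
    return rows


def multiply(x, y, n):
    """Step 3: a single carry-propagate addition of the remaining rows."""
    return sum(reduce_rows(partial_products(x, y, n)))


if __name__ == "__main__":
    print(multiply(13, 11, 4))
```

Carry propagation occurs only in the last step; the reduction step never has to wait for a carry to ripple across a row, which is why it can be made fast.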
-
-
Baugh-Wooley Multiplier
The Baugh-Wooley technique was developed to design direct multipliers for two's complement numbers. When multiplying two's complement numbers directly, each of the partial products to be added is a signed number. Thus, each partial product has to be sign-extended to the width of the final product in order to form the correct sum in the carry-save adder (CSA) tree. The Baugh-Wooley approach instead adds extra entries to the bit matrix, which avoids having to deal with the negatively weighted bits in the partial product matrix.
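As an illustration of how the negatively weighted bits can be removed, the sketch below uses the commonly taught modified Baugh-Wooley arrangement: the bit products that involve exactly one operand sign bit are complemented, and two constant 1s are added at bit positions n and 2n-1. This arrangement is an assumption made for the example and may differ in detail from the bit matrix used in the implemented multiplier; the function name is hypothetical.

```python
def baugh_wooley(a, b, n):
    """Multiply two n-bit two's complement integers using only positively
    weighted bits; returns the signed 2n-bit product."""
    # Python's >> sign-extends negative integers, so these are the
    # two's complement bits of each operand.
    ab = [(a >> i) & 1 for i in range(n)]
    bb = [(b >> j) & 1 for j in range(n)]

    total = 0
    for i in range(n):
        for j in range(n):
            bit = ab[i] & bb[j]
            # Exactly one sign bit involved: the term has negative weight,
            # so its complement is entered in the matrix instead.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    total += (1 << n) + (1 << (2 * n - 1))  # correction constants
    total &= (1 << (2 * n)) - 1             # keep 2n result bits
    if total >> (2 * n - 1):                # reinterpret as a signed value
        total -= 1 << (2 * n)
    return total


if __name__ == "__main__":
    print(baugh_wooley(-3, 5, 4), baugh_wooley(-8, -8, 4))
```

Every entry in the resulting matrix has positive weight, so it can be summed by the same adder array or tree that is used for unsigned multiplication.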
-
4.2 THEORY OF MULTIPLICATION ALGORITHMS
Multiplier structure
The operation of multiplication is rather simple in digital electronics. It has its origin in the classical algorithm for the product of two binary numbers. This algorithm uses addition and shift-left operations to calculate the product of two numbers. Two examples are presented below.
The left example shows the multiplication procedure for two unsigned binary numbers, while the one on the right is for signed multiplication. The first number is called the multiplicand and the second the multiplier. The only difference between signed and unsigned multiplication is that the sign bit has to be extended in the signed case, as depicted in PP row 3 of the right example. Based upon the above procedure, we can deduce an algorithm for any kind of multiplication, which is shown in Figure 4.3. Here, we assume that the MSB represents the sign of the number.
Components of Power Consumption
Power consumption in a static CMOS circuit basically comprises three components: dynamic switching power, short-circuit power and static power. Compared to the other two components, short-circuit power can normally be ignored in submicron technology.
Dynamic Power:
Dynamic power is due to charging and discharging the load capacitances. It can be expressed by the following equation:
$$P_{dyn} = \frac{1}{2} C_L V_{dd}^{2} \cdot A \cdot F \tag{4.1}$$
where CL is the load capacitance, including the gate capacitance of the driven gate, the diffusion capacitance of the driving gate and the wire capacitance; Vdd is the power supply voltage; A is the switching activity; and F is the circuit operating frequency.
Techniques for Dynamic Power Reduction
Dynamic power is comprised of logic switching power and glitch power, and can be expressed by the following equation:
$$P_{dyn} = \frac{1}{2} C_L V_{dd}^{2} \cdot A \cdot F \tag{4.2}$$
To reduce dynamic power at a specified operating frequency F, we can either reduce the dynamic power consumption per logic transition, which is determined by the load capacitance CL and the power supply Vdd, or reduce the number of logic transitions in the circuit, represented by the switching activity A.
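The two levers named above can be made concrete with a short numerical sketch of the dynamic power expression. The component values used here (1 pF load, 1.8 V and 0.9 V supplies, activity 0.2, 1 MHz) are illustrative placeholders, not measurements from this work.

```python
def dynamic_power(c_load, vdd, activity, freq):
    """Dynamic power in watts: 0.5 * CL * Vdd^2 * A * F."""
    return 0.5 * c_load * vdd ** 2 * activity * freq


if __name__ == "__main__":
    # Placeholder operating point, chosen only for illustration.
    base = dynamic_power(c_load=1e-12, vdd=1.8, activity=0.2, freq=1e6)
    half_vdd = dynamic_power(c_load=1e-12, vdd=0.9, activity=0.2, freq=1e6)
    half_act = dynamic_power(c_load=1e-12, vdd=1.8, activity=0.1, freq=1e6)
    print(f"baseline      : {base:.3e} W")
    print(f"half the Vdd  : {half_vdd / base:.2f} x baseline")
    print(f"half the A    : {half_act / base:.2f} x baseline")
```

Halving Vdd cuts the power to a quarter, whereas halving the switching activity (for example by suppressing glitches) cuts it to a half at unchanged supply voltage and speed.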
-
Logic switching power reduction
Dual power supply:
Reducing the supply voltage, or voltage scaling, is the most effective technique for dynamic power reduction because dynamic power is proportional to the square of the supply voltage. Similar to the dual-Vth approach, the dual-Vdd technique assigns high Vdd to all the gates on the critical paths and low Vdd to some of the gates on the non-critical paths. When a gate operating at a lower Vdd directly drives a higher-Vdd gate, a level converter is required to avoid undesirable short-circuit power in the higher-Vdd gate, which would otherwise draw a large DC current because of its low-voltage fan-in. Since the level converters contribute additional power, minimizing the number of level converters is also important in voltage scaling.
-
Gate sizing
Non-critical paths have timing slack, and the delays of some gates on these paths can be increased without affecting the performance. Since the lengths of devices (transistors) in a gate are usually minimal for a high-speed application, the gate delay can be increased by reducing the device width. As a result, the dynamic power is decreased accordingly, owing to the smaller loading capacitance CL, which is proportional to the device size.
Gate sizing is a technique that determines device widths for gates. Traditional gate sizing approaches use Elmore delay models in a polynomial formulation. Heuristics-based greedy approaches can be used to solve such a polynomial problem. The gate delay and the load at the gate output are modeled as
$$d_i = g_{d_i} + c\,\frac{C_{OUT_i}}{C_{GS_i}} \tag{4.3}$$
$$C_{OUT_i} = \sum_{j \in FO_i} \left( C_{wire_{ij}} + C_{GS_j} \right) \tag{4.4}$$
where gdi is the intrinsic gate delay of gate i, c is a constant, and FOi is the set of gates in the fanout of gate i.
Transistor sizing
The basic idea of transistor sizing is exactly the same as that of gate sizing, except that in gate sizing all the transistors in one gate are sized together with the same factor, whereas in transistor sizing each transistor can be sized independently.
For a gate on a critical path, only some of its transistors contribute to the largest intrinsic gate delay, so the remaining transistors can still be sized to reduce the capacitances. In gate sizing, gdi, the intrinsic gate delay of gate i in Equation (4.3), is a fixed value, which makes it impossible to differentiate among the internal input-output paths. In contrast, transistor sizing explores the maximum possible optimization space by sizing transistors independently.
Leakage Power
In the past, dynamic power dominated the total power dissipation of a CMOS device. Since dynamic power is proportional to the square of the power supply voltage, lowering the voltage reduces the power dissipation. However, to maintain or increase the performance of a circuit, its threshold voltage should be decreased by the same factor, which causes the subthreshold leakage current of the transistors to increase exponentially and makes it a major contributor to power consumption.
To reduce leakage power, many techniques have been proposed, including transistor sizing, multi-Vth, dual-Vth, optimal standby input vector selection, transistor stacking and body bias. As the threshold voltage (Vth) of the transistors in a CMOS logic gate is increased, the leakage current is reduced but the gate slows down. Dual-Vth assignment is an efficient technique for leakage reduction.
Leakage Current
The leakage current of a transistor is mainly the result of reverse-biased PN junction leakage, subthreshold leakage and gate leakage, as illustrated in Figure 5.3. In submicron technology, the reverse-biased PN junction leakage is much smaller than the subthreshold and gate leakage and hence can be ignored. The subthreshold leakage is the weak-inversion current between the source and drain of a MOS transistor when the gate voltage is less than the threshold voltage. It is given by:
$$I_{sub} = \mu_0 C_{ox} \frac{W}{L_{eff}} V_T^{2}\, e^{1.8} \exp\!\left(\frac{V_{gs} - V_{th}}{n V_T}\right)\left(1 - \exp\!\left(\frac{-V_{ds}}{V_T}\right)\right)$$
where µ0 is the zero-bias electron mobility, Cox is the oxide capacitance per unit area, n is the subthreshold slope coefficient, Vgs and Vds are the gate-to-source and drain-to-source voltages, respectively, VT is the thermal voltage, Vth is the threshold voltage, W is the channel width and Leff is the effective channel length. Due to the exponential relation between Isub and Vth, an increase in Vth sharply reduces the subthreshold current.
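The exponential dependence on Vth can be illustrated numerically. The sketch below evaluates the standard subthreshold current expression, with prefactor µ0·Cox·(W/Leff)·VT²·e^1.8, a gate term exp((Vgs − Vth)/(n·VT)) and a drain term 1 − exp(−Vds/VT). All device parameters are placeholder values chosen for illustration and do not describe the 70 nm process used in this work.

```python
import math


def subthreshold_current(vth, vgs=0.0, vds=0.75, n=1.5, vt=0.0259,
                         mu0=0.04, cox=0.01, w=1e-6, leff=1e-7):
    """Weak-inversion drain current in amperes (SI units throughout).

    All default device parameters are illustrative placeholders.
    """
    prefactor = mu0 * cox * (w / leff) * vt ** 2 * math.exp(1.8)
    gate_term = math.exp((vgs - vth) / (n * vt))
    drain_term = 1.0 - math.exp(-vds / vt)
    return prefactor * gate_term * drain_term


if __name__ == "__main__":
    low_vth = subthreshold_current(vth=0.3)
    high_vth = subthreshold_current(vth=0.4)
    print(f"Isub at Vth = 0.3 V: {low_vth:.3e} A")
    print(f"Isub at Vth = 0.4 V: {high_vth:.3e} A")
    print(f"ratio: {low_vth / high_vth:.1f}")
```

With these placeholder values, raising Vth by 100 mV lowers the off-state current by roughly an order of magnitude, which is the effect that dual-Vth and MTCMOS techniques exploit.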
-
Techniques for Leakage Reduction
-
Leakage is becoming comparable to dynamic switching power with the continuous scaling down of CMOS technology. To reduce leakage power, many techniques have been proposed, including dual-Vth, multi-Vth, optimal standby input vector selection, transistor stacking, and body bias.
4.5.6.1 Dual-Vth Assignment
Dual-Vth assignment is an efficient technique for leakage reduction. In this method, each cell in the standard cell library has two versions, low Vth and high Vth. Gates with low Vth are fast but have high subthreshold leakage, whereas gates with high Vth are slower but have much reduced subthreshold leakage. Traditional deterministic approaches for dual-threshold assignment utilize the timing slack of non-critical paths to assign high Vth to some or all gates on those non-critical paths to minimize the leakage power.
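A slack-driven assignment of this kind can be sketched as a simple greedy procedure. The code below is a toy model under stated assumptions, not the assignment method used in this work: every gate has one delay and one leakage value per threshold, the circuit is described by an explicit list of paths, and a gate is moved to high Vth only if every path still meets the timing limit `t_max`. All names and numbers are hypothetical.

```python
def assign_dual_vth(gates, paths, t_max):
    """Greedy dual-Vth assignment.

    gates: name -> (delay_low, delay_high, leak_low, leak_high)
    paths: list of paths, each a list of gate names
    Returns the set of gates assigned to high Vth.
    """
    high = set()

    def path_delay(path):
        return sum(gates[g][1] if g in high else gates[g][0] for g in path)

    # Try the gates with the largest leakage saving first.
    by_saving = sorted(gates, key=lambda g: gates[g][2] - gates[g][3],
                       reverse=True)
    for g in by_saving:
        high.add(g)
        if any(path_delay(p) > t_max for p in paths):
            high.remove(g)  # timing violated: keep this gate at low Vth
    return high


if __name__ == "__main__":
    # Hypothetical gates: (delay_low, delay_high, leak_low, leak_high)
    gates = {
        "g1": (1.0, 1.5, 10.0, 1.0),
        "g2": (1.0, 1.5, 10.0, 1.0),
        "g3": (1.0, 1.5, 10.0, 1.0),
    }
    paths = [["g1", "g2"], ["g3"]]  # g1-g2 is critical, g3 has slack
    print(assign_dual_vth(gates, paths, t_max=2.0))
```

In this example only the gate on the short path is moved to high Vth; the gates on the critical path keep low Vth, so the cycle time is unchanged.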
4.5.6.2 Multi-Threshold-Voltage CMOS (MTCMOS)
To reduce the area, power and speed overhead contributed by the sleep-control high-Vth transistors, only one high-Vth transistor is needed. Figures 5.5(b) and 5.5(c) show PMOS-insertion MTCMOS and NMOS-insertion MTCMOS. NMOS-insertion MTCMOS is preferred because, for any given size, an NMOS transistor has a smaller on-resistance than a PMOS transistor. Compared to the dual-Vth technique, MTCMOS can only reduce leakage in the standby mode and has additional area, power and speed overheads.
-
Transmission Gates
A CMOS transmission gate is created by connecting an nFET and a pFET in parallel as shown in fig 5.8(a). The nFET Mn is controlled by the signal s, while the pFET Mp is controlled by the complement s̄. When wired in this manner, the pair acts as a good electrical switch between the input and output variables x and y, respectively. The operation of the switch can be understood by analyzing the two cases for s. If s = 0, the nFET is OFF; since s̄ = 1, the pFET is also OFF, so that the TG acts as an open switch. In this case, there is no relationship between x and y. For the opposite case, where s = 1 and s̄ = 0, both FETs are ON, and the TG provides a good conducting path between x and y. Logically, this is identical to the switching of an nFET, so that we may write y = x when s = 1.
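The switching behaviour can be captured in a small logic-level model. This is only a behavioural sketch: the open-switch state is represented by `None` (high impedance), and the 2:1 multiplexer built from two transmission gates is an illustrative use chosen because the adder discussed earlier is MUX-based; neither function name comes from the paper.

```python
def transmission_gate(x, s):
    """Logic-level TG: returns x when the switch is closed, None when open."""
    s_bar = 1 - s
    nfet_on = s == 1      # nFET conducts when its gate is high
    pfet_on = s_bar == 0  # pFET conducts when its gate is low
    if nfet_on and pfet_on:
        return x          # conducting path between x and y
    return None           # open switch: y is not driven by x


def mux2(a, b, s):
    """2:1 multiplexer from two TGs driven by s and its complement."""
    y0 = transmission_gate(a, 1 - s)  # closed when s = 0
    y1 = transmission_gate(b, s)      # closed when s = 1
    return y0 if y0 is not None else y1


if __name__ == "__main__":
    for s in (0, 1):
        for x in (0, 1):
            print(f"s={s} x={x} -> y={transmission_gate(x, s)}")
```

Exactly one of the two gates in the multiplexer conducts for any value of s, so the output is always driven by one input only.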
-
Compressors
Compressors are mostly used in multipliers to reduce the number of operands while adding the terms of the partial products. A compressor Ci is a combinational device that compresses N input lines in bit position i to 2 output lines, i.e. sum and carry. In addition, there are L input lines coming to the compressor at different levels j.
4.7.1 More compression
Based on the previous discussion, further complex compressors can be built by using basic compressors like [3:2] and [4:2] compressors. For example, a [6:2] compressor can be built using two [3:2] and one [4:2] compressors.
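The composition of larger compressors from smaller ones can be illustrated at the bit level. The sketch below builds a [4:2] compressor from two cascaded [3:2] compressors (full adders), which is one common construction and is assumed here for illustration; the function names are hypothetical.

```python
from itertools import product


def compressor_3_2(a, b, c):
    """[3:2] compressor (full adder): three bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_4_2(a, b, c, d, cin):
    """[4:2] compressor from two cascaded [3:2] compressors.

    Returns (sum, carry, cout); carry and cout both have weight 2.
    """
    s1, cout = compressor_3_2(a, b, c)      # cout does not depend on cin
    s, carry = compressor_3_2(s1, d, cin)
    return s, carry, cout


if __name__ == "__main__":
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        print(bits, "->", s, carry, cout)
```

Because cout is produced by the first stage alone, a carry entering on cin never ripples through a row of such cells, which keeps the delay of a compressor row independent of its width.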
-
Multiplier Power Reduction
The design of digital CMOS has focused on delay reduction and power dissipation. In multipliers, delay increases as the size of the multiplier grows in terms of bits, but it can vary depending on the implementation. Power is proportional to the amount of circuitry in the multiplier and the way that it is connected to perform the multiplication. Since the number of adder blocks is proportional to the square of the number of bits (n²), multipliers tend to be fairly large, power-consuming blocks.
Dynamic power consumption of digital CMOS circuits is expressed by Eq. (5.17). Static power consumption is neglected because it is relatively small: only one device is conducting at a time, so there is never a direct path between VDD and GND in steady state. There is therefore no need to calculate static power; only dynamic power is considered.
the number of nodes and alpha is the number of switching activities. An equivalent equation can be expressed as
In this equation, Ipeak is determined by the saturation current of the PMOS and NMOS transistors, which depends on their sizes, process technology, temperature, etc., and by the ratio between input and output slopes. When the load capacitance is small, power is dominated by the short-circuit current Isc. Isc is less than 10% of the total dynamic current under the condition of fast rise and fall times. Therefore the short-circuit current is neglected for convenience of calculation. Because the supply voltage and operating frequency are fixed once the application is specified, the power consumption is determined by the node capacitance and the transition activity (probability).
-
Results and Discussions
-
CELL-1 STRUCTURE
The 4X4 bit Array Multiplier has two different cells which are named as cell-1 and cell-2. The implementation of cell-1 consists of conventional 36 transistor full adder as shown in figure 5.1. The output wave form and the layout for the cell-1 are given in figures 5.2 and 5.3 respectively.
Figure 5.1 Schematic view of the designed cell-1 structure
Figure 5.2 Simulated input and output waveforms of the cell-1
Figure 5.3 Layout view of the cell 1
-
CELL-2 STRUCTURE
The second type of cell, named cell-2, consists of 10-transistor full adder cells and MTCMOS cells. The full adders used in cell-2 are designed using 10 transistors, and the design and the symbol for the corresponding design are presented in section 5.1. The schematic view of the cell-2 structure is shown in figure 5.4. The output waveform and the layout for cell-2 are given in figures 5.6 and 5.7 respectively.
Figure 5.4 Schematic view of the designed cell-2 structure
Figure 5.6 Simulated input and output waveforms of the cell-2
Figure 5.7 Layout view of the cell -2
-
DESIGN OF 4X4 ARRAY MULTIPLIER BLOCK
-
5.3.1 4X4 ARRAY MULTIPLIER USING CELL-1 AND CELL- 2.
Figure 5.8 Schematic view of the designed 4X4 Array Multiplier
Figure 5.9 Simulated input and output waveforms of 4X4 Array Multiplier
Figure 5.10 Layout view of the 4X4 Array Multiplier
5.4 DESIGN OF BRAUN MULTIPLIER BLOCK
5.4.1 BRAUN MULTIPLIER USING CELL-1 AND CELL-2
Figure 5.11 Schematic view of the designed Braun Multiplier
Figure 5.12 Simulated input and output waveforms of Braun Multiplier
5.5 DESIGN OF BAUGH WOOLEY MULTIPLIER BLOCK
5.5.1 BAUGH WOOLEY MULTIPLIER USING CELL-1 AND CELL-2.
Figure 5.13 Schematic view of the designed Baugh Wooley Multiplier
Figure 5.14 Simulated input and output waveforms of Baugh Wooley Multiplier
Conclusion
The 1-bit full adder is a very important component in digital signal processor (DSP) architectures and microprocessors. In this project, a new 10-transistor 1-bit full adder is proposed. The proposed design successfully embeds the buffering circuit, which helps in restoring the output voltage swing to satisfactory levels while keeping the transistor count at 10, the least reported so far. The cell is implemented in MICROWIND along with various existing 1-bit full adder designs. The study, which includes power, is carried out for 70 nm standard CMOS technology. The new 10-transistor design consumes the least power compared to the various other standard designs.
Power optimization can be done at different levels of the design, i.e. at system level, algorithm level, architecture level, logic level, circuit level, etc. Here a 1-bit full adder is designed for low power and minimum area. It can be called a high-performance design, as the reduced number of transistors in the circuit consumes less power. A 65-70% reduction in area, as well as in power, is observed.
Different multipliers are constructed for low power using this full adder, which is an important block of the design as it is used repeatedly to produce each and every partial product. More than 50% performance improvement is observed for every multiplier constructed.
When the circuit is in standby mode there is always a possibility of leakage current due to the reverse bias of the PN junctions. Even though different leakage reduction techniques are available, an MTCMOS approach is chosen to reduce the leakage by placing an MTCMOS cell as a header or footer. Here an NMOS sleep transistor, which is smaller than a PMOS one, is used as the MTCMOS cell. Leakage power is thus also reduced, lowering the power of the design still further.
Different multiplier designs are compared for different technologies, and it is observed that the overall power reduction is greater than in conventional designs.