 Open Access
 Total Downloads : 2085
 Authors : E. L. K. Priya, Prof. P. Lakshmi Sarojini, G. Rajesh Kumar
 Paper ID : IJERTV1IS8651
 Volume & Issue : Volume 01, Issue 08 (October 2012)
 Published (First Online): 29102012
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Area Efficient Fixed Width Modified Booth Multiplier
E. L. K. Priya1, Prof. P. Lakshmi Sarojini2, G. Rajesh Kumar3
Swarnandhra College of Engineering & Technology1,2, Vishnu Institute of Technology3
Abstract
This paper proposes the design of area efficient fixed width modified booth multipliers for lossy applications such as multimedia applications. Area efficiency can be obtained by omitting hardware circuit required to generate the lower half part of the partial product matrix of Booth multiplier which leads to truncation error. Accuracy can be improved by way of modifying the partial product matrix of Booth multiplication and subsequently deriving an effective error compensation function which is composed of basic gates.In addition a simple compensation circuit composed of basic gates, derived from sorting network is also proposed for further improvement of accuracy. Compared to the Post Truncated Multiplier (PTM), the proposed Fixed width Multiplier requires very less hardware.The generated partial products addition is implemented using Dadda tree for minimization ofthe delay and area. To reduce the area, the adder and Exclusive OR are implemented using 11 transistors and 6 transistors respectively.
Keywords: Error Composition circuit, fixedwidth multiplier, modified booth multiplier, Partial product matrix, Error compensation function, Post Truncated Multiplier

Introduction
Multipliers are key components of many high performance systems such as FIR filters, microprocessors, digital signal processors, etc. A systems performance is generally determined by the performance of the multiplier because the multiplier is generally the slowest clement in the system. Furthermore, it is generally the most area consuming. Hence, optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints so that improving speed results mostly in larger areas. In parallel multipliers, high performance can be achieved by using modified Booth encoding ([1][3]), which decreases the number
of partial products by a factor of 2 through multiplier recording. Fixed word size in lossy systems that allow small accuracy loss to output data, can be maintained by using n fixed width multipliers which result in only n most significant product bits. Complexity reduction in hardware and significant area and power savings can be achieved by removing the adder cells of standard multiplier for the computation of the n less significant bits of 2nbit output product. However, the fallout being high truncation error will be introduced into the system to such kind of directtruncated fixed width multiplier.
Variety of error compensation methods that add estimated compensation value to inputs of the reserved adder cells, which effectively arrests the truncation error. As it is universally known that error compensation value can be derived by the Adaptive/Constant scheme. Regardless of current input data value influence, the Constant Scheme ([4],[5]) arrives to constant error compensation value and same is used to carry inputs of the retained adder cells while performing multiplication operations.Less hardware results in a relatively large truncation error to the Constant Scheme.Adaptive scheme ([6][10]) has been designed such that high accuracy is achieved by adjusting the compensation value according to the input data at the cost of higher hardware complexity. However, most of the adaptive error compensations are developed only for Fixedwidth array multipliers & has less influence on reduction of truncation error for a fixed width modified booth multipliers directly.
Various error compensation approaches ([11], [12]), have been proposed toeffectively scale down the truncation errors of Fixedwidth modified Booth Multiplier at the cost of hardware complexity. Simple error compensation circuit results in better error performance and area with booth encoded outputs as inputs to generate the error compensation value. A systematic design methodology for area efficient fixed width modified booth multiplier through exploring the influence of various indices in a binary threshold was developed to reduce the product error can be seen in [13].
Many DSP applications and Multimedia, the output data has direct bearing to the accumulation of a series of products over a single multiplication operation. In
those cases, truncation errors results in large output error, which can be countered by performing additional compensation, which isagain dependent on different applications needing different compensation values.
Thus to mitigate such kind of customizations, to decrease the accumulated output error, fixedwidth multipliers with very minute error with simple errorcompensation circuit are extensively preferred to obtain more accurate output data.
Through this Paper simple error compensation circuit for fixedwidth modified Booth multiplier is proposed which reduces most of the hardware. In order to achieve desired outcome, slight modification of the partial product matrix of Booth multiplication to reduce the partial product bits in the truncated portion of DTFM is required. Thereby, we ascertain the correlation between the Booth encoded outputs and the truncated product error of DTFM is analysed to derive a simple yet effective compensation function, which can result in an approximation to the carry value generated by truncated portion of DTFM, to reduce the errors result in truncation and make the error distribution as symmetric and centralized as possible. Subsequently, developing of simple modified sorting network along with some adder cells as per theproposed error compensation function. Results derived from implementation and simulation depict that the proposed fixedwidth modified Booth multiplier occupies less area and achieves approximately same accuracy as PTMwhich is most accurate existing fixed width modified Booth multipliers in terms of the mean error and the meansquare error, all of this still by maintaining the approximate Hardware overhead .

Booth Multiplier Architecture
In Booth multiplier Multiplication operation when considered between multiplicand and the multiplier namely A & B respectively, representation of which is depicted as below:
(1)
(2)
Modified Booth encoding groups the multiplier bits into triplets, and are encodedas given in table I, then B can be expressed as:
Where b1 =0 and kj belongs to the set of values {2, – 1,0,1,2}.
Table 1. Modified Booth encoding table
Booth encoder i/p
Operation
Bo
oth
encoder
o/p
b2j+1b2jb2j1
njtjojzjcj
0
0
0
+0
0
0
0
1
0
0
0
1
+A
0
0
1
0
0
0
1
0
+A
0
0
1
0
0
0
1
1
+2A
0
1
0
0
0
1
0
0
2A
1
1
/td>
0
0
1
1
0
1
A
1
0
1
0
1
1
1
0
A
1
0
1
0
1
1
1
1
0
1
0
0
1
0
According to the Modified Booth encoding table the existing booth encoder and partial product generation circuit are shown in fig. 1.a and fig.1.b respectively.
Fig 1.a) Booth encoder circuit b)PP generation circuit
(3)
Fig 2. Partial product matrix of Booth multiplier
With relation to negation operation, each bit of multiplicand A is complemented and an extra binary value 1 is added to least significant bit pertaining to next partial product row. To implement correction bit cj addition of 1 is being used and thereby indicating the partial product row PPj as positive (cj=0) or negative (cj=1). As each partial product row is represented in twos complementation, the sign bit for each PPj must be properly extended upto the (2n1)th bit position. Many ways [14] are proposed to give a solution to eliminate the problem of sign extension bits as they affect the performance and power consumption of the parallel multiplier with large values of n. The partial product matrix of an 8 modified booth multiplier with sign extension elimination technique is illustrated in Fig2, where in pj,k and sj denote the kth product bit and the sign bit of partial product row PPj, respectively.

Area efficient Fixed Width Modified Booth multiplier
The partial product matrix is bifurcated into two segments namely MSP(Most Significant Part) & LSP(Least Significant Part), where LSP is further bifurcated into LSPminor and LSPmajor as shown in fig. 3 (except 1).A truncated multiplier is an nÃ— n multiplier with n bits output. As the partial products divided into two subsets LSP includes the n less significant columns of the partial product matrix, while the MSP includes the remaining columns the fullwidth multiplier output, P is given by(4)where S(MSP) and S(LSP) represent the weighted sum of the elements of MSP and LSP respectively.
P =S(MSP) + S(LSP) (4)
In the direct truncated multiplier the partial products of the LSP are discarded assuming that their contribution to the n most significant bits of the output is negligible. This solution is very advantageous in terms of hardware performances. The cells for the LSP matrix are not present and the final circuit halves the number of cells compared the fullwidth one. However, a straightforward analysis shows that the direct elimination of the partial products of the LSP causes a very big error bounded by n(2 1)LSB.
In the posttruncated modified Booth multiplier (PTM) the with the addition of an extra l(carry value by LSPminor) at the (n1)th bit position of partial product matrix as depicted in Fig 3 (including1)then outputs the highly significant n product bits. With this the most
accurate error compensation value Ãª is generated by PTM. Then final multiplication result P of A and B can be expressed as:
P=S(MSP)+S(LSPmajor)+S(LSPminor)+1 (5)
(in(n1)th position)
The main disadvantage of PTM is approximate hardware complexity and power consumption to the standard modified Booth Multiplier.
As Partial product bits are generated from the booth encoder outputs, exploration of the relation between outputs of the booth encoders and the carry propagated from LSPminor to LSPmajor is to be made. Then, to reduce the truncation error, an effective and simple error compensation function, whose inputs are the outputs of booth encoders, accordingly generates the approximate carry value. Finally, to reduce the area of fixed width modified booth multiplier, a simple & fast compensation circuit is derived, ensuring that truncation error is minimal.

Modified partial product matrix to improve the accuracy
Accuracy can be improved by decreasing the partial product bits in LSPminor which modify the partial product matrix of PTM. To obtain the new matrix first the LSB pn/21,0 of PPn/21 and cn/21are added in advance which produce n/21 as sum an as carry at (n2)th and (n1)th positions respectively. As the weight of extra 1 in PTM and are same, both can be added which outputs as sum & as carry propagated to the nth bit position. Then this carry is incorporated with sign extension bits of PP0 which modify to new partial product bits 210.
Fig 3.Partial product matrix of PTM.
The truth tables to generate n/21, and 210 are shown in table 2, 3 respectively and corresponding logic equations are given below.
(6)
= . . (7)
Fig 4.Error compensation function generation circuit.
The modified partial product matrix of 8 Booth multiplier is shown in fig.5.
(10)
(8)
(9)
Table 2.Truth table for n/11,
t n/21,0o n/21,0z n/21,0
cn/21,0
pn/21,0
n/21,0
0
0
1
0
0
0
0
0
1
0
0
a0
a0
0
0
1
0
1
0
a0
0
1
0
0
0
0
0
0
1
0
0
1
1
0
1
Table 3.Truth table for 210
S0
0
Cout0
S0
1
Cout1
0
2
1
0
1
0
1
1
0
0
0
1
1
0
1
1
0
1
0
1
0
0
0
0
0
0
0
1
1
0
1
1
0
0
0
0
1
1
According to logic equations (6 10) the circuit to generate 210& is shown in fig.4.
Fig 5.Proposed modified partial product matrix

Proposed low error Area efficient errorcompensation circuit
In the modified partial product matrix the new LSP, MSP, LSPminor&LSPmajor are denoted asLSP`, MSP`,LSP`minor&LSP`major respectively as shown in fig.

The partial product bits in LSP`minor are omitted in the fixed with multipliers results accuracy loss ,which can be compensated by a simple modified sorting network so that hardware required to generate partial products in LSP`minoris removed instead of which a simple compensation circuit composed of basic gates is used. Hence, developing a simple error compensation function whose output value approximates the most accurate error compensation value Ãª as follows:
Now the error compensation function Ãª can be expressed as:
Ãª=[0.5(major+)+CP(LSP`minor)] (11)
Where CP(LSPminor) represents approximate carry value propagated from LSP`minor to LSP`major. To explore the correlation between CP(LSPminor) and the outputs of the booth encoder, conider the encoder output zj. If it is one, the partial product bit of PPj in LSP`minor must be equal to zero. The different combination of zj for 0 j n/21, can be represented by a generalized index is defined as (12).
= n/21 2n/21 + n/22 2 n/22 +.+ 0 20 (12)
According to (12), the range of is from 0 to 2n/21. For a specific , all combinations that produce the
same can be utilized to calculate the average value of S(LSP`minor)denoted as S(). The normalized value of S() is S( )/2n1 can be substituted for CP(LSP`minor). Thus the compensation error function can be written as )+S( )/2n1] (13)
The correlation between and S()/2n1 for 8 bit fixed width modified booth multiplier is shown in Table 4.
Table 4.The correlation between and S()/n1 with n=8
S( )/2n1I()
S( )/2n1
I()
0 0
0
8 0.1665
0
1 0.5025
0
9 0.66930
2 0.5105
0
10 0.67710
3 0.99890
111.1655 1
4 0.54150
12 0.70810
50.98850
13 1.15531
60.9963
0
14 1.16311
71.50351
15 1.67031
In are always integers. Moreover Ãª is also an integer. Then the contribution of S( )/2n1 to Ãª can be approximated to an integer I() as follows:
I() = S()/2n1 (14)
I() can be related to summation jfor 0 j n/21
Letj(15)
The relation between &I() with n=8 is shown in Table 5.From Table.5 the relation between I()& can be expressed as:
I() = 1)/2 (16)
The final error compensation function can be obtained by, substituting (14) into (13), then (16) into that equation.
Ãª=[0.5(major+)+(1)/2] (17)
According to (17), by using compression tree structure and (1)/2can be compressed with the
partial product bits in MSP to generate the final fixed
width product.major, can be generatedfrom the partial product bits in LSP`major and MSP.
Table 5.The correlation and I() with n=8
I()
I()
I()
I()
0 0
4 1
8 1
12 2
1 1
5 2
9 2
13 3
2 1
6 2
10 2
14 3
3 2
7 3
11 3
15 4
For quick production of 1)/2 a simple and efficient circuit denoted as SCgenerator is to be designed.By (15),the inputs of SCgenerator are j for 0 j n/21 which generates moutputs as 1, 2,..m where m=(n/21)/2, which is depicted in Fig.6.
Fig.6.SCgenerator block diagram.
Summation of outputs of SCgenerator is equal to I(). The proposed SCgenerator is composed of a modified sorting network instead of adder cells based on following observation.Sorting jresults in largest bits gathering at least significant positions, then k= 2k for 1 k m, where k is the output of sorting network for 0 k n/21.To design the SCgenerator, consider the different sorting networks such as bitonic and oddeven merge sorting networks, which suits for hardware implementation. Out of these, oddeven merge sorting network requires less hardware thus a simplified odd even merge sorting network is adopted in implementing SCgenerator. The oddeven merge sorting networks for n=8 & 16are shown in Fig.7.
The odd even merge sorting network is modified to derive the desired SCgenerator which is composed of basic gates. The proposed SCgenerator is shown in fig.8.

(b)

Fig.7.Odd even merge sorting networks for n=8 &n=16 respectively.
The SCgenerator for different n can be constructed in the same manner. The outputs of SCgenerator are fed into LSP`major to produce final fixed width product.
Fig.8.Proposed SCgenerator.
The partial product matrix of the proposed fixed width booth multiplier for n=8 is depicted in Fig.9.In the final partial product matrix, the partial product bits in
LSPminor and carries thus generated are substituted by SCgenerator.
Fig.9.Final proposed partial product matrix


Final circuit &implementation
This paper mainly deals with drawing the layout of the proposed fixed width modified Booth multiplier for n=8. Fig.10 shows the block diagram of the final circuit.
Fig10.Block diagram of the proposed multiplier.
The above block diagram is implemented using Digital Schematic tools, and layout is implemented in Bottom up approach using IC station tools.The proposed multiplier is suffering from little accuracy loss compared to full widthmultiplier which is negligible formost of the lossyapplications such image processing etc.
The partial product generator is slightly modified to reduce the Mos transistors. In the partial product generator the multiplexer which require 14 Mos
transistors,is replaced with exclusive or gate which require 6 Mos transistors. The implementation of EXOR gate with transmission gate is shown in fig.11.
Fig.11.Exor gate implementation using Transmission gates
Fig.14 Layout of GDI based 11 transistor full adder


EXPERIMENTAL RESULTS
The comparison of hardware required for proposed multiplier with PTM is given in Table 8.
Table 8.Thecomparison of components requirement forPTM and proposed multiplier
Component
PTM
Proposed
multiplier
NAND
gate(3i/p)
1
NAND
gate(2i/p)
4
6
NOR gate(2i/p)
4
8
NOT gate
3
5
OR gate(3i/p)
32
16
AND gate(2i/p)
76
41
OR gate(2i/p)
1
EXOR
gate(2i/p)
12
32
Half Adder
14
3
Full Adder
31
17
Multiplexer
36
The internal circuit of p07 is shown in fig.12.Similarly P17, P27, P3can be designed.
Fig.12 Internal circuit of P07
Parallel multipliers are often implemented as either array multipliers or as multiplier trees. Hardware saving and less delay can be achieved by Dadda tree ([15],[16]) implementation. Here the adder is implemented using 11 MOS transistors which is shown in fig. 13 and its layout is shown in fig. 14.
Fig.13 GDI based 11 transistor full adder
The multiplier descriptions are mapped on a 0.5 mCMOS standardcell library using synthesis tool from Mentor Graphics.From the experimental analysis,area is significantly reducedcompared to PTM. Results of the simulation clearly show that the proposed multiplier architecture requires less hardware and performs approximately same as existing PTM.

Conclusion
Fixed width Modified Booth multiplier is designed and is verified funcionally. In this new approach lower half of the partial product matrix is omitted. This leads to
truncation error but the chip area reduces considerably. Here 18% reduction in aea is observed for 8 multiplication.In high speed applications and where accuracy is not much important this type of structure is well used. Because the lower half of the partial products are omitted hardware required to construct those partial products also not constructed. This leads to less area, Low power consumption and High speed.

References

G. O. Young, A. Inoue, R. Ohe, S. Kashiwakura, S. Mitarai, T. Tsuru,and T. Izawa, A 4.1ns compact 54 x 54 multiplier utilizing signselectBooth encoders, IEEE J. Solid State Circuits, vol. 32, no. 11, pp. 16761682, Nov. 1997.

K. Choi and M. Song, Design of a high performance 32 x32bit multiplier with a novel sign select Booth encoder, in Proc. IEEE Int.Symp. Circuits and Systems, 2001, vol. 2, pp. 701704.
[4]S. S. Kidambi, F. ElGuibaly, and A. Antoniou, Area efficient multipliers for digital signal processing applications, IEEE Trans. CircuitsSyst. II, Exp. Briefs, vol. 43, no. 2, pp. 9094, Feb. 1996.

M. J. Schulte and E. E. Swartzlander, Jr., Truncated multiplication with correction constant, in Proc. VLSI Signal Processing, VI, New York, 1993, pp. 388396.

L. D. Van and C. C. Yang, Generalized lowerror area efficient fixedwidthmultipliers, IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 8, pp. 16081619, Aug. 2005.

J.S.Wang, C.N. Kuo, and T.H. Yang, Lowpower fixed width array multipliers, in Proc. Int. Symp. Low Power Electron.Des., 2004, pp. 307312.

L. D. Van, S. S.Wang, and W. S. Feng, Design of the low error fixed width multiplier and its application, IEEE Trans. Circuits Syst. II, Exp.Briefs, vol. 47, no. 10, pp. 1112 1118, Oct. 2000.

J. M. Jou, S. R. Kuang, and R. D. Chen, Design of low error fixedwidthmultipliers for DSP applications, IEEE Trans. Circuits Syst.I,Exp. Briefs, vol. 46, no. 6, pp. 836842, June 1999.

Y.C. Liao, H.C.Chang, and C.W.Liu, Carry estimation for twos complement fixedwidth multipliers, in Proc. IEEE Workshop SignalProcessing Systems, 2006, pp. 345350.

S. J. Jou, M.H.Tsai, and Y.L.Tsao, Lowerror reducedwidth Booth multipliers for DSP applications, IEEE Trans. Circuits Syst.I, Fudam.Theory Appl., vol. 50, no. 11, pp. 14701474, Nov. 2003.

K.J. Cho, K.C.Lee, J.G.Chung, and K. K. Parhi, Design of lowerrorfixedwidth modified Booth multiplier,
IEEE Trans. Very LargeScaleIntegr. (VLSI) Syst., vol. 12, no. 5, pp. 522531, May 2004.

M.A. Song, L.D.Van, and S.Y.Kuo, Adaptive low error fixedwidthBooth multipliers, IEICE Trans. Fundamentals, vol. E90A, no. 6, pp. 11801187, Jun. 2007.

E. de Angel and E. E. Swartzlander, Jr., Low power parallel multipliers, in Workshop VLSI Signal Process., IX, 1996, pp. 199208.

L. Dadda, Some schemes for parallel multipliers, Alta Frequenza,vol. 34, pp. 349359, 1965.

K. C. Bickerstaff, E. E. Swartzlander, Jr., and M. J. Schulte, Analysis of column compression multipliers, in Proc. 15th IEEE Symp. ComputerArithmetic, 2001, pp. 33 39.
E.L.K.Priya:persuingM.Tech (VLSI System Design) in Swarnandhra college of Engineering & Technology, India. Has four years of teaching experience. Major working areas are Digital electronics, Embedded Systems and VLSI.
P.Lakshmi Sarojini received M.Tech degree in DSCE from JNTU,Hyderabad. She worked in DRDO as a Scientist and presently working asProfessor & Head, Dept of ECE, Swarnandhra College of Engineeringand Technology Narsapur. Her area of research includes IC design,Mixed signal VLSI and programming digital systems.
G. Rajesh Kumar: Asst. Professor in Vishnu institute of technology, India. Has five years of teaching experience. Major working areas are Digital electronics, Embedded Systems and VLSI. Presented research papers in five international conferences and three national conferences.Published three research papers in International Journals.Research Scholar in JNTUK.