 Open Access
 Authors : K. Lavanya, K. Navatha
 Paper ID : IJERTV3IS090839
 Volume & Issue : Volume 03, Issue 09 (September 2014)
Published (First Online): 25-09-2014
ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
IEEE 754 Floating Point Multiplier using Carry Save Adder and Modified Booth Multiplier
K. Lavanya
M.Tech (Embedded Systems), Electronics and Communication Engineering, SR Engineering College, Warangal, India

K. Navatha
Assistant Professor, Electronics and Communication Engineering, SR Engineering College, Warangal, India
Abstract: This paper describes the design and implementation of an IEEE 754 single precision floating point multiplier using a carry save adder (CSA) and the modified Booth algorithm (MBA). The designs are compared in terms of speed, area and power consumption. The carry save adder is used to increase speed, while the modified Booth multiplier reduces the number of partial products generated, which lowers power and area and speeds up the operation. The floating point design also detects exceptions and traps overflow and underflow. The algorithm is implemented in Verilog HDL, simulation results are observed using Xilinx ISE 8.1i, and the design is verified on a Spartan 3E.
Keywords: Carry save adder, floating point multiplier, modified Booth algorithm, Verilog HDL.

INTRODUCTION
The floating point unit (FPU) is one of the most common elements in digital applications such as digital signal processing, control units and image processing, because it executes arithmetic operations quickly. A floating point (FP) unit acts as a math coprocessor that carries out arithmetic operations such as addition and multiplication on floating point numbers. It is used in most computer processors and plays a very important role in high performance applications such as microprocessors, digital signal processing (DSP) [1] and FIR filters. A math coprocessor increases the speed and capability of a processor. In many fixed point applications, programmers have to put effort into scaling numbers properly to maintain precision and avoid overflow. Floating point arithmetic is used in such applications because it provides the dynamic range that fixed point operations lack.
Nowadays, in many general purpose computer architectures, one or more floating point units (FPUs) are integrated with the central processing unit (CPU). A floating point unit is an execution unit designed mainly to perform mathematical operations such as addition, subtraction and multiplication. It may also perform various transcendental functions such as trigonometric or exponential calculations, although in many advanced processors these operations are done through software library routines.
In this paper a single precision floating point multiplier unit is implemented both with and without pipelining. The multiplier and adder units are implemented using a modified Booth multiplier and a carry save adder (CSA) [7]. The carry save adder is one of the fastest adders used in digital circuits; it increases speed and reduces area, power and delay. The modified Booth multiplier increases speed and reduces the number of partial products generated, which reduces the complexity of multiplication. The pipelined floating point (FP) multiplier delivers a result on every clock cycle, which reduces power consumption and latency and increases speed.

IEEE 754 FLOATING POINT UNIT
2.1 IEEE 754 FLOATING POINT STANDARDS
IEEE 754 is a standard established by the IEEE and is the most widely used standard for floating point computation. The single precision floating point format is a computer number format that occupies 32 bits in computer memory and represents a wide range of values by using a floating radix point. In IEEE 754-2008, the 32-bit base-2 format [2][6] is referred to as single precision or binary32.
The standard defines four main parts:

Arithmetic formats: sets of binary and decimal floating point data, which consist of finite numbers (including signed zeros and subnormal numbers), infinities and special values called not a number (NaN).

Interchange formats: encodings (bit strings) that are used to exchange floating point data in a compact and efficient form.

Rounding rules: properties that must be satisfied when rounding numbers during arithmetic operations and conversions.

Exception handling: indications of exceptional conditions, such as overflow and underflow, that occur during operations.


IEEE 754 FLOATING POINT FORMAT
IEEE 754 is a technical standard followed by many hardware and software implementations. The single precision floating point format occupies 32 bits (4 bytes), with the most significant bit on the left. It has three basic components: sign, exponent and mantissa. The sign field is 1 bit wide, the exponent is 8 bits wide and the mantissa is 24 bits wide, of which 23 bits are explicitly stored [6] and 1 bit is implicit. The sign bit represents the sign of the floating point number (S = 0 for positive numbers, S = 1 for negative numbers).
| Bit 31 | Bits 30 to 23 | Bits 22 to 0 |
| --- | --- | --- |
| S | Exponent (E) | Fraction (F) |
| 1 bit | 8 bits | 23 bits |

Fig 3.1. 32-bit single precision floating point format
The number represented by a single precision word is [9]:

$$\text{Value} = (-1)^{S} \times 2^{E-127} \times 1.M \quad \text{(normalized, when } E > 0\text{)}$$

where S is the sign bit (0 for positive, 1 for negative), E is the stored (biased) exponent and e is the unbiased exponent, e = E − 127 (bias).

An 8-bit exponent can be interpreted as a signed or an unsigned integer. As a signed two's complement integer it ranges from −128 to 127, and as an unsigned integer it ranges from 0 to 255, which is the biased form adopted in IEEE 754 single precision. In this form, a stored exponent value of 127 represents an actual exponent of zero. The true mantissa has 23 fraction bits to the right of the binary point and an implicit leading bit with value 1 to the left of the binary point, unless the exponent field is stored as all zeros. Only the 23 fraction bits are visible in memory, even though the mantissa has 24 bits. If 0 < E < 255, so that the MSB of the significand is 1, the number is said to be a normalized number.
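The field layout and the value formula above can be checked with a short behavioural sketch in Python. This is only an illustration, not the authors' Verilog; the function names `decode_single`, `normalized_value` and `float_to_bits` are hypothetical helpers introduced here.

```python
import struct

BIAS = 127  # single precision exponent bias

def decode_single(bits):
    """Split a 32-bit pattern into (sign, biased exponent, fraction)."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

def normalized_value(bits):
    """Evaluate (-1)^S * 2^(E-127) * 1.M for a normalized number (0 < E < 255)."""
    sign, exponent, fraction = decode_single(bits)
    if not 0 < exponent < 255:
        raise ValueError("not a normalized number")
    significand = 1.0 + fraction / 2**23  # restore the implicit leading 1
    return (-1.0) ** sign * 2.0 ** (exponent - BIAS) * significand

def float_to_bits(x):
    """Bit pattern of x in single precision (helper used only for the demo)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

bits = float_to_bits(-6.5)     # -6.5 = -1.625 * 2^2
print(decode_single(bits))     # (1, 129, 5242880)
print(normalized_value(bits))  # -6.5
```

The stored exponent 129 corresponds to an unbiased exponent of 2, and the fraction field 5242880 equals 0.625 × 2^23, so the implicit leading 1 gives a significand of 1.625.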

FLOATING POINT MULTIPLICATION ALGORITHM
Floating point multiplication is used in many digital circuits and signal processing computations, and its design must handle overflow and underflow [1][5][9]. A normalized floating point product has the form $Z = X \times Y = (-1)^{S} \times 2^{(E - \text{Bias})} \times (1.M)$.
Multiplying two floating point numbers involves the following steps [8]:

If one or both operands are equal to zero, the result is zero; otherwise:

Multiply the significands (mantissas) of the two numbers (X*Y).

Add the exponents of the two numbers and subtract the bias from the sum (E1 + E2 − Bias).

Obtain the sign bit of the result by XORing the sign bits of the two operands (X XOR Y).

Normalize the result if needed, by shifting the mantissa to the right and incrementing the result exponent.

Round the result to the allowed number of mantissa bits.

Check for overflow and underflow:

Overflow occurs when the result exponent is greater than the maximum exponent.

Underflow occurs when the result exponent is smaller than the minimum exponent.
Fig 3.2. Floating point multiplication flow chart.
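The multiplication steps can be sketched as a behavioural model in Python. It is a minimal sketch under stated assumptions rather than the paper's implementation: rounding is done by truncation, subnormal, infinity and NaN inputs are not modelled, and the names `fp32_multiply`, `to_bits` and `to_float` are hypothetical.

```python
import struct

BIAS = 127  # single precision exponent bias

def to_bits(x):
    return struct.unpack(">I", struct.pack(">f", x))[0]

def to_float(bits):
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def fp32_multiply(x_bits, y_bits):
    """Multiply two single precision bit patterns.

    Returns (result_bits, status), status being 'ok', 'overflow' or
    'underflow'. Assumptions: truncation rounding; subnormals,
    infinities and NaNs are not modelled.
    """
    sign = (x_bits >> 31) ^ (y_bits >> 31)        # sign bit: XOR of operand signs
    # Zero check: a zero operand gives a (signed) zero result
    if (x_bits & 0x7FFFFFFF) == 0 or (y_bits & 0x7FFFFFFF) == 0:
        return sign << 31, "ok"
    ex, ey = (x_bits >> 23) & 0xFF, (y_bits >> 23) & 0xFF
    mx = (x_bits & 0x7FFFFF) | (1 << 23)          # 24-bit significand 1.M
    my = (y_bits & 0x7FFFFF) | (1 << 23)
    product = mx * my                             # 48 bits, value in [1, 4)
    exponent = ex + ey - BIAS                     # E1 + E2 - Bias
    if product >> 47:                             # product >= 2: normalize
        product >>= 1
        exponent += 1
    fraction = (product >> 23) & 0x7FFFFF         # truncate to 23 fraction bits
    if exponent >= 255:                           # overflow: return infinity
        return (sign << 31) | (0xFF << 23), "overflow"
    if exponent <= 0:                             # underflow: flush to zero
        return sign << 31, "underflow"
    return (sign << 31) | (exponent << 23) | fraction, "ok"

bits, status = fp32_multiply(to_bits(1.5), to_bits(-2.5))
print(to_float(bits), status)  # -3.75 ok
```

Because the two significands lie in [1, 2), their product lies in [1, 4), so at most one right shift is needed to normalize the result.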
3.1.1. Radix-4 Modified Booth Encoder
The Booth algorithm is an important algorithm for signed number multiplication that treats positive and negative numbers uniformly [3]. Its disadvantages can be overcome by the modified Booth encoding technique proposed by O. L. MacSorley. Modified Booth encoding reduces the number of partial products by half, i.e. for n multiplier bits we get n/2+1 partial products [4]. It is about twice as fast as the normal Booth multiplier, and it is an efficient way to reduce partial products because each partial product covers two consecutive multiplier bits and signed operands are handled directly. The Booth encoded operand is called the multiplier and the other operand is the multiplicand. The technique examines 3 bits at a time using overlapping groups. The groups are taken starting from the LSB; the first group uses only two multiplier bits, and its third (rightmost) bit is assumed to be zero [3]. The functional operation of radix-4 modified Booth encoding is shown in Table 1.
The radix-4 Booth algorithm is as follows [3]:

Extend the sign bit by one position if necessary to ensure that n is even.

Append a zero to the right of the LSB of the multiplier.

According to the value of each group of multiplier bits, each partial product is 0, +y, −y, +2y or −2y, where y is the multiplicand.
Table 1: Radix-4 Modified Booth Encoder
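The recoding can be illustrated with a small Python sketch. It uses the standard radix-4 Booth digit assignment, digit = −2·b(i+1) + b(i) + b(i−1), which is assumed here rather than copied from the authors' table; the names `BOOTH_DIGIT`, `booth_digits` and `booth_multiply` are hypothetical.

```python
# Standard radix-4 Booth recoding: (b[i+1], b[i], b[i-1]) -> digit in {-2..+2}
BOOTH_DIGIT = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}

def booth_digits(multiplier, n):
    """Recode an n-bit two's complement multiplier (n even) into n/2 digits.

    Digits are returned least significant group first.
    """
    if n % 2:
        raise ValueError("n must be even; sign-extend by one bit first")
    bits = (multiplier & ((1 << n) - 1)) << 1    # append a zero right of the LSB
    digits = []
    for i in range(0, n, 2):                     # overlapping 3-bit groups
        group = (bits >> i) & 0b111
        key = ((group >> 2) & 1, (group >> 1) & 1, group & 1)
        digits.append(BOOTH_DIGIT[key])
    return digits

def booth_multiply(y, multiplier, n):
    """Sum the partial products d * y * 4^k, each d*y in {0, +y, -y, +2y, -2y}."""
    return sum(d * y * 4**k for k, d in enumerate(booth_digits(multiplier, n)))

print(booth_digits(-7, 4))        # [1, -2]  since 1 + (-2)*4 = -7
print(booth_multiply(5, -7, 4))   # -35
print(len(booth_digits(0xFFFFFF, 26)))  # 13
```

An unsigned operand such as a 24-bit significand can be handled by zero-extending it to 26 bits so that it is read as a non-negative two's complement value; this gives 13 digits, which matches the n/2+1 count for n = 24.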



Carry Save Adder
Carry save adder (CSA) design [7] is used in high speed multi-operand adders. A carry save adder is built from standalone full adders and adds a number of partial products without any carry chain. It is mainly used to sum the partial products generated by integer multipliers, and it reduces the delay of that summation. An N-bit carry save adder contains N disjoint full adders [3], each of which produces a single sum bit and a single carry bit that depend only on its own inputs. The carry save adder is also known as a (3:2) counter or compressor: it takes 3 inputs and produces 2 outputs without any carry chain. The two outputs must finally be added by a carry-propagating adder. In this design a ripple carry adder, which is a cascade of full adders, is used as that sub-block of the carry save adder.
Fig 3.3. Carry save adder
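The 3:2 compression can be sketched bitwise in Python for non-negative integers. This is an illustrative model only; `carry_save_add` and `add_many` are hypothetical names, and Python's built-in addition stands in for the final carry-propagating adder.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: returns (sum_vector, carry_vector) with a + b + c == sum + carry.

    Every bit position is an independent full adder, so no carry
    propagates between positions inside this step.
    """
    sum_vec = a ^ b ^ c                             # per-bit sum
    carry_vec = ((a & b) | (b & c) | (a & c)) << 1  # per-bit carry, weighted x2
    return sum_vec, carry_vec

def add_many(operands):
    """Reduce operands with CSA steps, then do one carry-propagating addition."""
    operands = list(operands)
    while len(operands) > 2:
        s, c = carry_save_add(*operands[:3])
        operands = operands[3:] + [s, c]            # three operands become two
    return sum(operands)                            # final carry-propagate add

print(carry_save_add(5, 3, 6))      # (0, 14)
print(add_many([13, 7, 9, 22, 5]))  # 56
```

Each CSA step replaces three operands by two, so only the last addition needs a carry chain.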

Ripple Carry Adder
Another well known adder is the ripple carry adder (RCA), which is composed of cascaded full adders, as shown in Fig 3.4. The full adder blocks are arranged in series, and the carry out of one stage is fed directly to the carry in of the next stage. An n-bit parallel adder requires n full adders. It is used to obtain the final sum and carry output by adding the partial products. At each stage, the carry input $C_i$ of stage $i$ is the carry output of the previous stage.
3.3. Floating point multiplier using pipelining
The pipelining technique allows a circuit to operate at higher clock rates by dividing a large task into smaller, non-overlapping subtasks, which also reduces the complexity of each stage. A pipelined multiplier [10] is suitable for computation-intensive arithmetic such as digital signal processing. The pipelined floating point multiplier architecture is mainly used to reduce delay. A pipelined architecture provides a fast response but consumes more area. Pipelining produces a flexible datapath that is easier to modify than a fixed datapath and reduces maintenance time; it also reduces the clock period and increases latency [10]. The pipelined floating point multiplier in this work uses a 26-bit modified Booth multiplier and is divided into a 3-stage pipeline. Stage one is the check zero module, stage two checks the sign bits, and stage three is the normalizer, which produces the final result.
Multiplying two operands in the pipelined multiplier involves the following steps:

Check zero: the operands are checked and zero_flag is set to zero or one. If it is zero, zero is propagated through all stages and the resulting output is zero. If zero_flag is one, the inputs are unpacked according to the IEEE 754 format and passed to the check sign, add exponent and multiply mantissa modules.

Check sign: determines the sign of the result from the sign bits of the operands, i.e. if both operands have the same sign bit the result is positive, otherwise it is negative.

Add exponent: checks whether the module is active (if zero_flag is set the module is active; otherwise zero is passed to the next stage and zero_flag is set to 0).

Multiply mantissa: checks zero_flag. If a zero is flagged, no calculation is done. If the mantissa flag is 0, the multiply operation is carried out; a value of 1 indicates that execution is done.

Normalizer: checks for overflow and underflow.
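The effect of the three pipeline stages on timing can be sketched with a cycle-by-cycle Python model. It is a rough behavioural sketch, not the authors' Verilog: the assignment of the exponent addition and mantissa multiplication to stage two is an assumption, the flag `is_zero` (true when an operand is zero) is a hypothetical name whose polarity differs from the zero_flag described above, rounding is by truncation, and all function names are illustrative.

```python
import struct

BIAS = 127  # single precision exponent bias

def to_bits(x):
    return struct.unpack(">I", struct.pack(">f", x))[0]

def to_float(bits):
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def stage1_check_zero(x, y):
    """Stage 1: flag the operation if either operand is zero."""
    is_zero = (x & 0x7FFFFFFF) == 0 or (y & 0x7FFFFFFF) == 0
    return {"x": x, "y": y, "is_zero": is_zero}

def stage2_sign_exp_mant(r):
    """Stage 2 (assumed split): sign XOR, exponent add, significand multiply."""
    x, y = r["x"], r["y"]
    sign = (x >> 31) ^ (y >> 31)
    if r["is_zero"]:
        return {"sign": sign, "is_zero": True}      # skip the calculation
    exp = ((x >> 23) & 0xFF) + ((y >> 23) & 0xFF) - BIAS
    prod = ((x & 0x7FFFFF) | (1 << 23)) * ((y & 0x7FFFFF) | (1 << 23))
    return {"sign": sign, "exp": exp, "prod": prod, "is_zero": False}

def stage3_normalize(r):
    """Stage 3: normalize, check overflow/underflow, pack the result."""
    sign = r["sign"] << 31
    if r["is_zero"]:
        return sign
    exp, prod = r["exp"], r["prod"]
    if prod >> 47:                  # product in [2, 4): shift right once
        prod >>= 1
        exp += 1
    if exp >= 255:                  # overflow: infinity
        return sign | (0xFF << 23)
    if exp <= 0:                    # underflow: flush to zero
        return sign
    return sign | (exp << 23) | ((prod >> 23) & 0x7FFFFF)  # truncation

def run_pipeline(pairs):
    """Clock operand pairs through the stages; one output entry per cycle."""
    reg1 = reg2 = None              # pipeline registers (None = bubble)
    outputs = []
    for item in list(pairs) + [None, None]:   # two extra cycles to drain
        outputs.append(stage3_normalize(reg2) if reg2 is not None else None)
        reg2 = stage2_sign_exp_mant(reg1) if reg1 is not None else None
        reg1 = stage1_check_zero(*item) if item is not None else None
    return outputs

pairs = [(to_bits(1.5), to_bits(2.0)), (to_bits(0.0), to_bits(3.0)),
         (to_bits(-1.25), to_bits(4.0))]
out = run_pipeline(pairs)
print([None if b is None else to_float(b) for b in out])
# [None, None, 3.0, 0.0, -5.0]
```

In this model the first result appears in the third clock cycle, and after that one result is produced per cycle, which is the usual trade-off of pipelining: a shorter clock period and higher throughput at the cost of a latency of several cycles.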
Because each carry bit ripples to the next full adder stage [7], this adder is called a ripple carry adder (RCA).
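Logic equations:
$G_i = A_i B_i$ (carry generation function)
$P_i = A_i \oplus B_i$ (carry propagation function)
$S_i = P_i \oplus C_i = A_i \oplus B_i \oplus C_i$
$C_{i+1} = G_i + P_i C_i = A_i B_i + (A_i \oplus B_i) C_i$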
Fig 3.4. Ripple carry adder
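A bit-level Python sketch of the ripple carry adder follows. It assumes the usual full adder definitions, generate g = a AND b and propagate p = a XOR b, with sum s = p XOR c and next carry g OR (p AND c); the function name `ripple_carry_add` is hypothetical.

```python
def ripple_carry_add(a, b, n, carry_in=0):
    """Add two n-bit unsigned integers one full adder stage at a time.

    Returns (sum, carry_out). The carry out of each stage feeds the
    carry in of the next stage.
    """
    carry = carry_in
    total = 0
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        g_i = a_i & b_i               # carry generate
        p_i = a_i ^ b_i               # carry propagate
        s_i = p_i ^ carry             # sum bit of stage i
        carry = g_i | (p_i & carry)   # carry into stage i + 1
        total |= s_i << i
    return total, carry

print(ripple_carry_add(0b1011, 0b0110, 4))  # (1, 1): 11 + 6 = 17 = 16 + 1
```

The loop makes the serial dependency explicit: stage i cannot produce its outputs until the carry from stage i − 1 is known, which is why the delay of this adder grows with n.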
Fig 3.5. Stage Pipeline Multiplier



SIMULATION RESULTS
The designs are described in Verilog HDL, simulated using Xilinx ISE 8.1i and implemented on a Spartan 3E. The synthesis and simulation results are shown below.
TABLE 2. SYNTHESIS REPORT FOR CARRY SAVE ADDER

| Types of Parameters | Used and Available |
| --- | --- |
| Number of slices | 10 out of 960 |
| Number of 4 input LUTs | 17 out of 1920 |
| Number of bonded IOs | 25 out of 66 |
| XORs | 12 |
| IOs | 26 |
| Delay | 13.121 ns |
TABLE 3. SYNTHESIS REPORT FOR RADIX-4 MODIFIED BOOTH MULTIPLIER

| Types of Parameters | Used |
| --- | --- |
| Number of slices | 1244 |
| Number of 4 input LUTs | 2332 |
| Bonded IOs | 105 |
| Adders | 53 |
| Registers | 52 |
| Multiplexers | 13 |
| Flip-flops | 52 |
| IO buffers | 104 |
| Offset | 107.637 ns |
| Clock buffer | 1 |
TABLE 4. SYNTHESIS REPORT FOR FLOATING POINT WITH PIPELINING AND WITHOUT PIPELINING

| Types of parameters | Floating point without pipeline | Floating point with pipeline |
| --- | --- | --- |
| Number of slices | 1921 | 1099 |
| Number of 4 input LUTs | 3629 | 2033 |
| Number of bonded IOs | 99 | 99 |
| IOs | 99 | 99 |
| IO buffers | 98 | 98 |
| Adders | 53 | 53 |
| Flip-flops | 193 | 193 |
| Registers | 193 | 199 |
| Comparators | 4 | 4 |
| Multiplexers | 13 | 13 |
| XORs | 25 | 25 |
| Clock period | 163.749 ns | 6.625 ns |
| Clock frequency | 6.107 MHz | 159.899 MHz |
| Delay | 81.874 | 6.254 ns |
Fig 4.1. RTL schematic of floating point multiplier without pipelining
Fig 4.2. Simulation results for floating point multiplier without pipelining
Fig 4.3. RTL schematic of floating point pipelined multiplier
Fig 4.4. Simulation results for floating point multiplier with pipelining

CONCLUSION
In this paper we implemented a single precision floating point multiplier with and without pipelining using Xilinx ISE 8.1i on a Spartan 3E, and observed that the pipelined floating point multiplier achieves higher speed. It has less delay and a shorter clock period than the floating point multiplier without pipelining; pipelining also reduces complexity and offers high throughput with low latency. The design can be implemented in any type of processor and is also suitable for system on chip (SoC) prototyping.
REFERENCES

[1] P. Belanović and M. Leeser, A Library of Parameterized Floating Point Modules and Their Use, in Proceedings, International Conference on Field Programmable Logic and Applications, Montpellier, France, Aug. 2002.

[2] IEEE 754-2008, IEEE Standard for Floating-Point Arithmetic, 2008.

[3] Radix-4 Encoder Booth Multiplier.

[4] P. Asadee, A High-Speed Multiplication Algorithm Using Modified Partial Product Reduction Tree, International Journal of Electrical and Electronics Engineering, 4:4, (2010).

[5] D. Sangwan and M. K. Yadav, Design and Implementation of Adder/Subtractor and Multiplication Units for Floating-Point Arithmetic, International Journal of Electronics Engineering, (2010), pp. 197-203.

[6] An ANSI/IEEE Standard for Radix-Independent Floating-Point Arithmetic, Technical Committee on Microprocessors of the IEEE Computer Society, October 1987.

[7] P. Sreenivasulu, Dr. K. Srinivasa Rao, Malla Reddy and Dr. A. Vinay Babu, Energy and Area Efficient Carry Select Adder on a Reconfigurable Hardware, International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, Vol. 2, Issue 2, Mar-Apr 2012, pp. 436-440.

[8] IEEE Floating Point Representation of Real Number, Fundamentals of Computer Science. Link: http://www.math.grin.edu/~stone/courses/fundamentals/ieee reals.html

[9] L. Louca, T. A. Cook and W. H. Johnson, Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs, Proceedings of 83rd IEEE Symposium on FPGAs for Custom Computing Machines (FCCM96), (1996), pp. 107-116.

[10] Amine Bermark, Guixuan Liang and Qingzheng, A High-speed 32-bit Signed/Unsigned Pipelined Multiplier, Department of Electronics and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong, China.