Efficient High Speed Computing Low Power Multiplier Architecture using Vedic Mathematics For Digital Signal Processing Applications

Digital signal processing (DSP) applications typically involve complicated mathematical operations that must be performed repetitively on data samples with low delay and low power consumption. Multiplication-based operations such as multiply and accumulate (MAC) and inner products are among the Computationally Aggressive Arithmetic Functions (CAAF) frequently found in today’s VLSI systems and in DSP applications such as the FFT and Finite Impulse Response filters. This manuscript proposes the design of a high-speed 4/8-bit Vedic multiplier based on the techniques of ancient Vedic mathematics. The Vedic method of multiplication differs markedly from the conventional method of multiplication. Verilog HDL is employed for designing the structural modules, while the Xilinx ISE tool is used for validating the designs.
Keywords: Vedic multiplier, fast binary multiplication, HDL simulation, synthesis.


I. INTRODUCTION
Multipliers are the primary arithmetic units that determine execution time in many digital signal/image processing, networking and communication applications. For several decades, multiplier design has revolved around parameters such as high speed, low power consumption and regularity of layout (and hence smaller area), or a combination of these with trade-offs among them, paving the way for compact digital systems. Driven by these parameters, technological advances are pushing researchers to devise novel multiplication algorithms based on serial, parallel and hybrid multiplication techniques. Although several types of multiplication algorithms exist, we have chosen to implement a digital multiplier suitable for digital signal and image processing applications using Vedic mathematics, because it offers fast and accurate calculation.
Vedic mathematics is the name given, in the twentieth century, to the ancient Indian system of mathematics. The system is based entirely on sixteen Sutras (word-formulae) and thirteen sub-sutras [8] [9]. Vedic mathematics simplifies the complex calculations of conventional mathematics. Of the sixteen sutras, only two are used for multiplication: the Urdhva Tiryakbhyam sutra and the Nikhilam sutra. Urdhva Tiryakbhyam means "vertically and crosswise", whereas Nikhilam means "all from 9 and the last from 10". Vedic techniques for digital multiplication are quite different from conventional multiplication techniques.
S. Akhter et al. [1] implemented a fast N×N multiplier based on Vedic mathematics in VHDL, making effective use of the structural modelling style. B. D. Kumar et al. [2] highlighted the efficiency of Vedic multiplication, in which design complexity is drastically reduced and modularity greatly increased for inputs with a large number of bits. H. Goyal et al. [3] implemented a fast multiplier based on Vedic mathematics using a modified square root carry select adder, which reduces the carry propagation delay. Since adders are the main components of multipliers, using fast adders enhances the overall performance of Vedic multipliers. Various adder topologies, such as the Ripple Carry Adder (RCA), Carry Select Adder (CSA) and Square Root Carry Select Adder (SQRT-CSA), are compared in terms of area, delay and power in [4]. Squaring plays a vital role in high-speed applications such as animation, digital signal processing and image processing, where speed is a crucial performance characteristic; the Yavadunam sutra and bit reduction techniques are employed for squaring binary numbers in a high-speed VLSI architecture [5].
Multipliers employed in binary cubing architectures are designed using the Anurupyena sutra of Vedic mathematics; the combinational path delay is reduced and a smaller area in terms of slices is achieved, as shown by a comparison between the conventional and Vedic methods [6]. Vedic square and cube circuits are faster than conventional ones because of their regular and parallel structure. Because of their reduced hardware complexity and delay, the proposed Vedic multiplier, square and cube circuits can replace the traditional square and cube circuits in Arithmetic and Logic Unit designs [7]. Squaring occupies much of the computing time, so it becomes important to increase the speed at which numbers are squared.
The remaining sections of the manuscript are organised as follows. Section II deals with different multiplication algorithms and Section III with the proposed Vedic multiplication architecture. Experimental evaluations and analysis are presented in Section IV, and Section V concludes the manuscript.

II. DIFFERENT MULTIPLICATION ALGORITHMS
Multipliers play a prominent role in digital signal/image processing, wireless communications, networking, embedded systems and many other application areas. The basic arithmetic operations, addition and multiplication, are present in the majority of electronic circuits. Statistics indicate that more than 70% of the instructions in microprocessors (and/or microcontrollers) and DSP algorithms involve multiplication. The use of multiplier circuits always compels designers to trade added complexity and increased silicon area against higher speed and lower power consumption. There are numerous multipliers, each with its own pros and cons, including the Grid Method, Binary Multiplier, Shift and Add Multiplier, Dadda Multiplier, Array Multiplier and Sequential Multiplier. A few of these multiplication algorithms are discussed in this section.
A. Grid Multiplication Algorithm
The grid multiplication algorithm is an introductory method for multiple-digit multiplication; it has been a standard part of the mathematics curriculum in England since the late 1990s. Both factors are partitioned into their hundreds, tens and units parts. The partial products are then calculated explicitly in a multiplication-only stage, and these contributions are totalled in a separate addition stage to generate the final product. For example, the calculation of 34 × 13 using the grid is shown in Fig. 1.

Similarly, if the multiplier bit is '1', the multiplicand bits are added to the accumulator and the resulting partial product is shifted right by one bit. After all multiplier bits have been tested, the product is held in the accumulator. The accumulator is 2N (M+N) bits in size. This circuit carries several advantages in asynchronous circuits. The architecture for shift and add multiplication is shown in Figure 3 below.

F. Sequential Multiplication Algorithm
Sequential multipliers are mostly preferred because of their low area requirements. A sequential multiplier multiplies two binary numbers (an n-bit multiplicand X and an m-bit multiplier Y) using a single n-bit adder: the circuit processes one partial product at a time and repeats this m times. Figure 6 below shows the partial product generation and addition in a sequential multiplier. The multiplication process is divided into sequential steps such that, in each step, the generated partial product is added to an accumulated partial sum. The partial sum is then shifted to align it with the partial product of the next step.
Therefore, every step in a sequential multiplier comprises three operations: generating the partial product, adding it to the accumulated partial sum, and shifting the partial sum.
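The three operations per step can be illustrated with a short behavioural model. This is only a sketch in Python, not the hardware description used in this work; the function name `sequential_multiply` and the choice to add each partial product at bit position m before shifting right are assumptions made for illustration.

```python
def sequential_multiply(x: int, y: int, n: int, m: int) -> int:
    """Multiply an n-bit multiplicand x by an m-bit multiplier y,
    handling one partial product per step (behavioural sketch)."""
    assert 0 <= x < (1 << n) and 0 <= y < (1 << m)
    acc = 0  # accumulated partial sum, at most n + m bits wide
    for step in range(m):
        # 1) generate the partial product for the current multiplier bit
        partial = x if (y >> step) & 1 else 0
        # 2) add it to the upper part of the accumulated partial sum
        acc += partial << m
        # 3) shift the partial sum right by one bit to align it for the next step
        acc >>= 1
    return acc


if __name__ == "__main__":
    print(sequential_multiply(13, 11, n=4, m=4))
```

After m steps the partial product generated for multiplier bit i has been shifted right m - i times, so it ends up weighted by 2^i, which is why the accumulator holds the full product.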

III. PROPOSED VEDIC MULTIPLICATION ALGORITHM
The word "Vedic" is derived from the word "Veda". The architectural design of the proposed Vedic multiplier is shown in Figure 7 above. For 4-bit multiplication, the input bits are divided into two equal parts, and the vertical and crosswise product computations are performed. The inputs are A = A3A2A1A0 and B = B3B2B1B0, while the output bits are represented by S = S1S2S3S4S5S6S7S8. The proposed architecture comprises the following structural modules:

A) Four 2-bit binary multipliers
B) Two 4-bit parallel adders
C) One 2-bit parallel adder
D) One Half Adder

Detailed operations relevant to structural details of above modules are elaborated below in a sequential manner.
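The way four 2-bit products combine into an 8-bit result can be sketched behaviourally as follows. This Python model is illustrative only: the function names are hypothetical, and the word-level additions stand in for the parallel adders and half adder listed above, whose exact wiring (Figure 7) is not reproduced here.

```python
def vedic_2x2(a: int, b: int) -> int:
    """2-bit by 2-bit product using the vertical and crosswise steps."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    vertical_low = a0 & b0               # weight 1
    crosswise = (a1 & b0) + (a0 & b1)    # weight 2
    vertical_high = a1 & b1              # weight 4
    return vertical_low + (crosswise << 1) + (vertical_high << 2)


def vedic_4x4(a: int, b: int) -> int:
    """4-bit by 4-bit product built from four 2-bit products."""
    assert 0 <= a < 16 and 0 <= b < 16
    a_lo, a_hi = a & 0b11, a >> 2        # split A into two 2-bit halves
    b_lo, b_hi = b & 0b11, b >> 2        # split B into two 2-bit halves
    q0 = vedic_2x2(a_lo, b_lo)           # vertical product of the low halves
    q1 = vedic_2x2(a_hi, b_lo)           # crosswise product
    q2 = vedic_2x2(a_lo, b_hi)           # crosswise product
    q3 = vedic_2x2(a_hi, b_hi)           # vertical product of the high halves
    # Word-level sums stand in for the adder stages of the architecture.
    return q0 + ((q1 + q2) << 2) + (q3 << 4)


if __name__ == "__main__":
    print(vedic_4x4(13, 11))
```

The same pattern applies one level up: an 8-bit multiplier can be composed from four 4-bit multipliers, with the shifts doubled.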

A) TWO BIT BINARY MULTIPLIER:
The 2-bit binary multiplier multiplies two 2-bit binary numbers and produces their product. The method is based on calculating the partial products, shifting them and adding them together. The logic diagram of the 2-bit binary multiplier (depicted in Figure 8 below) contains two Half Adders and AND gates.
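A gate-level model of such a multiplier, written here in Python purely for illustration, might look as follows. The signal names (`p0` to `p3`, `c1`) and the function names are assumptions; the structure follows the common arrangement of AND gates feeding two half adders, which may differ in detail from Figure 8.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def multiply_2bit(a1: int, a0: int, b1: int, b0: int) -> tuple[int, int, int, int]:
    """Multiply A = a1a0 by B = b1b0; return product bits (p3, p2, p1, p0)."""
    p0 = a0 & b0                              # lowest partial product
    p1, c1 = half_adder(a1 & b0, a0 & b1)     # sum of the two cross terms
    p2, p3 = half_adder(a1 & b1, c1)          # highest term plus the carry
    return p3, p2, p1, p0


if __name__ == "__main__":
    print(multiply_2bit(1, 1, 1, 0))  # A = 3, B = 2
```

Because the largest product is 3 × 3 = 9, four output bits are sufficient.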

B) 4-BIT PARALLEL ADDER:
A full adder takes three inputs, two operand bits and an input carry, and adds them. A parallel adder is a digital circuit formed by cascading several full adders: an n-bit adder formed by cascading n full adders (FA1 to FAn) adds two n-bit binary numbers. A ripple-carry adder is a simple form of parallel adder in which the carry-out of each full adder is connected to the carry-in of the next full adder. Hence the total delay of the adder is the time it takes for a carry to ripple through all the full adders. The block diagram of a 4-bit parallel adder is shown in Figure 9 below.

The logic diagram of the 2-bit parallel adder is shown in Figure 11 below.
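The cascading of full adders can be sketched in Python as below. This is a bit-level illustration rather than the Verilog modules of this work; `full_adder` and `ripple_carry_add` are hypothetical names, and the width parameter `n` covers both the 4-bit and 2-bit cases.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two operand bits and an input carry."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x: int, y: int, n: int = 4, cin: int = 0) -> tuple[int, int]:
    """Add two n-bit numbers with n cascaded full adders.

    Returns (n-bit sum, final carry-out)."""
    carry = cin
    total = 0
    for i in range(n):
        # The carry-out of stage i becomes the carry-in of stage i + 1.
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


if __name__ == "__main__":
    print(ripple_carry_add(9, 9, n=4))
```

The loop makes the delay argument visible: stage i cannot produce its final sum bit until the carry from stage i - 1 is available.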

D) HALF ADDER:
A half adder is constructed using an EX-OR gate and an AND gate; it adds two bits and generates a sum bit (S) and a carry bit (C) as outputs. Although the half adder is a simple circuit, it has the major disadvantage that it adds only two input bits (A and B) and cannot take an incoming carry into account. The binary addition process is therefore incomplete, which is why it is called a half adder. The logic diagram of a half adder circuit is shown in Figure 12 below.

V. CONCLUSION
For multiplier circuits, parameters such as operating speed, power consumption and silicon area are of primary concern, and these in turn depend on the adder circuits deployed. The proposed Vedic mathematics binary multiplier circuits are explored with binary parallel adder components. It is observed that parallel adders are more efficient in terms of operation. In contrast, Carry Increment Adder circuits are intended for future deployment, as they require only a little additional logic and offer reduced delay at the cost of circuit area.