Analysis of Low Power Parallel and Serial Array Multiplier

DOI : 10.17577/IJERTCONV5IS17037



PG Student, Electronics and Communication Engineering,

M. Kumarasamy College of Engineering, Karur, Tamil Nadu

Abstract:- Multipliers play an important role in many digital signal processing applications. Many researchers have worked on multiplier designs that achieve high speed and low power consumption in advanced technologies. Two important design factors for a multiplier are power consumption and area, both of which grow with circuit complexity. In a low-power parallel multiplier design, some columns of the multiplier array can be turned off whenever their outputs are known in advance, which reduces power dissipation. Such array multipliers can be used in digital image and signal processing blocks such as finite impulse response (FIR) filters. In this paper two techniques are analyzed: the separated multiplication technique and the column bypassing technique.

KEYWORDS: FIR Filter, Low power, Multiplying circuits, Power dissipation


Low-power design has attracted considerable attention in recent years. Advanced technology makes it possible to place more and more devices in the same silicon area while pushing the clock rate ever higher. Low-power design is needed to reduce packaging and cooling costs and to prolong the lifespan of integrated circuits (ICs).

In VLSI circuits, dynamic power dissipation is normally caused by signal transitions in the circuit. To minimize the average power dissipation, the switching activity of a given logic circuit is reduced.
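As a rough illustration of switching activity, the sketch below counts bit transitions on a bus between successive values and plugs an activity factor into the commonly used first-order CMOS model P = alpha * C * Vdd^2 * f. The function names and the sample numbers are assumptions made for illustration; they are not taken from the paper.

```python
def toggles(prev: int, curr: int) -> int:
    """Number of bit positions that change between two bus values."""
    return bin(prev ^ curr).count("1")


def switching_activity(values, width):
    """Average fraction of the `width` bus bits that toggle per cycle."""
    total = sum(toggles(a, b) for a, b in zip(values, values[1:]))
    return total / (width * (len(values) - 1))


def dynamic_power(alpha, c_load, vdd, freq):
    """First-order dynamic power model: alpha * C * Vdd^2 * f (watts)."""
    return alpha * c_load * vdd ** 2 * freq


if __name__ == "__main__":
    bus = [0b0000, 0b1111, 0b1110, 0b1110]   # hypothetical 4-bit bus trace
    alpha = switching_activity(bus, width=4)
    print(alpha, dynamic_power(alpha, 1e-12, 3.3, 100e6))
```

Lowering the activity factor, for example by keeping adder inputs stable, lowers the estimated power in direct proportion under this model.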


The modified Booth algorithm is one of the most popular algorithms for reducing the number of partial products to be added.
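The reduction in partial products can be seen in a small sketch of radix-4 modified Booth recoding, which turns an n-bit two's-complement multiplier into n/2 signed digits in the range -2 to 2, so only n/2 partial products remain. The function names are illustrative assumptions, and the sketch models the arithmetic only, not any particular circuit.

```python
def booth_radix4_digits(y: int, n: int):
    """Recode an n-bit two's-complement multiplier (n even) into n/2 digits."""
    assert n % 2 == 0
    y &= (1 << n) - 1            # keep the n-bit two's-complement pattern
    digits = []
    prev = 0                     # implicit bit y[-1] = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)   # digit in {-2, -1, 0, 1, 2}
        prev = b1
    return digits


def booth_multiply(x: int, y: int, n: int) -> int:
    """Sum the n/2 partial products d_k * x * 4^k."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, n)))


if __name__ == "__main__":
    print(booth_radix4_digits(7, 4))   # two digits instead of four bits
    print(booth_multiply(5, -3, 4))
```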

The Wallace tree algorithm can be used to reduce the number of sequential adding stages, which improves speed. The modified Booth algorithm and the Wallace tree technique can be combined in one multiplier to obtain the advantages of both.

However, as parallelism increases, the number of shifts between the partial products and the intermediate sums to be added also increases. This may result in reduced speed, an increase in silicon area due to the irregularity of the structure, and increased power consumption due to the additional interconnect that results from complex routing.

Digital signal processing (DSP) is the technology at the heart of the upcoming generation of personal mobile communication systems. Most DSP systems incorporate a multiplication unit to implement algorithms such as convolution and filtering. In many DSP algorithms, the performance of the algorithm is determined by the multiplier.

Reducing switching activity, for example by eliminating glitches in the computation, is one of the significant logic design techniques for low-power systems. Multiplication is an essential arithmetic operation in DSP applications such as filtering and the fast Fourier transform (FFT). To achieve high execution speed, parallel array multipliers are widely used. Multipliers consume a large share of the power in DSP computations, so power-efficient multipliers are essential for the design of low-power DSP systems.


The general architecture of the shift-and-add multiplier is described here for a 32-bit multiplication. Depending on the value of the multiplier LSB, the multiplicand is added and accumulated. In every clock cycle the multiplier is shifted one bit to the right and its value is tested. If it is a 0, only a shift operation is performed. If it is a 1, the multiplicand is added to the accumulator, which is then shifted by one bit to the right. After all the multiplier bits have been tested, the product is in the accumulator. The accumulator is 2N (M+N) bits in size, and initially its N LSBs contain the multiplier. The delay is N cycles at most.
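The procedure can be sketched in a few lines of Python. This is a behavioral model under stated assumptions, not the circuit itself: operands are unsigned N-bit values, the accumulator starts with the multiplier in its N LSBs, and Python's unbounded integers stand in for the carry bit of the adder. The function name is hypothetical.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Unsigned N-bit shift-and-add multiplication, one multiplier bit per cycle."""
    mask = (1 << n) - 1
    assert 0 <= multiplicand <= mask and 0 <= multiplier <= mask
    acc = multiplier                      # N LSBs initially hold the multiplier
    for _ in range(n):                    # N cycles at most
        if acc & 1:                       # test the current multiplier LSB
            acc += multiplicand << n      # add multiplicand into the upper half
        acc >>= 1                         # shift accumulator right by one bit
    return acc                            # 2N-bit product


if __name__ == "__main__":
    print(shift_add_multiply(13, 11, 4))
```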


The serial multiplier is used where area and power are of utmost importance and delay can be tolerated. The circuit uses a single adder to add the m * n partial products, here for m = n = 4. The multiplicand and multiplier inputs have to be arranged in a special manner, synchronized with the circuit behavior. Depending on the lengths of the multiplicand and the multiplier, the inputs can be presented at different rates. Two clocks are used, one to clock the data and one for the reset. A first-order approximation of the delay is O(m, n).


An array multiplier may be a better option in DSP applications owing to its compact layout and high throughput. It is based on the standard add-and-shift operation, and its structure is organized as several stages of AND gates and full adder cells. It may consist of either ripple carry adders (RCAs) or carry save adders (CSAs). For an N × N multiplication, an RCA-based multiplier needs 3N adders and takes 2N + 1 adder delays in the worst case. A CSA-based multiplier also needs 3N adders but takes only N + 2 adder delays in the worst case. In a CSA-based multiplier, the carry has to propagate from the (j-1)th row to the jth row and then to the (j+1)th row.
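A word-level sketch of the carry-save organization is given below, assuming unsigned operands. Each loop iteration plays the role of one row of AND gates feeding one row of full adders, with carries passed down to the next row instead of rippling sideways; a single carry-propagate addition finishes the product. The helper names are illustrative, and the delay function simply restates the worst-case figures quoted in the text.

```python
def carry_save_row(x: int, y: int, z: int):
    """One row of full adders applied bitwise: returns (sum word, carry word)."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1   # carries move one column left
    return s, c


def csa_array_multiply(a: int, b: int, n: int) -> int:
    """Unsigned N x N multiplication with carry-save rows and a final adder."""
    s = c = 0
    for i in range(n):
        pp = (a << i) if (b >> i) & 1 else 0   # row i of AND gates
        s, c = carry_save_row(s, c, pp)        # carry goes to the next row
    return s + c                               # final carry-propagate stage


def worst_case_adder_delays(n: int):
    """Worst-case adder delays as stated in the text (not derived here)."""
    return {"rca": 2 * n + 1, "csa": n + 2}


if __name__ == "__main__":
    print(csa_array_multiply(13, 11, 4), worst_case_adder_delays(8))
```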


The Braun multiplier is also known as the CSA-based parallel array multiplier. Its limitation is that its architecture leads to high power consumption and hardware cost. Architectural modifications for power reduction include row bypassing, column bypassing, combined row and column bypassing, and circuit-level modification. An improved column bypassing scheme, aimed at a low-power, high-speed multiplier with lower hardware cost, is proposed.


The aim is to reduce the power consumed in multipliers. Synthesis and optimization tools commonly have no knowledge of the input data, so its characteristics cannot be exploited in hardware optimization. Decreasing the power consumption of multiplication is the main advantage addressed in this paper.


In the column bypassing multiplier, the FA cell is modified as shown in Fig. 1. In this architecture the modified cell requires one multiplexer and two three-state gates. The FA is disabled if the corresponding input bit a = 0. A three-state gate is not used on the carry input (C(i-1, j)) because of the structure of the Braun multiplier.

Fig.1 Modified FA cell for column bypass multiplier

Each FA in the first row has only two inputs. When a = 0, the inputs of FA(0, j) are disabled, so its carry output does not change. All three inputs of FA(1, j) are then fixed, so its outputs are prevented from switching. Fig. 2 shows the proposed 4×4 column-bypassing array multiplier. The carry outputs must be set to 0 at the bottom of the CSA array, because the outputs of FAs whose inputs are disabled are not related to the correct result. This can be done by adding AND gates at the outputs of the last-row CSA adders.

Fig.2 Column bypassing multiplier
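The bypassing idea can be checked with a small bit-level model of an unsigned N × N Braun array. The sketch below is a behavioral approximation under stated assumptions: `a` is taken to be the operand whose bits select the columns, a bypassed cell forwards its incoming sum through the multiplexer and forces its carry to 0, and the count of active cells is only a proxy for switching power. The function name and indexing are illustrative and may differ from the cell numbering in Fig. 2.

```python
def braun_column_bypass(a: int, b: int, n: int):
    """Return (product, number of full adders left active in the CSA array)."""
    abit = [(a >> j) & 1 for j in range(n)]
    bbit = [(b >> i) & 1 for i in range(n)]
    s = [abit[j] & bbit[0] for j in range(n)]   # row 0: AND gates only
    c = [0] * (n - 1)
    product = s[0]
    active = 0
    for i in range(1, n):                       # CSA rows 1 .. N-1
        new_s = [0] * n
        new_c = [0] * (n - 1)
        for j in range(n - 1):
            if abit[j] == 0:
                new_s[j] = s[j + 1]             # bypass: forward incoming sum
                new_c[j] = 0                    # carry forced to 0
            else:
                x, y, z = abit[j] & bbit[i], s[j + 1], c[j]
                new_s[j] = x ^ y ^ z
                new_c[j] = (x & y) | (x & z) | (y & z)
                active += 1
        new_s[n - 1] = abit[n - 1] & bbit[i]    # leftmost partial product
        s, c = new_s, new_c
        product |= s[0] << i                    # product bit i leaves the array
    # Final ripple-carry row adds the remaining sum and carry bits.
    upper = sum(s[j + 1] << j for j in range(n - 1))
    carries = sum(c[j] << j for j in range(n - 1))
    product |= (upper + carries) << n
    return product, active


if __name__ == "__main__":
    print(braun_column_bypass(0b0101, 0b1011, 4))
```

In this model the number of active cells depends only on how many of the lower N-1 bits of `a` are 1, so for random inputs with equal probability of 0 and 1 roughly half of the array cells are bypassed on average.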


The separated multiplication technique (SMT) is applied to a 1-D FIR filter module. The design contains multiplexers that select between the result of the multiplier and previously computed results held in the cache. If the upper bits of the input match a tag in the cache, the stored value is selected by the multiplexers.

Fig.3 SMT Based multiplier architecture

Several registers are placed after the multipliers to keep the multiplication product from passing immediately to the next adder and to allow time for the cached data to be loaded onto the bus. An adder is necessary to add the results of the high-bit and low-bit multiplications. The output is equivalent to the final result of a non-separated multiplication processor, and this was verified by Verilog HDL simulation.

A conventional cache sits between the main memory and the processor, whereas in this architecture the only link of the cache is the internal interface to the multipliers. Its design is simple because the cache is unidirectional. The cache architecture is otherwise similar to the conventional model and consists of Data-RAM, Tag-RAM, a controller and a valid bit. Tag-RAM is used to store the tags and Data-RAM to store the multiplication results.

A multiplication result stored in Data-RAM is loaded onto the data bus, and a newly computed product is saved into Data-RAM through the data bus. The proposed replacement strategy decides which value is kept. The cache is small in size and is designed with full associativity, which is quite efficient. Because the cache lies in the data path, it does not use the SRAM cell that is the usual primary cache cell; D-latches are used as the cache memory cells instead.
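The behavior described above can be approximated in software as follows. This is a sketch under several assumptions that the text does not fix: the input is 16 bits wide and split into two 8-bit halves, the coefficient is constant per multiplier (as in an FIR tap), the cache holds four entries, and replacement is first-in first-out. The class and attribute names are hypothetical.

```python
from collections import OrderedDict


class SeparatedMultiplier:
    """Multiply by a fixed coefficient, caching products of the upper input bits."""

    def __init__(self, coeff: int, low_bits: int = 8, cache_entries: int = 4):
        self.coeff = coeff
        self.low_bits = low_bits
        self.cache_entries = cache_entries
        self.cache = OrderedDict()   # tag (upper input bits) -> stored product
        self.hits = 0
        self.misses = 0

    def multiply(self, x: int) -> int:
        high = x >> self.low_bits                    # upper bits act as the tag
        low = x & ((1 << self.low_bits) - 1)
        if high in self.cache:                       # fully associative lookup
            self.hits += 1
            high_product = self.cache[high]          # multiplexer picks cached value
        else:
            self.misses += 1
            high_product = self.coeff * high         # high-bit multiplication
            if len(self.cache) >= self.cache_entries:
                self.cache.popitem(last=False)       # FIFO eviction (assumed policy)
            self.cache[high] = high_product
        low_product = self.coeff * low               # low-bit multiplication
        return (high_product << self.low_bits) + low_product   # final adder


if __name__ == "__main__":
    m = SeparatedMultiplier(coeff=7)
    print(m.multiply(0x1234), m.multiply(0x12FF), m.hits, m.misses)
```

Slowly varying input samples share their upper bits, so consecutive multiplications tend to hit the cache and skip the high-bit multiplication.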



In order to evaluate the performance of the low-power multiplier, the design was implemented in TSMC 0.35 µm technology. The performance of this design is compared with that of the conventional Braun multiplier and the column bypassing multiplier.

Table I lists the power consumption of the three designs. The input patterns are assumed to be random; in other words, the probabilities of 0 and 1 are both 0.5.

The power is estimated by running HSPICE. The sizes of the three designs are listed in Table II. In our design, the area overhead is about 20%, while the area overheads of column bypassing multipliers are over 40%.


The SMT is applied to the 1-D FIR filter architecture described in the previous section. The results for the FIR filter are listed in Table II. Components of the same type are grouped together in the table. The effect of SMT on the 1-D FIR filter, in terms of eliminated and overlapped operations, is around 10 percent. Because a more complex architecture than the general direct-form FIR filter was chosen, the result shows relatively high power dissipation due to the additional components. The cache accounts for 9% of the total energy dissipation.

Table.1 1-D 4-tap FIR filter module


Digital multipliers are among the most critical arithmetic units. Power consumption has become a crucial design factor in recent decades, and much research focuses on low-power architectures. The multiplier architectures presented in this paper can be used as general-purpose multipliers. Extensive power consumption simulations show that, by taking advantage of the characteristics of the input data and the processing algorithm, the proposed architectures reduce power consumption. The surveyed papers present low-power designs along with other benefits such as lifetime enhancement and low area complexity.


