Approximate Multiplier based on Modified 7,3 and 15,4 counters using Split Logic and Sorting Network

Download Full-Text PDF Cite this Publication

Text Only Version

Approximate Multiplier based on Modified 7,3 and 15,4 counters using Split Logic and Sorting Network

Aswathi K C

Electronics and Communication Engineering Department IES College of Engineering


Suvitha P S

Assistant Professor

Electronics and Communication Engineering Department IES College of Engineering


AbstractThe summation of multiple operands is widely employed in various digital signal processing (DSP) units and constitutes an element of the critical path. Performance is often the key concern while designing a Multiplier. In this paper a unique method of fast saturated binary counters and compressors supported the sorting network and split logic is proposed . The inputs of the counter are asymmetrically divided into two groups and fed into sorting networks to come up with reordered sequences, which might be solely represented by one- hot code sequences. Using the above method, Modified (7,3) counter and (15,4) counters are designed using sorting network with less area. A (15,4) and a (7,3) exact counters with split logic which also utilizes the 4:2 exact compressor within the LSB part is proposed. By implementing these counters a high speed approximate 16×16 Multiplier is proposed which is utilizing the precise 4:2 compressor within the LSB a part of this multiplier. It minimizes the area and power which is suitable for error tolerant applications. Further, for final addition of multiplier, parallel prefix adder supported Kogge-stone adder is proposed for minimizing the critical path delay. The proposed work ends up in better areadelay product and powerdelay product in comparison to existing approximate multiplier.

  • Keywords Counters, Approximate/Exact compressors, Sorting Network, Multipliers


      Today's IT equipment is requiring more energy efficient integrated circuits (ICs) because the amount of information to process is increasing and also the size is getting smaller. Energy efficiency may be a product of power consumption and time consumed. to enhance energy efficiency in ICs, both power consumption and computing time must be reduced. However, many techniques for reducing power consumption have a trade-off relationship with performance. one amongst the techniques to resolve this problem is approximate computing [1].

      Approximate computing could be a technique that reduce power consumption and improve performance by reducing accuracy. Especially in applications that use human senses, it's suitable to use approximate computation because people don't recognize small errors.

      Approximate computing could be a important and useful method to style for energy efficiency. This has emerged as a possible solution for the look of energy-efficient digital systems. Applications like multimedia, recognition and data processing are inherently error-tolerant and don't require an

      ideal accuracy in computation. For these applications, approximate circuits may play a vital role as a promising alternative for reducing area, power and delay in digital systems which will tolerate some loss of accuracy, thereby achieving better performance in energy efficiency .The most famous method of multiple operands summation is that the Wallace tree structure .

      A Basic Wallace multiplier uses full adders and half adders to cut back the partial product tree to 2 rows, then a final adder is employed to feature these two rows of partial products. Its improved method reduced Wallace tree [2]. These methods use full adders as (3,2) counters to accelerate the summation, leading to logarithmic time consumption. This sort of structure is additionally called carrysave structure. Since then, many papers have discussed the way to construct a more time-efficient structure to accelerate the summation, like [3][4]. In [5] high compression ratio achieving counters and compressors are constructed.

      Multipliers have three phases – generation of partial products, reduction of partial products and last addition. Reductions of partial products take much time and power within the multiplier. Many techniques were proposed to cut back the critical path within the multiplier [6]-[7]. Among them, the employment of compressors in partial product reduction stage is that the hottest.

      Compressors are basic circuits which are fabricated from full adders or half adders to count the quantity of ones within the input. Several compressors are required within the partial product reduction stage. Various compressors like 3-2, 4-2, 5-2 and 5-3 were proposed by researchers within the last 20 years [6]-[9].These are useful only if the scale of multiplier is tiny.

      A 16×16, 32×32 bit multipliers require large size of compressors. High order compressors provide better leads to terms of power and speed. Fast binary counters and compressors are generated by sorting networks are the present technology which might provides a higher compression ratio than the traditional small value compressors. But it consumes more area than low order compressors. of these techniques perform the precise computation and modules produce the right result. Accuracy of the module/device is often 100% in exact computing. But exact computing has one major drawback. it's unattainable to optimize all the parameters of the circuit in exact computing. However, exact computing isn't essential for each application. There are some applications like

      image processing and multimedia can tolerate errors and supply meaningful results. Inexact (approximate) computing techniques became popular thanks to its. Low complexity and fewer power consumption.

      Fig. 1 Existing 7,3 counter based on Sorting Network

      Fig. 2. Existing 15,4 counter based on Sorting Network

      Inexact computing produces reasonable result, even its low accuracy. In approximate computing, the worth of error rate (ER), error distance (ED) and normalized error distance (NED)play a very important role to calculate the ultimate output . Several approximation techniques were proposed for adders and multipliers ,From central point to the foremost significant bit (MSB) is named accurate and to the smallest amount significant bit (LSB) is termed inaccurate a part of adders. Inaccurate computing in MSB side causes large error.


      The sorting network is an efficient parallel hardware network utilized for data sorting. The famous 0,1 principle shows that, if a sorting network can sort a group of data whose elements are all 1-bit numbers, it can sort all types of numbers. In this article, we only adopt it for 1-bit data sorting .

      1. Sorting Network Working Principle

        The typical three- and four-way sorting networks are shown in Fig. 2. Each vertical line represents a sorter

        Fig. 3 Three and four way sorting network

        Fig.4 Two input binary sorter

        that has two data inputs and two data outputs, and all data are 1 bit numbers. The sorter always puts the larger input up, the smaller one down. In Fig. 3, we give an input example: sequence [0, 1, 1, 1] represents the input of four-

        way sorting network (4 SN), and sequence [0, 1, 1] represents the input of three-way sorting network (3 SN). For both 4 SN and 3 SN, the input sequences are reordered in the form of the larger number at the top and the smaller number at the bottom after three layers of sorter.

      2. Sorter for 1 -Bit Data

      As mentioned above, the sorter reorders two inputs according to numerical magnitudes. As for two 1-bit data. the logical circuit illustrated in Fig. 4 can sort themeasily. This means that a sorter consumes one layer of two-input basic logic gates, and the three- and four-way sorting networks both consume three layers of two-input basic logic gates..


      In this proposed counter design, the 4SN of figure replaced by 4:2 exact compressor and a full adder replaces the 3SN.x0,x1,x2,x3 inputs are giving to a 4:2 exact compressor which utilizing the sorting network techniques for its working,x4,x5 and x6 are giving to a full adder. The outputs of the counters 4:2compressor are s2,c2 and output of full adder is s1, c1.

      Fig.5 Block diagram of proposed 7,3 counter


      Fig.6 Overall structure of proposed 7,3 counter


      This modified 15,4 counter replaces the 8 SN of figure with two 4:2 exact compressors and also replaces a 7SN with one 4:2 exact compressor and with a full adder. And the outputs of these are c4,s4,c3,s3,c2,s2 and c1,s1 as shown in the figure. For further reduction again the outputs are given to another two exact compressors as the carry outputs are together given to one of the compressor and the sum outputs are together given to the second compressor and finally we will get the output c6,s6 and c5,s5

      Fig.5 Block diagram of proposed 15,4 counter

      Fig.6 Overall structure of proposed 15,4counter.

      Fig.7 Architecture of proposed Multiplier

    5. PROPOSED MULTIPLIER Implementation of multiplier comprises three steps:

      generation of partial products, partial products reduction tree,

      and finally, a vector merge addition to produce final product from the sum and carry rows generated from the reduction tree. Second step consumes more power In this brief, approximation is applied in reduction tree stage .A 16×16 multiplier is used for illustration to describe the proposed method . Approximation methodologies were applied in generating the partial product phase .A modified (7,3) exact counter and a modified (15,4) exact counter are used for partial product reductions. Approximation is giving to the LSB parts of the Multiplier.

      The approximation techniques were proposed within the partial product reduction stage. Accurate multiplier was utilized in MSB side of the multiplier. No multipliers was utilized in LSB side where approximation rule was applied. In this paper, Wallace tree multiplier structure is modified by using parallel prefix adders to add partial products in final phase addition process to obtain final product. The reason behind to use PPA in place of full adders is to improve the speed of operation. In PPA, the carry input for the next bits is generated at a time with the help of parallel prefix carry tree which consists of black cells and grey cells. There are many types of PPAs are present whose basic design idea is originated from carry look ahead adder. Kogge Stone adder is very attractive for high-speed applications which come at the cost of high area and more power.


      1. Simulated result of proposed 7,3 counter

        To judge the performance of the designed (7,3) counter, we construct it with Verilog HDL. The proposed design has over 10% better performance than other designs, while it's significantly less area and power consumption.

      2. Simulated result of proposed 7,3 counter

      To evaluate (15,4) counters performance, we use the(15,4) counter in[5] and (7,3) counters in [5]. The new (15,4) counter performs over 15% better in time delay, but the world advantage isn't as large because the (7,3) counter. However, it should be noted that, if the delay constraints are relaxed slightly, the realm of the proposed design will drop rapidly. When the proposed counter is synthesized with the same delay constraints as in [5] the proposed design will have less area than [5]. Thus, this new design is more flexible that can be utilized in high- performance or compact scenarios, and in both scenarios, it has a better performance. The proposed design also has more than 40%power consumption.

      Fig.8 Simulated result of proposed( 7,3)counter

      Fig.9 Simulated result of proposed (15,4)counter

      Fig.10 Synthesis result of proposed (7,3)counter

      Fig.11 Synthesis result of proposed (15,4)counter


In this article, a new counter design method based on a sorting network is proposed. We proposed a modified (7,3),and a (15,4) counters. The (7,3) counter has 8.l% 27.0% less delay than other designs and also consumes less area and power. The (15,4) counter is more flexible than existing designs because it achieves 14.9%35.2%less delay when the speed is critical and performs 14.7%49.% and 41%72.7% better in ADP and PDP when the world or power is critical. once they are embedded in a very 16-bit multiplier, the multiplier achieves 31.8% and 32.2%c better in ADP and PDP in maximum than those embedded in other counter designs. Approximate multipliers provide better performance than accurate multipliers with compromising of error rate.


[1] J. Han and M. Orshansky, Approximate computing: an emerging paradigm for energy-efficient design, in Proc. ETS, pp. 1-6, May 2013

[2] R. S. Waters and E. E. Swartzlander, A reduced complexity wallace multiplier reduction, IEEE Trans. Comput., vol. 59, no. S, pp. I 1341137, Aug. 2010, doi: 10.I 109/TC.2010.1 03

[3] S. Asif and Y. Kong, Analysis of different architectures of counter based wallace multipliers, in Proc. 10th Int. Conf. Com- pit. Eng. Syet. (ICCES), Cairo, Egypt, Dec. 201 J, pp. I 39144, dot: I 0.1 I 09/ICCES.2015.7393034.

[4] Q. Jiang and S. Li, A design of manually optimized (15,4) parallel counter, in Proc. Int. Conf. Electron Devices Solid- State Circuits (EDSSC), Hsinchu, Taiwan, Oct. 2017, pp. I 2, doi: I 0.1 I 09/EDSSC.2017 .5126527.

[5] Wenbo Guo and Shuguo Li,Fast Binary Counters and Compressors Generated by Sorting Networkin IEEE Transactions on Very Large Scale Integration (VLSI) Systems ( Volume: 29, Issue: 6, June 2021 doi: 10.1109/TVLSI.2021.3067010

[6] Pishvaie, G. Jaberipur, and A. Jahanian, Improved CMOS (4; 2 compressor designs for parallel multipliers, Computers and Electrical Engineering, vol. 38, no. 6, pp. 17031716, Nov. 2012.

[7] D. Baran, M. Aktan, and V.G. Oklobdzija V.G, Energy Efficient implementatio of Parallel CMOS Multipliers with Improved Compressors,in proc. ACM/IEEE International Symposium on Low-

Power Electronics and Design (ISLPED), Aug. 2010, pp. 147-152

[8] S. Veeramachaneni, K. Krishna M, L. Avinash, S. R. Puppala, and M.B.Srinivas, Novel Architectures for High-Speed and Low-Power 3- 2, 4-2 and 5-2 Compressors, in proc. of International Conference on VLSIDesign (VLSID), Jan. 2007, pp. 324-329.

[9] O. Kwan, K. Nawka, and E. Swartzlander Jr, A 16 bit by 16 bit MAC Design Using Fast 5:3 Compressor Cells, Journal of VLSI Signal Processing, vol. 31, no. 2, pp. 77-89, July 2002.

Leave a Reply

Your email address will not be published.