 Open Access
 Total Downloads : 1039
 Authors : Reshma Janardhan
 Paper ID : IJERTV1IS10284
 Volume & Issue : Volume 01, Issue 10 (December 2012)
 Published (First Online): 28122012
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
RNS And Adders Based Radix8 2^{n} 1 Modulo Multiplier
Reshma Janardhan
M.Tech Student, LIET, Hyderabad, AP ,India
Abstract
The modulo 2n1 multiplier is usually the non critical data path among all modulo multipliers in such highDR RNS multiplier. This timing slack can be exploited to reduce the system area and power consumption without compromising the system performance. Several commercial processors have selected the radix8 multiplier architecture to increase their speed, thereby reducing the number of partial products. Radix8 encoding reduces the digit number length in a signed digit representation. Its performance bottleneck is the generation of the term 3X, also referred to as hard multiple. The modified booth encoder will reduce the number of partial products generated by a factor of 2 using radix4 but by using radix8 the partial products reduces by a factor of n/3.
A special moduli set of forms {2n1, 2n, 2n
+1} are preferred over the generic moduli due to the ease of hardware implementation of modulo arithmetic functions as well as systemlevel inter modulo operations, such as RNStobinary conversion and sign detections. To facilitate design of highspeed fulladder based modulo arithmetic units, it is worthwhile to keep the moduli of a high DR RNS in forms of {2n1, 2n, 2n +1}.The modulo 2n1 multiplier is usually the non critical data path among all modulo multipliers in such highDR RNS multiplier. With this precept, a family of radix8 Booth encoded modulo 2n1 multipliers, with delay adaptable to the RNS multiplier delay, is proposed. The modulo 2n1 multiplier delay is made scalable by controlling the wordlength of the ripple carry adder, employed for radix8 hard multiple generation.
Keywords Residue number system(RNS), RSA, Radix 8 booth encoding, moduli set, prefix structure.

Introduction
Two Algorithms that are mostly established in public Key cryptography are RSA (Rivest and Shamir) and ECC (Elliptical curve cryptography)[1] [3]. For these algorithms Encryption and decryptions are performed by modulo multiplications. During encryption and decryption process of RSA ECC key sizes ranges from 512 1024 bits and 106 512 bits[3] [6]. Due to this it becomes difficult to implement hardware because of Long carry propagation of large inter multiplies. To make it easy for hardware implementation Residue Number system has come into existence to design low power and faster multipliers.
RNS (Residue Number System) is based on generic moduli. But we are going to prepare special moduli form of 2n1 over generic moduli for the hardware implementation. There is triple moduli set
{2n1, 2n, 2n+1} And the speed of RNS processor of 2n1 is increasingly dominated by the residue arithmetic operation rather than one time forward and one time back ward or reverse conversion .[7]

Modulo 2n1 Multiplier Architecture
Modulo multiplier are used effectively for the area and time purpose. The design of 2n1 modulo multiplier consists of three as shown in figure 1, steps one is modulo particial product generation then second slip for addition of those partial products and the third step for the addition of carry and propagate bits of the adder. The fig. 1 represents the modulo multiplier architecture. The modulo multiplication is extensively used in Residue number system (RNS) based digital signal processing and cryptography units.
Figure 1: Modulo (2n1) Multiplier Architecture.


Review
Radix8 is the advanced technique than that of Radix4. In Radix 4 the multiplier the partial product generation the multiplicand is taken and is divided into digits denoted by Di where Di is one of
{2,1,0,1,2)
Radix 4 are used for generation of soft multiples shown in table 1, where as Radix 8 is used for generation of both hard and soft multiples as shown in table 4.
Table 1:Modulo Reduced Multiples For The Radix 4 Booth Encoding
Table 2: Radix 4 Recoding.
di
Signed values
Xi+2 Xi+1 Xi
0 0 0
0X
0 0 1
1X
0 1 0
1X
0 1 1
2X
1 0 0
2X
1 0 1
1X
1 1 0
1X
1 1 1
0
In Radix 4 booth algorithm it starts by appending a zero to the right of X0 and three bits are take once as an input and they are denoted by Di.
Example: 01000111multiplicand 01001011multiplier
The number 1X represent the same number, 0X represents all digits are zero and 2x represents circular left shift by one position. The negative symbol for ix i.e 1x means 2's complement & 2x means 2's complement and circular left shift by one position.
Di
di.X 2n
1
+0
0…..0
n
+1
X
+2
CLS(X,1)
2
CLS( X ,1)
1
X
0
1…..1
n
Whereas for Radix 8 for the particle product generation the 8 bit multiplicand is considered and a zero bit is added at MSB LSB of it.
Unlike the Radix 4 in Radix 8 we are grouping 4 bits each as one group and hence we obtain three partial products. Thus reduci8ng one of the partial product when compared with Radix4.
Table 3: Modulo Reduced Multiples For Radix 8 Booth Encoding.
Tables 4: Radix 8 Recoding

Proposed Radix8 Multiplier

Generation Of Hard Multiple
In RNS multiplier the carry propagation length in the hard multiple generation should not exceed 'n' bits so the carry propagation through the half adders as shown in figure 2 can be eliminated by making end around carry bit c7 a partially product to be accumulated in the CSA (carry save accumulator) tree in figure 3. so this technique reduces the carry propagation length to n bits as shown in figure below.
Figure 2: Generation Of Hard Multiple Architecture
2 1
+3X n
Figure 3: Modified Architecture Of Hard Multiplier Using K Bit RCA's
The carry out of RCA0 i.e.
0
c
3 is not
In Radix 8 we are having hard multiple generation i.e 3x. and this 3x is obtained by 2x +x
propagated to the input of RCA 1 but is taken as one of the partial product bit to be accumulated in CSA
i.e. circular left shift of the multiplicand added with
the multiplier. Here, the Di ranges between {4, 3, –
tree. The output of the RCA1 i.e.
1
c
3 is been
2, 1, 0, 1, 2, 3, 4}
The radix8 booth encoding reduces the partial products to (n/3)+1 from that of radix 4.
After obtaining the partial products from the multiplier and the multiplicand they are given it to adder. the result of 2n1 multiplier is the reduced output i.e we obtain x.y2n1. And in this project the maximum range of the output is of only 255 because its designed for only 8 bit number.
reduced before being accumulated. Therefore the partially redundant for of +3X2n1 is given by
……….(1)
And the partially redundant form of 3X2n1 is given by
………(2)
To avoid many long string of ones in C an appropriate bias B is added to hard multiple so that C
and C are sparse.
M 1
B 2
K . j
n
0….01…..0….01…………..(3)
j 0 k k
The block diagram for B+3X2n1 is as shown in figure 4 and its generated by simple XNOR, OR operations on the bit positions 2kj.
2 1
Figure 4: Generation Of Partial Redundant Of B+3X n

Generation Of Soft Multiples
Since the hard multiple can't be predicted at design time, all the multiples must be similarly represented. So the bias is added to the soft multiples also. It is represented as B+02n1, B+X2n1,
B+2X2n1, B+4X2n1 in the partial redundant form for 8bit. the figure 5 represents the simple multiples.
Figure 5: Generation Of Partially Redundant Of Simple Multiples

Generation Of Partial Product
To generate partial product it requires booth selector and booth encoder. to generate each partial product it requires one booth encoder and ten booth selector. in Radix 8 we can generate three partial products pp0 , pp1, pp2. the booth encoder produces signed one hot encoded digit from adjacent overlapping multiplier bits. The signed one hot encoded digit is then used to select the correct multiple to generate ppi. The block diagram of booth encoder and booth selector is as shown in figure 6.
Figure 6.a: Booth Encoder
Figure 6.b: Booth Selector
In booth encoder the selected inputs i.e Di is given and we obtain any one as output from X,2X,3X,4X with respect to its sign value and the outputs are then again given as input to booth selector.where as to booth selector we give inputs as output of booth encoder and also the Di bits and we obtain the output as the partial product ppij.In partial product generation we obtain PPij and qij values. Each PPij consists of an nbit vector, ppi7…….ppi1pp10 and a vector of n/k = 2 and redundancy carry bits qi1qi0 are the carryout bitsas in figure 7 and 8.After we obtain the partial product and redundancy carry bits we add compensation constant for the bias introduced in partial redundant representation
Figure 7: Reduced Partial Product Generation
Figure 8: Modulo Reduced Partial Product And CC For
2 1
X.Y n

Addition Of Partial Products
2 1
The obtained partial product are accumulated in the carry save adder and also the partial carry redundant bit along with bias is given to the parallel prefix adder as in figure 9. Thus the output obtained is X.Y n
Figure 9: modulo reduced partial product accumulation

Modification Of Adder
Now instead of using parallel prefix adder after accumulation we are using carry look ahead adder. So let us consider the case where the underlying adder is a onelevel carry look ahead adder The logic equation for such an adder are
Gi=AiBi
Pi= A i Bi + Ai B i =Ai Bi Ci =Gi+PiCi1
Si =Pi Ci1
Ci Gi+PiGi1+PiPi1Gi2+….PiPi1….P1G0Gn1PiPi1
…P0 +Pn1Gn2PiPi1……P0+Gi+1PiPi1….. P0Pn1…. Pi+2
when the output is obtained by parallel prefix adder the number of the gate count is less than that of CLA. where as instead of parallel prefix adder if we give input to the carry look ahead adder the no of gate count is less when compared to previous. The implies that the area is reduced in CLA adder than that of parallel prefix adder.


Result
Figure 10 : Multiplier Simulation Results
Figure 11:Radix8 Synthesis Report (Gate Count) using parallel prefix adder
Figure 12 :Radix8 Synthesis Report (Gate Count) using carry look ahead adder

Conclusion
A new approach for modulo multiplication of 2n1 has been approached and the partial products, hard and soft multiples are accomplished by the AND XNOR and OR gates. Thus a family of low area and low power modulo 2n1 multiplies was proposed The decay of the proposed multiplies is controlled by word length of the small parallel RAS which is used to compute hard multiples From synthesis result its Clearly shown that the area is used by CLA is less when compare to parallel prefix And also its useful for dynamic range for RNS multiplication.

Reference

R. Rivest, A. Shamir, and L.Adleman, A method for obtaining digital signatures and public key cryptosystems Commun. ACM, vol. 21, no.2, pp. 120126, Feb. 1978.

V. Miller, Use of elliptic curves in cryptography in Proc. Advancesin CryptologyCRYPTO85, Lecture Notes in Computer Science, 1986,vol. 218, pp. 417426.

N. Koblitz, Elliptic curve cryptosystems, Mathemat. of Comput., vol.48, no. 177, pp. 203209, Jan. 1987.

National Institute of Standards and Technology. Available: http:// csrc.nist.gov/publications/PubsSPs
.html

A. K. Lenstra and E. R. Verheul, Selecting cryptographic key sizes,J. Cryptol., vol. 14, no. 4, pp. 255293, Aug. 2001.

C. McIvor, M. McLoone, and J. V. McCanny, Modified Montgomery modular multiplication and RSA
exponentiation techniques, IEE Proc. Comput. and Dig. Techniq., vol. 151, no. 6, pp. 402408, Nov.2004.

D. J. Soudris, V. Paliouras, T. Stouraitis, and C. E. Goutis, A VLSI design methodology for RNS full adder based inner product architectures, IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol.44, no. 4, pp. 315318, Apr. 1997.