 Open Access
 Authors : Dedeepya Nutakki, Nagaraju Ravada, N.V.G.Prasad
 Paper ID : IJERTV1IS6471
 Volume & Issue : Volume 01, Issue 06 (August 2012)
 Published (First Online): 30-08-2012
 ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Implementation of Random logic in the Quaternary FPGA
Dedeepya Nutakki#1, M.Tech. Student
Nagaraju Ravada*2, Assistant Professor
N.V.G. Prasad#3, Associate Professor
#Dept of Electronics and Communication Engineering,
Sasi Institute Of Technology & Engineering, JNTU Kakinada, India.
Abstract
Field Programmable Gate Arrays (FPGAs) are often preferred over ASICs because of their low non-recurring engineering costs. However, the interconnections in an FPGA play an important role in speed, area and power consumption, since they involve a large number of wires and switches. Multi-valued logic is a technique that reduces the number of signals in a device, and thus the impact of the interconnections. The proposed work is a new FPGA structure (quaternary instead of binary) that uses a voltage-mode device with reduced wire length, switch count and fanout, at the cost of small delay penalties. We propose to reduce these delays by shortening the critical path using random logic. We choose direct-form FIR filters as a demonstrator.
Keywords: Field Programmable Gate Array (FPGA), FIR, quaternary, multi-valued logic, random logic.

The high integration of modern systems increases the number and length of interconnections, and hence the overall complexity of connecting these systems. Moreover, with the advent of deep-submicron technologies, interconnections are becoming the dominant contributor to circuit delay. In addition, the interconnection resistance-capacitance product increases with each technology node, leading to an increase in network delay.
Interconnections play an even more crucial role in Field Programmable Gate Arrays (FPGAs). Recent works suggest that in modern million-gate FPGAs, as much as 90% of the chip area is dedicated to interconnections. If one could reduce the FPGA area without losing logic capability, one could improve yield and reduce prices, or even increase the amount of memory available inside the FPGA. Because interconnections take up such a large share of the area, reducing them is mandatory for reducing the area of the FPGA.
Multiple-valued logic (MVL) has received increased attention in recent decades because it can represent information with more than two discrete levels. Representing data in an MVL system is more compact than a binary representation, because each wire carries more information and the number of interconnections can therefore be significantly reduced.
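To make the wire-count argument concrete, the following minimal sketch packs an unsigned binary word into quaternary digits, so that each digit (one wire in a quaternary system) carries two bits. The function names are illustrative and not taken from the paper.

```python
def to_quaternary(value, bits):
    """Pack an unsigned `bits`-wide value into base-4 digits, least significant first."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given bit width")
    digits = (bits + 1) // 2  # each quaternary digit carries two bits
    return [(value >> (2 * i)) & 0b11 for i in range(digits)]


def from_quaternary(digits):
    """Rebuild the integer from base-4 digits, least significant first."""
    return sum(d << (2 * i) for i, d in enumerate(digits))


word = to_quaternary(117, 8)  # an 8-bit bus becomes 4 quaternary signals
```

Under this encoding an 8-bit bus needs four quaternary signals instead of eight binary ones.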
Recently, a voltage-mode MVL technique was proposed that deals specifically with the power dissipation problem using a standard CMOS process, while still maintaining the logic compaction allowed by MVL. The proposed circuits are intended to reduce the number of interconnections present in existing binary-based systems without incurring power consumption penalties. The benefits of this new MVL implementation technique were considered for application in the reconfigurable domain.
A new lookup table (LUT) structure was proposed in which the information is represented by quaternary values. In this contribution we show the first steps towards low-power, high-density FPGAs by proposing a quaternary CLB. Quaternary connections reduce the number of wires and switches, thus shrinking the most area- and power-consuming part of current FPGAs. In addition, random logic can be used to increase the operating frequency. A complete arithmetic-oriented CLB is proposed, in which any logic operation can be implemented through a quaternary LUT. A fast carry-lookahead propagation unit and a register are also present in the proposed CLB.
As a case study to validate the proposed CLB, we use a digital signal processing application, namely Finite Impulse Response (FIR) filters. Only additions, subtractions and shift operations are used in the synthesis of the filters.
This paper is organized as follows. Section 2 discusses the differences between binary and quaternary implementations of lookup tables. Section 3 presents the new quaternary FPGA, gives details about the new arithmetic-oriented logic block and presents comparisons with the binary version. Section 4 discusses FIR filter implementations using Multiple Constant Multiplications (MCM) as the case study adopted in this work and shows how filters are deployed in the proposed FPGA structure. Experimental results are presented in Section 5. Finally, Section 6 concludes the paper and outlines future work.

In general, Lookup Tables (LUTs) are memories that implement a given logic function. Values are first stored in the lookup table, and once inputs are applied, the value stored at the addressed position is driven to the output. A 4-input binary lookup table with one output stores 16 Boolean values. Only 1-output LUTs (n = 1) are discussed in this paper.
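The memory view of a LUT can be sketched in a few lines. This is a behavioural illustration only; the helper name `make_lut` and the input ordering used to form the address are assumptions, not part of the paper.

```python
def make_lut(config_bits):
    """Return a 4-input, 1-output binary LUT backed by 16 stored values."""
    if len(config_bits) != 16:
        raise ValueError("a 4-input binary LUT stores exactly 16 values")
    table = tuple(int(bool(b)) for b in config_bits)

    def lut(a, b, c, d):
        # The four inputs together form the address of the stored value.
        address = (a << 3) | (b << 2) | (c << 1) | d
        return table[address]

    return lut


# Configure the LUT as a 4-input XOR (parity) function.
xor4 = make_lut([bin(i).count("1") % 2 for i in range(16)])
```

Changing the 16 stored values changes the implemented function without touching the lookup logic, which is what makes the structure programmable.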

Preliminaries
A binary function implemented by a Binary Lookup Table (BLUT) maps k Boolean inputs to a Boolean output. A 1-output lookup table with k inputs of radix b stores $C = b^k$ configuration values and can therefore implement $b^C$ different functions. The 4-input BLUT in Figure 1(a) (b = 2, k = 4, C = 16) can implement one of $2^{16} = 65\,536$ different functions. For b = 4, a QLUT with only two inputs (k = 2, C = 16) can represent $4^{16} \approx 4.3 \times 10^{9}$ functions, far more than the BLUT. Figure 1(b) illustrates a 2-input quaternary function implemented in a QLUT. Note that the function g(Y) performs exactly the same function as the two binary BLUTs, f0(Y) and f1(Y), depicted in Figure 1(c), where f0 represents the least significant Boolean values and f1 the most significant ones.
Figure 1 Binary (BLUT) and Quaternary (QLUT) Lookup Tables and the Quaternary Function
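The function counts quoted above, and the equivalence between one quaternary table g and two binary tables f1 and f0, can be checked with a short sketch. The helper names and the example function (the sum of two quaternary digits modulo 4) are illustrative assumptions.

```python
def config_values(b, k):
    """Number of stored values in a 1-output LUT with k inputs of radix b."""
    return b ** k


def function_count(b, k):
    """Number of distinct functions such a LUT can implement."""
    return b ** config_values(b, k)


def split_quaternary_table(g_table):
    """Split a quaternary table into two binary tables (f1: MSB, f0: LSB)."""
    f0 = [v & 1 for v in g_table]
    f1 = [(v >> 1) & 1 for v in g_table]
    return f1, f0


# Example quaternary function of two quaternary inputs: (y1 + y0) mod 4.
g = [(y1 + y0) % 4 for y1 in range(4) for y0 in range(4)]
f1, f0 = split_quaternary_table(g)
recombined = [2 * hi + lo for hi, lo in zip(f1, f0)]
```

Both a 4-input BLUT and a 2-input QLUT store 16 values, but each quaternary value carries two bits, which is why one QLUT can stand in for a pair of BLUTs.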

LUTs Implementation
Binary and quaternary lookup tables were implemented by a set of multiplexers, as illustrated in Figure 2. The BLUT is composed of four stages, one per input. The multiplexers propagate the configuration values to the BLUT output. They are built from pass gates that receive their selection signals from the four BLUT inputs and the associated inverters.
Figure 2(a) 4-input BLUT
A quaternary lookup table follows the same structure as the BLUT. However, Down Literal Circuits (DLCs) determine which configuration value must be propagated to the output. Figure 2(b) illustrates the implementation of a 2-input QLUT (b = 4; |Y| = k = 2; C = 16). Because of the quaternary representation, each multiplexer has four configuration inputs, so only two multiplexer stages are required.
Figure 2(b) 2-input QLUT
The DLCs (gray triangles 1, 2 and 3 in Figure 2(b)) have structures similar to inverters (one PMOS and one NMOS transistor). The transistors in each DLC have modified Vth values so that each DLC switches at a different input voltage. In this way, the three DLCs work as a thermometer system. The DLC outputs take only the values 0 (GND) or 3 (VDD), according to the logic value applied to their inputs. Table 1 shows the DLC output logic values as a function of the input.
Table 1 Down Literal Circuits (DLCs) behaviour according to the logic value at the input
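A behavioural sketch of the DLC thermometer and the two-stage QLUT multiplexer tree may help here. It assumes that DLC i outputs 3 (VDD) while the input is below i and 0 (GND) otherwise; this is consistent with the statement that an input of 3 drives the output of DLC 3 to 0, but the exact switching points are an assumption, as are the function names and the ordering of the configuration values.

```python
GND, VDD = 0, 3


def dlc(level, x):
    """Down literal circuit `level` (1, 2 or 3): VDD while x < level, else GND (assumed)."""
    return VDD if x < level else GND


def mux4(x, c0, c1, c2, c3):
    """Quaternary 4:1 multiplexer whose select lines come from the three DLCs."""
    d1, d2, d3 = dlc(1, x), dlc(2, x), dlc(3, x)
    if d1 == VDD:      # thermometer code (3, 3, 3): x == 0
        return c0
    if d2 == VDD:      # thermometer code (0, 3, 3): x == 1
        return c1
    if d3 == VDD:      # thermometer code (0, 0, 3): x == 2
        return c2
    return c3          # thermometer code (0, 0, 0): x == 3


def qlut2(config, y1, y0):
    """2-input QLUT: 16 configuration values, two multiplexer stages."""
    stage1 = [mux4(y0, *config[4 * i:4 * i + 4]) for i in range(4)]
    return mux4(y1, *stage1)


# Configure the QLUT as a quaternary adder digit: (y1 + y0) mod 4.
sum_config = [(a + b) % 4 for a in range(4) for b in range(4)]
thermometer = [(dlc(1, x), dlc(2, x), dlc(3, x)) for x in range(4)]
```

Only two multiplexer stages are traversed, compared with four in the 4-input BLUT.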


In general, FPGAs are sets of programmable Configurable Logic Blocks (CLBs) and interconnections. The CLBs contain LUTs to implement logic, and storage elements. CLBs in the Xilinx Spartan-3 FPGA family are composed of two independent groups of two slices. A slice is a logic/storage unit. Routing among logic blocks is performed through programmable switch matrices. The group formed by a switch matrix and a CLB is called a tile.

The FPGA Logic blocks
In this work we propose a new quaternary logic block targeting arithmetic functions. Figure 3 illustrates the structure of the binary and quaternary logic blocks of the FPGA, configured to implement the sum of variables X and Y. The binary logic block (Figure 3(a)) represents two slices of the Xilinx Spartan-3 FPGAs.
Carry lookahead is implemented by propagating the carry signal through two multiplexers from Cin to Cout. The carry propagation signal is defined by an XOR function implemented by the BLUT. Otherwise, the carry is taken from one of the inputs.
Figure 3(a) Binary Logic Block
We developed the quaternary logic block following the same idea, but considering quaternary functions (Figure 3(b)). The QLUT implements functions of 2 variables as a generalization of the 4-input binary LUT. Table 2 shows the signal S, which implements the sum of X and Y, and Cout as a function of X, Y, S and Cin.
Figure 3(b) Quaternary Logic Block
The carry propagation/generation in the quaternary element is defined by a modified multiplexer, in such a way that Cout is a function of the input signals X, Y and the QLUT output signal as well. The Quaternary Carry Propagation (QCP) logic is illustrated in Fig 4 and implements the Cout function.
The QCP logic is divided into two parts. The first part is the carry propagation detection (i.e., generation of the function Cout = Cin). For this, the same DLC 3 used in the QLUT (Figure 2(b)) is used in the QCP to generate the signal S3. S3 enables the propagation of the carry whenever the QLUT output S = 3, which implies S3 = 0. See Table 1 for further details.
Figure 4 Quaternary Carry Propagation (QCP) Logic

Interconnections
The FPGA structure is composed of a fully programmable network connecting CLBs, IOs and other FPGA components. To increase the efficiency of FPGA routing, four types of interconnect are present in Xilinx Spartan-3 FPGAs: long lines, hex lines, double lines and direct lines. We model the FPGA interconnections as distributed RC networks based on the Predictive Technology Model (PTM) parameters.
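As an illustration of what a distributed RC wire model implies for delay, the sketch below computes the Elmore delay of a uniform RC ladder. Elmore delay is a common first-order estimate and is used here as an assumption; the paper does not state which delay expression was applied to the PTM-based networks, and the parameter names and unit values are hypothetical.

```python
def elmore_delay(r_total, c_total, segments, r_driver=0.0, c_load=0.0):
    """Elmore delay of a wire split into `segments` equal RC sections."""
    r_seg = r_total / segments
    c_seg = c_total / segments
    delay = 0.0
    for i in range(1, segments + 1):
        # Each node capacitance is charged through all upstream resistance.
        upstream_r = r_driver + i * r_seg
        delay += upstream_r * c_seg
    # The load at the far end sees the driver plus the full wire resistance.
    delay += (r_driver + r_total) * c_load
    return delay


lumped = elmore_delay(1.0, 1.0, segments=1)          # single lumped RC section
distributed = elmore_delay(1.0, 1.0, segments=1000)  # approaches R*C/2
```

With many segments the wire term tends to half of the lumped R·C product and grows quadratically with wire length, which is why shorter and fewer wires pay off in delay.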


Table 2 The QLUT output S and Cout functions
Carry generation applies when the carry is not propagated (S ≠ 3) and is decided by two conditions, K1 and K2, which are generated by quaternary logic gates. These conditions determine whether Cout = 1 or Cout = 0. K1 sets Cout = 1 when X = 3 or Y = 3, and K2 sets Cout = 1 when X ≥ 2 and Y ≥ 2. Otherwise, Cout = 0.
A D-type flip-flop (FF) is also present in the quaternary logic block. The FF is composed of quaternary inverters.
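The carry rules can be checked exhaustively with a behavioural model. The sketch below is not the transistor-level QCP circuit. It assumes that the QLUT output is S = (X + Y) mod 4, that K1 and K2 are consulted only when the carry is not propagated (S ≠ 3), and that the final sum digit is formed by adding Cin to S outside the QLUT. All function names are illustrative.

```python
def qlut_sum(x, y):
    """QLUT configured as a quaternary adder digit (assumed: (x + y) mod 4)."""
    return (x + y) % 4


def qcp_cout(x, y, cin):
    """Quaternary carry propagation/generation for one digit position."""
    s = qlut_sum(x, y)
    if s == 3:                 # propagation: Cout = Cin
        return cin
    k1 = x == 3 or y == 3      # K1: one operand is 3
    k2 = x >= 2 and y >= 2     # K2: both operands are at least 2
    return 1 if (k1 or k2) else 0


def add_quaternary(xs, ys, cin=0):
    """Add two digit lists (least significant digit first) through the carry chain."""
    digits, carry = [], cin
    for x, y in zip(xs, ys):
        digits.append((qlut_sum(x, y) + carry) % 4)  # assumed final sum stage
        carry = qcp_cout(x, y, carry)
    return digits, carry


# Every input combination agrees with ordinary base-4 addition.
carry_ok = all(
    qcp_cout(x, y, c) == (x + y + c) // 4
    for x in range(4) for y in range(4) for c in range(2)
)
```

For example, `add_quaternary([3, 1], [1, 2])` adds 7 and 9 and yields the digits `[0, 0]` with a carry out of 1, i.e. 16.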
In several computationally intensive operations, notably Finite Impulse Response (FIR) filters, the same input is multiplied by a set of constant coefficients. This operation is called Multiple Constant Multiplications (MCM). MCMs are commonly used in Digital Signal Processing (DSP) applications and are an important means of reducing power consumption, owing to the high level of sharing of operations and the possibility of implementing multiplications using only additions, subtractions and shifts.
Figure 5 illustrates the implementation of a filter with 4 taps, in which the sharing of partial terms can be seen. The input x is multiplied by the constants 117, 100, 13 and 36.
Figure 5(a)
Figure 5(b) Filter With 4 Taps
For each set of constant coefficients there is a wide range of possible mapping solutions. In this example, instead of using two adders per coefficient (Figure 5(a)), the adder that generates the value 3x is shared in order to reduce the number of adders.
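One possible shift-and-add mapping for the constants 117, 100, 13 and 36 that shares the 3x term is sketched below. The decomposition is worked out here for illustration and is not necessarily the one drawn in Figure 5(b); the function name is hypothetical.

```python
def mcm_products(x):
    """Multiply x by 117, 100, 13 and 36 using only shifts, adds and subtracts."""
    x3 = (x << 1) + x             # 3x   = 2x + x      (shared term)
    x13 = (x << 4) - x3           # 13x  = 16x - 3x
    x36 = (x3 << 3) + (x3 << 2)   # 36x  = 24x + 12x
    x25 = (x3 << 3) + x           # 25x  = 24x + x
    x100 = x25 << 2               # 100x = 4 * 25x     (shift only)
    x117 = (x13 << 3) + x13       # 117x = 104x + 13x
    return {117: x117, 100: x100, 13: x13, 36: x36}


products = mcm_products(5)
```

This mapping uses five adders or subtractors in total, and all other operations are shifts, which cost only wiring.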
The placement & routing of the filters is very simple to implement in FPGAs. Operators are placed in the CLB columns in order to take advantage of the fast carry lookahead chain. Horizontally, CLBs are placed according to the succession of operators.
Our experiments were carried out on several filters with 8-bit random coefficients. Results were obtained with the Xilinx synthesis tool. They show a significant 16% reduction in power consumption (PWR) with no penalty on timing (Freq).
The operating frequency is not much lower in the quaternary implementation; the difference comes from the number of CLBs in the critical path. In binary implementations of the filters, the number of bits may increase by only one from one adder to the next. Hence, only a slice (not a complete CLB) is inserted in the critical path. For the quaternary version, the critical path is increased by the delay of a full CLB, because the CLB cannot be split in two as in the binary case.
The circuits show important gains because of the smaller bus width, but also because shift operations can be performed with fewer vertical connections. In this way, overall performance can be increased, since fewer switches are present in the critical path.
This work presents important advances in the development of multi-valued circuits through the implementation of a transistor-level, arithmetic-oriented quaternary FPGA structure. Results show that the proposed quaternary FPGA is competitive with the binary one because of the important reductions in connection size and number of switches, and their effect on power consumption and circuit performance.
In this paper we have shown that a significant power reduction and an increased frequency can be achieved by a quaternary device. The increased frequency is obtained by implementing random logic in the quaternary LUT, which makes it possible to reduce the total number of CLBs without increasing the number of CLBs in the critical path. The quaternary representation applied to random logic allows not only a reduction in the number and size of the connections but, most importantly, a reduction in the fanout and the load applied to the logic blocks.
The authors would like to thank Sasi Institute of Technology and Engineering, Ashwan Kumar Karrolla of cedronics, and the reviewers.
