Design and Verification of Scalable, Re-usable 16-Point IFFT Core for DSP Engine

Abstract: Advances in Digital Signal Processing (DSP) applications have increased the demand for both the Fast Fourier Transform (FFT) and the Inverse Fast Fourier Transform (IFFT). FFT and IFFT are primary building blocks of almost all Digital Signal Processing systems. The two are counterparts that operate in opposite directions. These fast transforms reduce the number of mathematical operations and the computational complexity, which makes hardware implementation feasible. This paper proposes the design and verification of a 16-point Decimation-In-Time (DIT) inverse FFT for both complex and floating point numbers. Experimental results show that the proposed 16-point IFFT architecture, which incorporates an approximate Radix-2 butterfly module, achieves good efficiency and high overall precision.


I. INTRODUCTION
The inverse FFT is a widely used Digital Signal Processing function with high computational complexity. The Inverse Fast Fourier Transform (IFFT) performs the inverse of the FFT: it converts a frequency-domain function into a time-domain function. It is used primarily in audio processing, video processing and filtering. Traditionally, IFFT processors have been implemented on a DSP or an ASIC. FPGAs make it possible to combine the features of both a DSP and an ASIC in the implementation of IFFT processors. The significance of the IFFT and its functions is discussed in [I].
In this paper, the IFFT block is built from the Decimation-In-Time (DIT) Radix-2 butterfly unit, which integrates complex adders, a complex subtractor and complex multipliers. The Radix-2 butterfly unit is considered the heart of any FFT or IFFT processing algorithm [II], so its operation dominates the IFFT processor. The proposed work covers the implementation and verification of a 16-point IFFT for complex numbers as well as floating point numbers.

A. General IFFT representation
The basic principle behind the IFFT algorithm is to break down an input sequence of length N into smaller sequences. Let X(k) be an N-point sequence, where N is a power of 2. The IDFT x(n) of the N-point sequence is given by

$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\, e^{j 2\pi k n / N}, \qquad n = 0, 1, 2, \ldots, N-1 \tag{1}$$

where
X(k) = frequency-domain samples
x(n) = time-domain samples
N = FFT size
k = 0, 1, 2, ..., N-1

The exponential term in Equation (1) is the twiddle factor required for the IFFT computations.
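As a point of reference, the inverse DFT definition can be evaluated directly in software before any fast algorithm is applied. The sketch below is a minimal Python illustration, not the hardware design of this paper; the function name `idft` and the test spectrum are chosen here for illustration only.

```python
import cmath

def idft(X):
    """Directly evaluate the N-point inverse DFT (O(N^2) operations)."""
    N = len(X)
    x = []
    for n in range(N):
        acc = 0j
        for k in range(N):
            # Twiddle factor e^{+j*2*pi*k*n/N}; the positive sign marks the inverse transform
            acc += X[k] * cmath.exp(2j * cmath.pi * k * n / N)
        x.append(acc / N)  # 1/N scaling of the inverse transform
    return x

# Illustrative input: a flat spectrum of ones corresponds to a unit impulse in time
x = idft([1 + 0j] * 16)
```

A direct evaluation like this needs on the order of N^2 complex multiplications, which is the cost the radix-2 decomposition is meant to reduce.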

B. Proposed Solution
To meet the precision requirements of a higher-order IFFT block, the first step is bit reversal of the inputs given to the IFFT block (converted to IEEE 754 format), followed by complex addition, complex multiplication and complex subtraction. Floating point numbers are already in IEEE 754 format and need no conversion. A floating point number is represented by a significand scaled by an exponent; the base used for scaling in this paper is 2.
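The significand-and-exponent representation can be made concrete with a short Python sketch that splits an IEEE 754 single-precision value into its sign, biased exponent and fraction fields. The helper names `float_to_fields` and `fields_to_float` are hypothetical and are not part of the proposed design; the reconstruction assumes a normalized number.

```python
import struct

def float_to_fields(value):
    """Pack value as IEEE 754 single precision and split the 32 bits into fields."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 stored fraction bits
    return sign, exponent, fraction

def fields_to_float(sign, exponent, fraction):
    """Rebuild a normalized value: (-1)^sign * 1.fraction * 2^(exponent - 127)."""
    significand = 1 + fraction / 2**23   # implicit leading 1 of a normalized number
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)

# -6.5 = -1.625 * 2^2, so the stored exponent is 2 + 127 = 129
fields = float_to_fields(-6.5)
```

Here the base of the scaling is 2, matching the text: the value is the significand multiplied by 2 raised to the unbiased exponent.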

C. Organization of the Paper
Section II describes the system integration and gives a brief introduction to the approach adopted in the design. Section III covers the implementation of the proposed work. Section IV presents the results obtained. Section V gives the conclusion and the future scope of the proposed design.

II. SYSTEM INTEGRATION
The major operating block of the IFFT is the butterfly block. Designing a 16-point IFFT block also requires 8-point, 4-point and 2-point butterflies, of which the 2-point butterfly is the basic block. All these sub-blocks are integrated together. The basic Radix-2 butterfly unit is shown in Fig. 1. Ar, Aj and Br, Bj are the real and imaginary parts of the inputs a and b respectively. C and D are the outputs, with their real and imaginary parts as given in Eqs. 3 and 4. In Fig. 1, a and b are the complex inputs from the preceding stage, while C and D are the complex outputs of the present stage (or the complex inputs to the subsequent stage). The twiddle factors WN are the coefficients applied to the results of the preceding stage to produce the inputs to the next stage of the IFFT algorithm. The only difference between the FFT butterfly and the IFFT butterfly is the position of the twiddle factor, as can be seen in Fig. 1.
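The butterfly arithmetic can be sketched in Python on separate real and imaginary parts, which mirrors how the complex adder, subtractor and multiplier would operate on real-valued signals. This sketch assumes the common DIT form C = a + W·b and D = a − W·b; the exact arrangement in Fig. 1 may differ. Ar, Aj, Br, Bj follow the naming in the text, while Wr, Wj (the twiddle factor parts) and the function name `butterfly` are introduced here for illustration.

```python
def butterfly(Ar, Aj, Br, Bj, Wr, Wj):
    """Radix-2 DIT butterfly on real/imaginary parts (assumed form: a +/- W*b)."""
    # Complex multiply T = W * b: four real multiplies, one subtract, one add
    Tr = Wr * Br - Wj * Bj
    Tj = Wr * Bj + Wj * Br
    # Complex adder gives C, complex subtractor gives D
    Cr, Cj = Ar + Tr, Aj + Tj
    Dr, Dj = Ar - Tr, Aj - Tj
    return (Cr, Cj), (Dr, Dj)

# Illustrative values: a = 1 + 2j, b = 3 + 4j, W = j
C, D = butterfly(1, 2, 3, 4, 0, 1)
```

Splitting the operation this way shows that one butterfly costs four real multiplications and six real additions or subtractions.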

III. IMPLEMENTATION
This section explains the algorithms used to implement the 16-point IFFT, taking into account all the stages required to build an IFFT system. The 16-point IFFT architecture is shown in Fig. 2. The following aspects are considered in implementing the proposed design.
• Bit reversal is performed on the input sequence to improve the speed of the computations.
• The design has four stages; in each stage, complex addition, subtraction and multiplication are performed on complex values.
• The design also handles IEEE 754 single-precision floating point numbers, so floating point addition, subtraction and multiplication are also required.
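The steps listed above (bit reversal followed by log2(16) = 4 butterfly stages) can be sketched in software as follows. This is a behavioural Python model under the usual radix-2 DIT assumptions, not the RTL of the proposed design; the function names `bit_reverse` and `ifft_radix2` are hypothetical, and the 1/N scaling is applied once at the end as an assumption.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i, e.g. 0001 -> 1000 for bits=4."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def ifft_radix2(X):
    """Radix-2 DIT inverse FFT for a power-of-2 length (four stages when N = 16)."""
    N = len(X)
    stages = N.bit_length() - 1
    assert N == 1 << stages, "length must be a power of 2"
    # Reorder the input into bit-reversed index order
    x = [X[bit_reverse(i, stages)] for i in range(N)]
    size = 2
    while size <= N:                      # one pass per stage
        half = size // 2
        for start in range(0, N, size):
            for k in range(half):
                # Inverse twiddle factor e^{+j*2*pi*k/size}
                w = cmath.exp(2j * cmath.pi * k / size)
                a = x[start + k]
                t = w * x[start + k + half]
                x[start + k] = a + t          # complex addition
                x[start + k + half] = a - t   # complex subtraction
        size *= 2
    return [v / N for v in x]             # 1/N scaling

# Illustrative input: a single nonzero bin at k = 1 gives one cycle of a complex exponential
x = ifft_radix2([0, 16] + [0] * 14)
```

Because the loop bounds depend only on the input length, the same model covers 2-, 4- and 8-point transforms, which is in the spirit of the scalable, re-usable structure described in this paper.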
The RTL schematic of the 16-point IFFT for complex numbers is shown in Fig. 3 and Fig. 4.

IV. RESULTS
The inputs given to the IFFT block are depicted in Fig. 5. The simulation results for the 16-point IFFT are shown in Fig. 6. Verification is done using ModelSim, and the resulting values are shown in Figs. 7 and 8. The technology schematic of the proposed work is shown in Fig. 11.

Fig. 11. Technology Schematic of 16-point IFFT

V. CONCLUSION AND FUTURE SCOPE
This paper shows that a 16-point IFFT block for complex numbers as well as floating point numbers can be implemented efficiently and with very good accuracy using the radix-2 butterfly architecture. From the extracted results, the estimated area was found to be 61% and the delay 42.938 ns. Since the design is scalable and re-usable, the future scope of this work is to design an N-point IFFT and to compare the computations for complex and floating point numbers, along with the associated area and delay.