 Open Access
 Total Downloads : 462
 Authors : Padma V, Dr. Sudhakara Reddy Penubolu
 Paper ID : IJERTV3IS071310
 Volume & Issue : Volume 03, Issue 07 (July 2014)
 Published (First Online): 31072014
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
CORDIC Based DFT on FPGA for DSP Applications
Padma. V1
PG Scholar, Department of E.C.E SKIT College
Srikalahasti, India
Sudhakara Reddy. P2 Member IEEE Associate Professor, Department of E.C.E SKIT college
Srikalahasti, India
Abstract Discrete Fourier Transform (DFT) is a very useful algorithm, playing an important role in various digital signal processing (DSP) applications from radar, sonar, telecommunication, image processing etc. The fast Fourier transform (FFT) is a class of algorithms for efficiently computing DFT. So, the FFT is also widely used in many DSP applications. The computation complexity of DFT is drastically reduced by using FFT. The multiplications as plain rotation in the DFT/FFT algorithm, it is possible to apply a pipelined coordinate rotation digital computer (CORDIC) algorithm for the implementation of DFT/FFT. Hence, the CORDIC technique is applied for calculating a vector rotated through a given angle for computation of FFT. Further by exploiting some trigonometric identities in the DFT/FFT computation CORDIC rotators are effectively used. Here, the implementation of CORDIC based 16 point DFT on FPGA for various DSP applications is presented. The data rate of the DFT is depends only on carry look ahead adder used and number of pipelining stages used. With the advent of VLSI Technology, it is possible to perform each pipeline step is less than 25ns and also increases the throughput as per the input data rate. In this we simulate and implement a pipelined CORDIC DFT using FFT algorithm on FPGA for DSP applications. By rewriting the DFT, for = point DFT, it
requires + real multiplications and
+ + real Additions for a real data. The computation of multiplications in DFT is done as plane rotations. So, it is possible to apply a pipelined CORDIC algorithm in a hardware implementation of a longpoint DFT using of a single CORDIC pipeline.
Keywords CORDIC, DFT, DSP, FFT etc.

INTRODUCTION
Discrete Fourier transform (DFT) and Inverse Discrete Fourier transform (IDFT) are used in a wide range of applications, such as, signal processing, image processing, data compression, etc., for transforming signals from time domain to frequency domain and viceversa. Exponential increase in networking bandwidth and storage capacity has created a need for high throughput DFT/IDFT circuits. Further, low complexity circuits are preferable over high complexity circuits, since, in general, they consume less power and are less expensive from the silicon realestate perspective. Hence, low complexity DFT/IDFT circuits that have a high throughput are desirable. The Discrete Fourier Transform (DFT) is a very useful algorithm, playing an important role in various digital signal processing (DSP) applications. According to the definition, for – point DFT/IDFT, 42 real multiplications are needed. To reduce the computational
complexity of DFT, it is required to reduce the number of multiplications and additions in the DFT. The fast Fourier Transform (FFT) algorithm is the best solution why because it uses 2+1 + n + s real multiplications for = 2 – point FFT. In addition to software realizations of FFT, various hardware implementation of FFT have been designed to meet the high speed realtime requirements of DFT applications. Briefly, an FFT is an algorithm which provides a relatively fast means of computing a DFT by reducing the number of
operations from approximately 2 to 2 2 as compared to a DFT; this is generally accomplished by reducing the redundant operations and of course also provides a corresponding decrease in computation processing time[1],[12].
Fig.1 Signal flow graph (SFG) of the computation of 8point FFT
The inputs, (0) through (7) and generally (), are a time series of discrete samples. The corresponding outputs are frequency components i.e (0) to (7).
Fig.2 The basic building block of an FFT
The basic building block of an FFT as shown in Figure. 2 is a vector rotation by a complex weighting coefficient, , and a sum and difference of the result with another vector. If the un rotated vector is A = 1 + j1 , the rotated vector is
B = 2 + j2 , and = , the output of the basic building
1
1
2
block is given
= + + + [
+ ]
=
=0
1 2 2
1 2 2
1
1
= 1 2 + 2 + [1 2 2 ]
= 1 2 1 2 , (4)
(1)
As can be seen from the equation above, A' and B' can be
=0
1
2
=0
calculated in a digital computer by a combination of four multiplies and six adds in addition to looking up the
=
=0
trigonometric values in a table 1. Recently, the coordinate rotation digital computer (CORDIC) technique was introduced as a method of calculating a vector rotated through a given
1
= ()
=0
2
1
()
=0
2
, (5)
angle. This technique has also been applied to FFT's.
Table.1 Trigonometric values in a Look up table
Where all () are real data and then are complex,
= + ; , = 0,1,2, . . 1 (6)
for the computation DFT/IDFT we shall only discuss the following expressions
= 1 2 ,
=0
= 1 2 ,
=0
, = 0,1,2, . . 1
According to DFT/FFT properties, the symmetry property further reduces the computational complexity, that means = () and = (), can be limited to the interval 0 to 2 . So, the and
= 0,1,2, . .
2
1 are
= /21[ + ( )] 2
However, there is still a need for a CORDIC unit and a digital
=1
+ +
2
(0) (7)
processor in general wherein increased throughputs can be realized. Furthermore, some applications require that an FFT processor with a high throughput rate also have the capability to be programmable such that FFT's of different numbers of points can be performed. In CORDIC, The plane rotations are involve mainly two trigonometric factors of the expression
= /21[ + ( )] 2 (8)
=1
To continue the splitting, finally for each , then () and
() for are shown as
= , 3, 5, ,
2
() = 8( ), (9)
1
2
= =0
2
(2)
() = 8
8
. (10)
=0
= 1
(3)
The above computations are iterated until =
, this
are allowing the sums to be split in halves and recombined
containing trigonometric factors and compute
8
( ) and (),
recursively. The resulting recursive equations are fairly simple, employing plane rotations applied to terms referenced by indices obtained by reversing certain bit patterns. Because of the way multiplications and additions are organized as plane rotations in our DFT algorithm, it is possible efficiently to apply a pipelined CORDIC algorithm in a hardware implementation of the longpoint DFT. [11]
In section II mathematical representation of DFT/IDFT for both real and complex data are shown. Section III iscusses CORDIC based DFT/IDFT and possible hardware implementations, In section IV are explained the results analysis and implementation. Finally, Section V gives conclusion.

MATHEMATICAL REPRESENTATION OF DFT/IDFT

REAL DATA DFT/IDFT TRANSFORMATIONS
The = 2 real data DFT/IDFT transformations are defined as follows:
= 0,2, 4. It is observed that, the above expressions containing trigonometric factors in the form of plane rotations, which can be realized by the CORDIC algorithm.
For the DFT: = =
() = () (),
( ) = () + () (11) For IDFT: () = () ,
() = ()
() = () () ,
( ) = () + (). (12) A diagram of the computations for = 64 DFT using the above expressions is shown in Figure 1. Now we can calculate how many multiplications and additions are needed for an
= 2 DFT/IDFT. For any value of L, the number of CORDIC operations for computing () is
2
1 + . (13)
16 8
When runs through the values 1,2,4, , /8, the total
number of all CORDIC operations is then
1 +
=
+
(14)
1 1
2
16
2 16
8 2 16 2 2
fi n jfr n sin ,
(16)
All A(n) and B(n) values can be split to find an equivalent number of additions and multiplications:
=0
2
2(8 2 16 +2 2 ) = [2
( 5) + 2( 1)] +
1
2
[22 2( 2) + 2( 2)] (15)In which each of 2( 2) CORDIC operations computing
F n = F k e +
=0
1
1(0) and
1
need 1 multiplication and 2 additions, each
F k + jF k cos 2
(8)
r i
of 22 2( 2) CORDIC operations whose angle is just
equal to need only2 multiplications and 2 + 4 additions , and
=0
1
2
4
each of 22(n 5) + 2(n 1) CORDIC operations can
Fi k + jFr k sin .
=0
(17)
be realized by using 3 multiplications and 3 + 4 additions. Therefore the total number of real multiplications required 22(3n 13) + 4n 2 , and the number of real additions required is 22(7n 29) + 6n + 2. The traditional Fast Fourier Transform (FFT) uses 2n2 real multiplications and 3n2 additions. The complex multiplication is realized by 3 real multiplication and 3 real additions. We need only 21(n 3) + 2 multiplications and 21(3n 5) + 4 additions. Table 1 shows a comparison between these and our method for different values of . From the table we see that for large values of N our method use a few more multiplications, but fewer additions. But the total number of operations is lower, which is important if addition and multiplication take the same time, as in many DSP Processors. In the above we have just discussed the DFT/IDFT for real data.
Table 2. Number of real multiplications in FFT and CORDIC based FFT for DFT algorithms.
Table 3. Number of real additions in FFT and CORDIC based FFT for DFT algorithms.

Complex data DFT/IDFT transformations:
For complex data
f(n) = + ,
So for complex data, using the new algorithm, the number of real multiplications and real additions is twice the number of operations for real data. Compared to the traditional FFT/IFFT algorithm, the new algorithm requires more data address computations, as shown in equations (2),(3), (4) and (5). However, it is easy to realize these data address computations
1
in Hardware or by table lookup, noting that aj 21
=1
in binary is just a part of the reversed bit pattern of the binary
1
representation for = aj 2 .
=0


CORDIC BASED DFT/IDFT
In CORDIC based DFT/IDFT using FFT/IFFT algorithm, it is found that, the number of real multiplications and real additions is respectively reduced to 22(3 13) + 4 2 and 22(7 29) + 6 + 2 for real data DFT. The main advantage of this proposed algorithm is to reduce the total number of floating point operations is very less and smaller. Hence, first it is required to rewrite the DFT in terms of FFT either in decimationintime (DIT) or decimationinfrequency (DIF) secondly, expresses the FFT computations in terms of the plane rotations. The number of such rotations are 22( 4) + 2( 1) needed for = 2 – point FFT. The number of additions and multiplications quoted above corresponds to the implementation of each rotation as a complex multiplication.
Table 4. Comparison of some performance parameters
We iteratively use 16 butterfly processors. Figure 3 shows one butterfly processor architecture. Finally all F(k) are calculated(Note actually only 32 values of
() = () () are calculated, since ( ) =
we have , F(k) =
1
1 f n e
=0
2
() + (), the other 32 values are easily obtained). Considering all types of butterfly processors
1 1
2
computations, it is possible to design a butterfly processor
= fr n + jfi n cos +
=0
based on standard multipliers as shown in Figure 3.
Where c1, sl, s2 are used to control the compu tations. Of the 16 butterfly processors, 6 needs five multipliers and the other 10 need four multipliers.
Fig.3.The structure of a butterfly processor
For IDFT, = and =
Bec a us e () = () () a nd ( ) = () +
() for IDFT, an extra step is needed at the end for those computations. The remain ing computations are the same as for the DFT. In order to conveniently construct a long point DFT/IDFT by using a pipelined CORDIC processors. It is better design comparatively standard form of DFT/IDFT. A CORDIC computation performs a vector rotation:
Fig.5 Circular rotation mode
Fig.6 Hyperbolic rotation mode
=
, (18)
The realization of CORDIC based DFT can be
obtained by using hyperbolic rotation mode as shown in
figure 6.
by a number of micro rotations using a predetermined set of angles Ã˜ derived from . CORDIC processors would be needed to achieve that input rates.
Fi g . 7 . A P i p e l i n e d C O R D IC s t r u c t u r e
The pipelined CORDIC structure is shown in Figure 7, in which the leftmost part is controlled by 0 , 1 , , performs the equivalent of four multiplications, one addition and one subtraction. But us ing the CORDIC algorithm, the results must be multiplied by a constant
1
1.6467603,
It is approximated as
1
1 (1 +
2
1)(1
4
1 ) (1 +
32
1
256
)(1
1 )
1024
Fig.4 64point CORDIC based DFT computing algorithm
We can see from the above analysis that the DFT/IDFT can be calculated using CORDIC computations and additions/subtractions. This was actually our main motivation for the derivation o f CO RDI C bas ed met hod. It i s q uite e ffic ient for a high speed long point DFT/IDFT.
Fig.9 Simulation results
Initially the DFT was decomposed in terms of COS and SINE terms by using Eulers formula, then for the computation of these trigonometric components we used pipelined CORDIC processor. For hardware implementation, we written Verilog code and compiled by using ModelSim software. Further simulated and synthesized by using Xilinx ISE design suite version 12.0 and implemented on Spartan 6.0 FPGA. Finally synthesis report and delay report are noted down. From the results it is observed that, the total real time taken for execution is 1.00secs, the total CPU time taken for execution is 0.94 secs. The macro statistics of CORDIC requires only single ROM, one 4×8bit ROM, 20adders and subtractions, three 8bit adders, one 8bit subtrction, 32 registers, eight 2 bit registers, 24 8bit registers and 2 multiplexers.
V. CONCLUSIONS
By exploiting some trigonometric identities in the DFT computation, the number of multiplications needed for DFT/IDFT has been reduced com pared to the traditional FFT. One way to realize the new algorithm in hardware, which is more efficient for a longpoint DFT, is to use a pipelined CORDIC algorithm, which simultaneously completes four multiplications. In the pipelined CORDIC structure, the data rate of the DFT/IDFT depends only on one carry lookahead adder as the critical component in the pipeline. At present integration technology, it is possible to perform each pipeline step is as minimum as possible.
REFERENCES

FENG ZHOU,A New Fast Discrete Fourier Transform ,Dept. of Information and Electronic Engineering, Zhejiang University, Hangzhov,P.R.China. PETER KORNERUP,Dept. of Mathematics and Computer Science, Odense University, 2002,Odense, Denmark

H. L. Groginsky et al., "A Pipeline Fast Fourier Transform", IEEE Trans. on Computers, vol. C19, No. 11, Nov. 1970, pp. 10151019.

J. E. Volder, "The Cordic Trigonometric Computing Technique", IEEE Trans. on Elect. Computers, vol. ECB, Sep. 1959, pp. 330334.

Y.H. Hu5 "CORDICBased VLSI Architecture for Digital Signal Processing," IEEE Signal Processing Magazine, pp.lJuly 1992

R. Sarmineto, F. Tobajas, et.al, "A CORDIC Processor for FFTComputations and Its Implementation Using Gallium Arsenide Technology," IEEE Trans, on VLSI Systems, pp.1830, vol.6, no.l,
March 1998

C. Ying, S.Chen and J. Chih, "Efficient CORDIC Designs for Multi
Mode OFDM FFT," ICASSP Page(s): 1036 1039, vol.3, May 2006

B . Heyne, J. Gotze, "A Pure CORDIC based FFT for Reconfigurable Digital Signal Processing, " 12th European Signal Processing Conference (EUSIPCO2004), Vienna, Austria, 2004

B . Heyne, J. Gotze, "CORDICBased algorithms for software defined Radio (SDR) baseband Processing, " Adv. Radio Sci. , 4, 179184, 2006

Kulkarni et al., "Reconfigurable VectorFFT/IFFT, Vector Multiplier/Divider, " Patent US 7,082,45 1 B2 HEYNE B ET AL: "A Pure Cordic Based FFT for Reconfigurable Digital Signal Processing" PROCEEDINGS OF THE EUROPEAN SIGNAL PROCESSING CONFERENCE, 6 September 2004 (20040906), pages 15131516,

KATSUTOSHI SEKI ET AL: "A CordicBased Reconfigrable Systolic Array Processor for MIMOOFDM Wireless Communications" SIGNAL PROCESSING SYSTEMS, 2007 IEEE WORKSHOP ON, IEEE, PI, 1 October 2007 (20071001), pages 639 644, XP031164039 ISBN: 9781424412211

Ayan Banerjee Anindya Sundar dhar, "FPGA realization of a CORDIC based FFT processor",Micro Processor and Micro system, vol. 25, Issue 3, May. 2001, pp. 131142.