[go: up one dir, main page]

0% found this document useful (0 votes)
31 views5 pages

FPGA Implementation of Fast Fourier Transform

The document discusses the implementation of the Fast Fourier Transform (FFT) using Field Programmable Gate Arrays (FPGAs), focusing on the radix-2 Decimation-In-Time (DIT) FFT algorithm. It highlights the advantages of using FPGAs for efficient computation in various applications, including communications and signal processing, and compares the logic utilization of 16-bit and 32-bit FFT designs synthesized in Verilog. The results indicate that while both designs are effective, the 32-bit implementation requires more resources, reflecting increased complexity and utilization.

Uploaded by

sauravir248
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views5 pages

FPGA Implementation of Fast Fourier Transform

The document discusses the implementation of the Fast Fourier Transform (FFT) using Field Programmable Gate Arrays (FPGAs), focusing on the radix-2 Decimation-In-Time (DIT) FFT algorithm. It highlights the advantages of using FPGAs for efficient computation in various applications, including communications and signal processing, and compares the logic utilization of 16-bit and 32-bit FFT designs synthesized in Verilog. The results indicate that while both designs are effective, the 32-bit implementation requires more resources, reflecting increased complexity and utilization.

Uploaded by

sauravir248
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 5

FPGA Implementation of Fast Fourier Transform

1
Vanmathi.K, 2 Sekar.K, 3 Remya Ramachandran

1. Associate Professor, Department of EEE, Hindusthan College of Engineering and Technology,


Coimbatore, Tamil Nadu.
2. Assistant Professor, Department of EEE, Hindusthan College of Engineering and Technology,
Coimbatore, Tamil Nadu.
3. PG Scholar, Department of EEE, Hindusthan College of Engineering and Technology, Coimbatore,
Tamil Nadu. rcremya471@gmail.com

Abstract - The Fast Fourier Transform is one of the crucial to minimize the power consumption and the
most widely used digital signal processing amount of hardware [14].
algorithms. It is used to compute the Discrete A new efficient FFT algorithm for
Fourier Transform and its inverse. As a result, these OFDM/DMT applications can achieve higher
are widely used for many applications in processing rate and better efficiency than the
engineering, science, and mathematics which conventional algorithm and is very suitable for the
include areas such as: communications, signal OFDM/DMT applications such as the WLAN,
processing, instrumentation, biomedical DAB/DVB, and ADSL/VDSL systems [6]. Several
engineering, numerical methods, sonics and communication systems need medium resolution
acoustics, and applied mechanics. It is described as (9–12 bits) analog-to-digital converters (ADCs)
the most important numerical algorithm of our with bandwidths in the tens of MHz range [1]. The
lifetime. The number of applications for this main reason for its widespread use is the existence
transform continues to grow. The Decimation-In- of efficient techniques for its computation.
Time radix-2 FFT using butterflies is designed. The Field Programmable Gate Arrays (FPGAs) are
butterfly operation is faster. The outputs of the programmable semiconductor devices that are
shorter transforms are reused to compute many based around a matrix of Configurable Logic
outputs, thus the total computational cost becomes Blocks connected through programmable
less. The 16 bit and 32 bit inputs are synthesized interconnect. The speeds can be very fast, and
using Verilog. The logic utilization obtained from multiple control loops can run on a single FPGA
the design summary of 16 and 32 bit radix-2 DIT device at different rates. FPGAs can be
FFT can be compared. The utilization factor programmed to the desired application or
increases as the number of bits increases. The functionality requirements. Xilinx FPGAs allow for
design is developed using hardware description field upgrades to be completed remotely,
language Verilog on Xilinx 14.2 xc3s500E. The eliminating the costs associated with re-designing
spartan3-tyro plus is used as hardware to or manually updating electronic systems. For
implement the complex FFT values. implementing reconfigurable, industrial electronics
systems FPGAs are commonly used. [11]. The
Keywords – Fast Fourier Transform (FFT), FPGA implementation of a common FFT can
Decimation in Time (DIT), Field Programmable perform transforms over finite and infinite fields.
Gate Array (FPGA) This work presents the implementation of the
radix-2 Decimation in Time (DIT) FFT in an FPGA
I. INTRODUCTION using HDL. The design summary of 16 and 32 bit
FFT input numbers are compared. The spartan3-
tyro plus in the FPGA Spartan 3E family is used as
A Fast Fourier Transform (FFT) is an hardware to implement the complex FFT values.
algorithm to compute the Discrete Fourier
Transform (DFT) and its inverse. A Fourier
transform converts time (or space) to frequency and II. RADIX-2 DIT FFT
vice versa; an FFT rapidly computes such
transformations. As a result, FFTs are widely used A. Fast Fourier Transform (FFT)
for many applications in engineering, science, and
mathematics. FFTs have been described as the The proposed method is based on radix-2 FFT
most important numerical algorithm of our lifetime. computed using butterfly diagrams. The DIT radix-
The signal processing applications need high 2 FFT recursively partitions a DFT into two half-
throughput, so the aim of the fused elements is to length DFTs of the even-indexed and odd-indexed
reduce the circuit area first and then to reduce the time samples. The outputs of these shorter FFTs are
delay [12]. The reduction of FFT complexity is reused to compute many outputs, thus greatly

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY CALICUT. Downloaded on April 15,2025 at 16:39:05 UTC from IEEE Xplore. Restrictions apply.
reducing the total computational coost. This FFT is y0 = x0 + x1 (5)
then implemented in FPGA. y1 = x0 - x1 (6)
Cooley and Tukey first introduuced the concept
of FFT by efficiently making use ofo symmetry and where k is an integer dependding on the part of the
periodicity properties of the twiddlle factors for its transform being computed [13].An
[ IEEE floating-
significant computational reductionn [2]. The FFT point adder accepts normaliized numbers, supports
and inverse FFT algorithms are the important all four IEEE rounding moodes, and outputs the
processing blocks used for data conversion
c from correctly normalized roundedd sum/difference in the
time-to-frequency domain (FFT T) and from format required by the IEEE Standard [9].
frequency-to-time domain (IFFT)) [4]. In most
digital circuits, the modulators annd demodulators B. 16 BIT RADIX-2 DIT
D FFT
can be implemented using simple FFTF blocks. FFT
processor’s effectiveness plays an immportant role in The length-16, DIT FF FT with the input data
optimizing system performance. It is able to scrambled and output data in order of the 16 bit
perform flexible-size FFTs which a processor is butterfly diagram is shown inn Figure 1. The twiddle
desired [5]. factor computation is donne at each stages of
An FFT is a way to compute the DFT more butterfly diagram. This 16 bit
b radix-2 DIT FFT is
quickly. Computing the DFT of N points in the designed and simulated in i Spartan 3E using
naive way using the definitionn takes O(N2) Verilog synthesis.
arithmetical operations while a FF FT can compute
the same DFT in only O(N log N) operations. The
difference in speed can be enormous.e The
computation time improvement is proportional to N
/ log (N). FFT becomes a commonn operator for a
terminal which supports several standards
s where
many frequency domain algorithm ms are involved
[3]. A Wireless Personal Area Netw work connection
for uncompressed video at a small cost is obtained
using the multiband OFDM UWB B proposal [10].
There are two types of computaational elements
called butterfly units. DIT computaational elements
consist of complex multiplication(ss) followed by a
sum and difference network. Decimation In
Frequency (DIF) computational elem ments consist of
a sum and difference network follow wed by complex
multiplication(s).
The DFT for a sequence of lenggth N is defined Figure 1: 16 Bit FFT Butterfly
B Diagram
by
C. 32 BIT RADIX-2 DIT
D FFT
п /
X(k)=∑ ,
0 , … … ., N-1 (1) The processing of the daata is same as in 16 bit
radix-2 DIT FFT. The onlyy difference is that the
Where x(0),...., x(N-1) be com mplex numbers length is 32. The number of butterfly unit for
[7].Using the twiddle factor, the DFT
D can be re- processing also increases. The complexity will
written as increase. This is designed and simulated using
Verilog synthesis in Spartan 3E.
X(k) ,
III. SYNTHESIS RESULTS AND
0 , … … ., N-1 (2) DISCUSSION
Direct computation of the DFT T is basically
inefficient because it does not explooit the symmetry The 16 and 32 bit raadix-2 DIT FFT are
and periodicity properties of the phhase factor WN. synthesized in Spartan 3E starter board as the
These properties are [8]: evaluation development booard. The family is
Spartan 3E, the device ussed is xc3s500E, the
/ package is FG320 and the sppeed is -4. The top level
Symmetry property: (3) source type is HDL, thhe synthesis tool is
Periodicity property: = (4) XST(VHDL/Verilog), thhe simulator is
Isim(VHDL/Verilog) and the VHDL source
FFT uses butterfly operations for coomputations. For analysis standard is VHDL-993.
a basic radix-2 butterfly, the operation is defined as

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY CALICUT. Downloaded on April 15,2025 at 16:39:05 UTC from IEEE Xplore. Restrictions apply.
For 16 bit, the 16 bit binary num
mbers as 4 words
in X0[3:0], X1[3:0], X2[3:0], X3[33:0] are given as
the input and after the transformatioon using Verilog
coding, the output is obtained as a 4 words in
Y0[3:0], Y1[3:0], Y2[3:0], Y3[3:0] consists of 16
bit binary numbers. The transform mation is done
with the help of butterfly diagraams. The result Figure 4: RTL Schematic of stage0 to stage 11of 16
obtained after simulation in the IsimI window is Bit Input DIT Raadix-2 FFT
shown in figure 2.
The RTL Schematic of DFT top of 16 Bit Input
DIT Radix-2 FFT is shown in figure 5. The inputs
are given in the left side annd outputs are obtained
through the right side afteer processing the FFT
values.

Figure 2: Simulation Output of 16 Bit Input DIT


Radix-2 FFT
Figure 5: RTL Schematic off dft top of 16 Bit Input
For 32 bit, the 32 bit binarry numbers in
DIT Radix-22 FFT
x_r[31:0] and x_i[31:0] are given as the input of
real and imaginary parts annd after the
The 16 and 32 bit synthhesis results and design
transformation using Verilog codinng, the output of
summary shows that the utiilization factor is more
real and imaginary parts are obtainned in y_r[31:0]
for 32 bit radix-2 DIT FFT than
t 16 bit radix-2 DIT
and y_i[31:0]. The transformation isi done with the
FFT. Table 1 shows the com mparison of 16 and 32
help of butterfly diagrams. The resuult obtained after
bit logic utilization of xc3s5500E for implementing
simulation in the Isim window is shoown in figure 3.
this FFT algorithm. The 32 bit b FFT has more logic
utilization than 16 bit FFT. Thus as the number of
FFT bits for processing increases the logic
utilization also increases. Thhe design can be done
in this device since the percentage of logic
utilization comes within 1000%. The design cannot
be able to done with the device, if the utilization of
the device is more than 100% %.
The logic utilization detaiils of 32 bit shows that
the number of slices is 8%, slice FF is 6%, 4 input
Figure 3: Simulation Output of 32 Bit Input DIT LUTs is 7%, bonded IOBs is 15%, BRAMs is
Radix-2 FFT 60%,18*18 SIOs is 80% and a Gclks is 4%. The
utilization details of 16 bit shows
s that the number
The 16 bit FFT DIT radix-2 conntains 12 stages of slices is 6%, slice FF is 4%
%, 4 input LUTs is 5%,
from stage 0 to stage 11as show wn in the RTL bonded IOBs is 15%, BRAM Ms is 50%,18*18 SIOs
schematic of figure 4. Each stagee is interlinked. is 60% and Gclks is 4%. Thhe device utilization is
The output of the stage 0 is givenn as input to the related to the complexity of the device. The
stage 1. Its output is connected as input to stage 2 complexity of the system inncreases as the device
and so on. At the last stage, stage 11,
1 the output of utilization increases. The 16 bit has low complexity
the whole process is obtained. The resistor with that of 32 bit. So as the number of bits is
transistor logic diagram showss the internal increasing, the complexitty and the device
connections of the top dft with clock
c and reset utilization also increases.
connected in each stage.

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY CALICUT. Downloaded on April 15,2025 at 16:39:05 UTC from IEEE Xplore. Restrictions apply.
Table 1: Comparison of 16 and 32 bit logic The FFT input complex numbers with real and
utilization imaginary parts are feeded in to the Spartan 3 tyro
Utilizatio plus kit through coding after resetting the kit. The
Logic Used Available hardware FPGA Spartan family Spartan 3 tyro plus
n
utilizatio kit is shown in figure 6 is a 32 bit processor.
16 32 16 32
n 16 bit 32 bit
bit bit bit bit
No. of
285 403 4656 4656 6% 8%
slices
No. of
456 624 9312 9312 4% 6%
slice FFs
No. of 4
input 487 670 9312 9312 5% 7%
LUTs
No. of
15 15
bonded 36 36 232 232
% %
IOBs
No. of 50 60
10 12 20 20
BRAMs % %
No. of Figure 7: Hardware Implementation Result
MULT 60 80
12 16 20 20
18*18 % % The inputs are given as 16 bit complex
SIOs numbers. After the 12 stages of processing, the
No. of result is obtained as shown in figure 7. The output
1 1 24 24 4% 4%
GCLKs is obtained in personal computer display with real
and imaginary parts in it.

IV. HARDWARE IMPLEMENTATION V. CONCLUSION

The Spartan-3 EDK Board provides a powerful, In this work, the design of 16 and 32 bit radix-
self-contained development platform for designs 2 DIT FFT is synthesized using Xilinx. Since FFT
targeting the new Spartan-3 FPGA from Xilinx. It is theoretically fast operation, the processing speed
features a 200K gate Spartan-3, on-board I/O of the algorithm is high. The total computational
devices, and 1MB fast asynchronous SRAM, cost of this is low as the shorter ones are reused to
making it the perfect platform to experiment with compute many outputs. The simulation of 16 and
any new design, from a simple logic circuit to an 32 bit radix-2 FFT is done and outputs are obtained
embedded processor core. in Isim window. The logic utilization details of 32
bit and 16 bit is compared. The logic utilization
details of 32 bit shows that the number of slices is
8%, slice FF is 6%, 4 input LUTs is 7%, bonded
IOBs is 15%, BRAMs is 60%,18*18 SIOs is 80%
and Gclks is 4%. The utilization details of 16 bit
shows that the number of slices is 6%, slice FF is
4%, 4 input LUTs is 5%, bonded IOBs is 15%,
BRAMs is 50%, 18*18 SIOs is 60% and Gclks is
4%. The utilization factor of xc3s500E for
synthesizing DIT radix-2 FFT increases as the
number of bits for processing increases. The
hardware FPGA in the Spartan family, Spartan 3
tyro plus is used to implement the 16 bit FFT
complex values and the result is obtained in the
personal computer.

REFERENCES
[1] Ankesh. J, Muthusubramaniam. V and
Shanthi. P, “Analysis and Design of a High
Speed Continuous-time Δ∑ Modulator Using
Figure 6: TYRO PLUS SPARTAN3 the Assisted Opamp Technique”, IEEE
(EDK) Board Components placement top view

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY CALICUT. Downloaded on April 15,2025 at 16:39:05 UTC from IEEE Xplore. Restrictions apply.
Journal of Solid-State Circuits, vol. 47, no. 7, [13] Tao. Y, Deyuan. G and Xiaoya. F, “A Multipath
pp. 1615-1625, July 2012. Fused Add-Subtract Unit for Digital signal
[2] Chang W and Nguyen T.Q, “On the Fixed- Processing”, Proc. IEEE Midwest Symp.
Point Accuracy Analysis of FFT Algorithms”, Circuits and Systems, pp. 448-452, 2012.
IEEE Transactions on Signal Processing, vol. [14] Yu-Heng. G.L, and Chien-In. H.C, “Dynamic
56, no. 10, pp.4673-4682, October 2008. Kernel Function FFT With Variable Truncation
[3] Ghouwayel. A.A and Louet. Y, “FPGA Scheme for Wideband Coarse Frequency
Implementation of a Re-configurable FFT for Detection”, IEEE Transactions on
Multi-standard Systems in Software Radio Instrumentation and Measurement, vol. 58, no.
Context”, IEEE Transactions on Consumer 5, pp. 1555-1562, May 2009.
Electronics, vol. 55, no. 2, pp. 950-958, May
2009.
[4] Guan. X, Fei. Y, Lin. H, “Hierarchical Design
of an Application-Specific Instruction Set
Processor for High-Throughput and Scalable
FFT Processing”, IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, vol.
20, no. 3, pp. 551-563, March 2012.
[5] Guichang. Z, Fan. X,and Alan. N.W, “A
Power-Scalable Reconfigurable FFT/IFFT IC
Based on a Multi-Processor Ring”, IEEE K.VANMATHI received
journal of solid-state circuits, vol. 41, no. 2, B.E and M.E Degrees in the years 1990 and
pp. 483-495, Feb. 2006. 2004 respectively from PSG College of
[6] Jung Y, Yoon H and Kim J, “New Efficient Technology, Coimbatore, Tamil Nadu, India.
FFT Algorithm and Pipeline Implementation
She has teaching experience of sixteen years.
Results for OFDM/DMT Applications”, IEEE
Transactions on Consumer Electronics, vol. She is presently working as Associate Professor
49, no. 1, pp.14-20, February 2003. in the Department of Electrical and Electronics
[7] Lee Hand In-Cheol P, “Balanced Binary-Tree Engineering at Hindusthan college of
Decomposition for Area-Efficient Pipelined Engineering and Technology. Her areas of
FFT Processing”, IEEE Transactions on interest are Linear machines, FEA of machines
Circuits and Systems—I: Regular Papers, vol. EMI and EMC issues. She has published many
54, no. 4, pp.889-900, April 2007.
papers in National / International Conferences
[8] Ramesh.B, Ammu. K, Pallavi. V and
Meera.V, “Modified FPGA based Design and to her credit.
Implementation of Reconfigurable FFT
Architecture”, IEEE symposium, pp. 818-822,
2013.
[9] Seidel P.M and Even G, “Delay-Optimized
Implementation of IEEE Floating-Point
Addition”, IEEE Trans. Computers, vol. 53,
no. 2, pp. 97-113, February 2004.
[10] Sherrat R.S, Cadenas O and Goswami N, “A
Low Clock Frequency FFT Core
Implementation for Multiband Full-Rate Remya Ramachandran
Ultra-Wideband (UWB) Receivers”, IEEE received B.Tech Degree in Electronics and
Transactions on Consumer Electronics, vol. Communication Engineering in the year 2008
51, no. 3, pp. 798-802, August 2005. from Calicut University, Kerala, India. She is
[11] Stephen. M and Roger. W, “Power Efficient, currently doing M.E. Degree in Applied
FPGA Implementations of Transform Electronics under Anna University, Chennai,
Algorithms for Radar-Based Digital Receiver Tamil Nadu, India. She has published many
Applications”, IEEE Transactions on papers in National / International Conferences
Industrial Informatics, vol. 9, no. 3, pp. 1591- to her credit. Her areas of interest are Signal
1600, Aug. 2013. Processing, Image Processing, Communication
[12] Swartzlander. E.E, and Hani. H.M. Saleh, and VLSI.
“FFT Implementation with Fused Floating-
Point Operations”, IEEE Transactions on
Computers, vol. 61, no. 2, pp.284-288,
February 2012.

Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY CALICUT. Downloaded on April 15,2025 at 16:39:05 UTC from IEEE Xplore. Restrictions apply.

You might also like