[go: up one dir, main page]

0% found this document useful (0 votes)
18 views24 pages

DSP Chap9

The document discusses the historical development and computation of the Fast Fourier Transform (FFT), tracing its origins back to Gauss and highlighting key contributions from researchers like Cooley and Tukey. It explains the inefficiencies of direct computation of the Discrete Fourier Transform (DFT) and introduces algorithms like the Goertzel algorithm and decimation in time FFT for efficient computation. The document also includes technical details on the mathematical formulations and computational complexities involved in these algorithms.

Uploaded by

曾偉博
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
18 views24 pages

DSP Chap9

The document discusses the historical development and computation of the Fast Fourier Transform (FFT), tracing its origins back to Gauss and highlighting key contributions from researchers like Cooley and Tukey. It explains the inefficiencies of direct computation of the Discrete Fourier Transform (DFT) and introduces algorithms like the Goertzel algorithm and decimation in time FFT for efficient computation. The document also includes technical details on the mathematical formulations and computational complexities involved in these algorithms.

Uploaded by

曾偉博
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 24

Digital Signal Processing

Chap. 9: Computation of DFT


林嘉文 (Chia-Wen Lin)
清華大學電機系
cwlin@ee.nthu.edu.tw

2 Historical Review of FFT (1/3)


◼ Heideman et al. (1984) traced the origin of FFT back to Gauss
(1805).
◼ Runge (1905) and later Danielson and Lanczos (1942) described
roughly 𝑁 log 2 𝑁 instead of 𝑁2 computation algorithms.
◼ 1965 Cooley & Tukey published an Fast DFT algorithm
applicable when N is a composite number (product of 2 or
more numbers).
◼ Cooley & Tukey sparked a flurry of activity in the application of
DFT to signal processing and resulted in the discovery of a
number of highly efficient computational algorithms collectively
known as FFT.
2024/6/4

1
3 Historical Review of FFT (2/3)
◼ The divide-and-conquer FFT algorithms described here are
often referred to as the Cooley-Tukey FFT algorithms
◼ Special Case: Radix-r FFT and prime factor decomposition

James W. Cooley John W. Tukey


(1926 ~2016 ) (June1915 – July 2000)

J. W. Cooley & J. W. Tukey (1965): "An algorithm for the machine calculation
of complex Fourier series", Math. Comput. 19, 297–301.
2024/6/4

4 Historical Review of FFT (3/3)


◼ Direct computation of the DFT is basically inefficient, primarily
because it does not exploit the symmetry and periodicity of the
phase factor 𝑊𝑁
◼ FFT algorithms are based on the fundamental principle of
decomposing the computation of the DFT of a sequence of
length 𝑁 into successively smaller DFTs.
◼ Decimation in time FFT: decompose 𝑥[𝑛] into successively
smaller subsequences.
◼ Decimation in frequency FFT: decompose DFT coefficients 𝑋[𝑘]
into smaller subsequences.
◼ Goertzel’s algorithm (1958): not restricted to the computation of
DFT but any desired set of samples of the DTFT of a sequence.
2024/6/4

2
5 Direct Computation of DFT
For a complex-valued sequence of 𝑁 C
C DFT SUBROUTINE
points the DFT may be expressed as C ISEL = 0 : DFT
C ISEL = 1 : INVERSE DFT
𝑁−1 C
2𝜋𝑘𝑛 2𝜋𝑘𝑛 SUBROUTINE DFT(N, XR, XI, XFR, XFI, ISEL)
𝑋𝑅 (𝑘) = ෍ 𝑥𝑅 (𝑛) cos + 𝑥𝑖 (𝑛) sin DIMENSION XR(N), XI(N), XFR(N), XFI(N)
𝑁 𝑁
𝑛=0 WN = 6.2831853 / FLOAT(N)
𝑁−1 IF (ISEL.EQ.1) WN = - WN
2𝜋𝑘𝑛 2𝜋𝑘𝑛 DO 20 K = 1, N
𝑋𝐼 (𝑘) = − ෍ 𝑥𝑅 (𝑛) sin − 𝑥𝑖 (𝑛) cos XFR(K) = 0,
𝑁 𝑁 XFI(K) = 0,
𝑛=0
KM1 = K – 1
DO 20 I = 1, N
• The direct computation requires: IM1 = I – 1
ARG = WN * KM1 * IM1
• 2𝑁2 evaluations of trigonometric C = COS(ARG)
S = SIN(ARG)
functions. XFR(K) = XFR(K) + XR(I)*C + XI(I)*S
XFI(K) = XFI(K) – XR(I)*S + XI(I)*C
• 4𝑁2 real multiplications. 10 CONTINUE
IF (ISEL – 1) 20, 30, 20
• 4𝑁(𝑁 – 1) real additions. 30 XFR(K) = XFR(K) / FLOAT(N)
20 XFI(K) = XFI(K) / FLOAT(N)
• A number of indexing and CONTINUE
addressing operations. RETURN
END 2024/6/4

6 The Goertzel Algorithm (1/3)

Note 𝑊𝑁−𝑘𝑁 = 𝑒 𝑗2𝜋𝑘𝑁/𝑁 = 𝑒 𝑗2𝜋𝑘 = 1


𝑁−1 𝑁−1 𝑁−1
−𝑘(𝑁−𝑟)
⇒ 𝑋[𝑘] = ෍ 𝑥[𝑟]𝑊𝑁𝑘𝑟 = 𝑊𝑁−𝑘𝑁 ෍ 𝑥[𝑟]𝑊𝑁𝑘𝑟 = ෍ 𝑥[𝑟]𝑊𝑁
𝑟=0 𝑟=0 𝑟=0

−𝑘(𝑛−𝑟)
Define 𝑦𝑘 [𝑛] = ෍ 𝑥[𝑟]𝑊𝑁 𝑢[𝑛 − 𝑟] = 𝑥[𝑛] ∗ (𝑊𝑁−𝑘𝑛 𝑢[𝑛])
𝑟=−∞
𝑛
−𝑘(𝑛−𝑟)
= ෍ 𝑥[𝑟]𝑊𝑁 𝑢[𝑛 − 𝑟]
𝑟=−∞
𝑛
−𝑘(𝑛−𝑟)
= ෍ 𝑥[𝑟]𝑊𝑁 𝑢[𝑛 − 𝑟] (∵ 𝑥[𝑟] finite length)
𝑟=0
𝑁 𝑁 𝑁
−𝑘(𝑁−𝑟) −𝑘(𝑁−𝑟)
⇒ 𝑋[𝑘] = 𝑦𝑘 [𝑛]ȁ𝑛 = 𝑁 = ෍ 𝑥[𝑟]𝑊𝑁 𝑢[𝑁 − 𝑟] = ෍ 𝑥[𝑟]𝑊𝑁 = ෍ 𝑥[𝑟]𝑊𝑁𝑘𝑟
𝑟=0 𝑟=0 𝑟=0

2024/6/4

3
7 The Goertzel Algorithm (2/3)

−𝑘(𝑛−𝑟)
𝑦𝑘 [𝑛] = ෍ 𝑥[𝑟]𝑊𝑁 𝑢[𝑛 − 𝑟] = 𝑥[𝑛] ∗ (𝑊𝑁−𝑘𝑛 𝑢[𝑛])
𝑟=−∞
1 1 − 𝑊𝑁𝑘 𝑧 −1 1 − 𝑊𝑁𝑘 𝑧 −1
⇒ 𝐻𝑘 (𝑧) = = =
1− 𝑊𝑁−𝑘 𝑧 −1 1− 𝑊𝑁−𝑘 𝑧 −1 1− 𝑊𝑁𝑘 𝑧 −1 1 − 2 Re{ 𝑊𝑁−𝑘 }𝑧 −1 + 𝑧 −2
1 − 𝑊𝑁𝑘 𝑧 −1
=
1− 2 cos( 2𝜋𝑘/𝑁)𝑧 −1 + 𝑧 −2
𝑁 𝑁 𝑁
−𝑘(𝑁−𝑟) −𝑘(𝑁−𝑟)
𝑋[𝑘] = 𝑦𝑘 [𝑛]ȁ𝑛 = 𝑁 = ෍ 𝑥[𝑟]𝑊𝑁 𝑢[𝑁 − 𝑟] = ෍ 𝑥[𝑟]𝑊𝑁 = ෍ 𝑥[𝑟]𝑊𝑁𝑘𝑟
𝑟=0 𝑟=0 𝑟=0

• 4 real-mul, 4 real add/𝑦𝑘 [𝑛] →4𝑁, 4𝑁 − 2


• All 𝑦𝑘 [𝑛], 𝑛 = 0,1, . . , 𝑁 needed
• 2 real-mul/𝑦𝑘 [𝑛] →2𝑁 2024/6/4

• only 𝑦𝑘 [𝑁] needed

8 The Goertzel Algorithm (3/3)


◼ If the input 𝑥 𝑛 is complex, only two real multiplications per sample are
required to implement the pole of the system.
◼ Since we only need to bring the system to a state from which
𝑦𝑘 𝑁 can be computed, the complex multiplication by −𝑊𝑁𝑘
required to implement the zero of the system function need not
be performed at every iteration of the difference equation, but
only after the 𝑵-th iteration.
◼ Thus the total computation is 𝟐(𝑵 + 𝟐) real multiplications
(compared to 4N real multiplications) and 𝟒(𝑵 + 𝟏) real additions
(compared to 4N−2 real additions) . Furthermore, only cos 2𝜋𝑘/𝑁
and −𝑊𝑁𝑘 are the coefficients.
◼ Useful when only compute 𝑀 (𝑀 < log2𝑁) DFT values.
2024/6/4

4
9 Decimation in Time FFT
𝑁−1 𝑁−1

For 𝑁−point DFT:𝑋[𝑘] = ෍ 𝑥[𝑛]𝑒 −𝑗2𝜋𝑘𝑛/𝑁 = ෍ 𝑥[𝑛]𝑊𝑁𝑘𝑛 , 𝑘 = 0, 1, . . , 𝑁 − 1


𝑛=0 𝑛=0
Complex multiplications: 𝑁 × 𝑁 = 𝑁 2 , Complex additions: 𝑁 × (𝑁 − 1)

If 𝑁 = 2𝜈 , 𝑋[𝑘] = ෍ 𝑥[𝑛] 𝑊𝑁𝑘𝑛 + ෍ 𝑥[𝑛] 𝑊𝑁𝑘𝑛 (𝑛 = 2𝑚, 𝑛 = 2𝑚 + 1)


𝑛 even 𝑛 odd
𝑁/2−1 𝑁/2−1
𝑘(2𝑚) 𝑘(2𝑚+1)
= ෍ 𝑥[2𝑚] 𝑊𝑁 + ෍ 𝑥[2𝑚 + 1] 𝑊𝑁
𝑚=0 𝑚=0
𝑁/2−1 𝑁/2−1
𝑘𝑚 𝑘𝑚
= ෍ 𝑥𝑒 [𝑚] 𝑊𝑁/2 + 𝑊𝑁𝑘 ෍ 𝑥𝑜 [𝑚] 𝑊𝑁/2 = 𝐺[𝑘] + 𝑊𝑁𝑘 𝐻[𝑘], 𝑘 = 0, 1, . . , 𝑁 − 1
𝑚=0 𝑚=0
Therefore, an 𝑁−point DFT can be decomposed into Two 𝑁/2−point DFTs.
𝑁/2−1 𝑁 𝑁/2−1
𝑁 (𝑘+ )𝑚 𝑘𝑚 𝑚𝑁/2
Note 𝐺[𝑘 + ] = ෍ 𝑥𝑒 [𝑚] 𝑊𝑁/2 2 = ෍ 𝑥𝑒 [𝑚] 𝑊𝑁/2 𝑊𝑁/2 = 𝐺[𝑘]
2
𝑚=0 𝑚=0
𝑁 𝑁 𝑁 𝑁2 𝑁 𝑁 𝑁
Complex multiplications: 2 × × + = + , Complex additions: × [2( − 1) + 1]
2024/6/4
2 2 2 2 2 2 2

10 Decimation in Time FFT (N = 8)

Flow graph of the decimation-in-time decomposition of an 𝑁-point 2024/6/4

DFT computation into two (𝑁/2)-point DFT computations (𝑁 = 8).

5
11 DIT FFT
𝑁/2−1 𝑁/2−1
𝑘𝑚 𝑘𝑚
𝑋[𝑘] = ෍ 𝑥𝑒 [𝑚] 𝑊𝑁/2 + 𝑊𝑁𝑘 ෍ 𝑥𝑜 [𝑚] 𝑊𝑁/2 = 𝐺[𝑘] + 𝑊𝑁𝑘 𝐻[𝑘], 𝑘 = 0,1, . . , 𝑁 − 1
𝑚=0 𝑚=0
𝐺[𝑘 + 𝑁/2] = 𝐺[𝑘], 𝐻[𝑘 + 𝑁/2] = 𝐻[𝑘] (𝑘 = 0,1, . . , 𝑁/2 − 1)
𝑁/2−1 𝑁/4−1 𝑁/4−1
𝑘𝑚 𝑘2𝑙 𝑘(2𝑙+1)
𝐺[𝑘] = ෍ 𝑥𝑒 [𝑚] 𝑊𝑁/2 = ෍ 𝑥𝑒 [2𝑙] 𝑊𝑁/2 + ෍ 𝑥𝑒 [2𝑙 + 1] 𝑊𝑁/2
𝑚=0 𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
𝑘𝑙 𝑘 𝑘𝑙
= ෍ 𝑥𝑒 [2𝑙] 𝑊𝑁/4 + 𝑊𝑁/2 ෍ 𝑥𝑒 [2𝑙 + 1] 𝑊𝑁/4
𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
𝑘𝑙 𝑘𝑙
= ෍ 𝑥[4𝑙] 𝑊𝑁/4 + 𝑊𝑁2𝑘 ෍ 𝑥[4𝑙 + 2] 𝑊𝑁/4 = 𝐺1 [𝑘] + 𝑊𝑁2𝑘 𝐺2 [𝑘]
𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1 𝑁/4−1
(𝑘+𝑁/4)𝑙 𝑘𝑙 𝑁𝑙/4 𝑘𝑙
𝐺1 [𝑘 + 𝑁/4] = ෍ 𝑥[4𝑙] 𝑊𝑁/4 = ෍ 𝑥[4𝑙] 𝑊𝑁/4 𝑊𝑁/4 = ෍ 𝑥[4𝑙] 𝑊𝑁/4 = 𝐺1 [𝑘]
𝑙=0 𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
(𝑘+𝑁/4)𝑙 𝑘𝑙 𝑁𝑙/4
𝐺2 [𝑘 + 𝑁/4] = ෍ 𝑥[4𝑙 + 2] 𝑊𝑁/4 = ෍ 𝑥[4𝑙 + 2] 𝑊𝑁/4 𝑊𝑁/4 = 𝐺2 [𝑘]
𝑙=0 𝑙=0
2024/6/4
∴ for 𝐺1 [𝑘], 𝐺2 [𝑘], 𝑘 = 0,1, . . . , 𝑁/4 − 1

12 DIT FFT
𝑁/2−1 𝑁/2−1
𝑘𝑚 𝑘𝑚
𝑋[𝑘] = ෍ 𝑥𝑒 [𝑚] 𝑊𝑁/2 + 𝑊𝑁𝑘 ෍ 𝑥𝑜 [𝑚] 𝑊𝑁/2 = 𝐺[𝑘] + 𝑊𝑁𝑘 𝐻[𝑘], 𝑘 = 0,1, . . , 𝑁 − 1
𝑚=0 𝑚=0
𝐺[𝑘 + 𝑁/2] = 𝐺[𝑘], 𝐻[𝑘 + 𝑁/2] = 𝐻[𝑘] (𝑘 = 0,1, . . , 𝑁/2 − 1)
𝑁/2−1 𝑁/4−1 𝑁/4−1
𝑘𝑚 𝑘2𝑙 𝑘(2𝑙+1)
𝐻[𝑘] = ෍ 𝑥𝑜 [𝑚] 𝑊𝑁/2 = ෍ 𝑥𝑜 [2𝑙] 𝑊𝑁/2 + ෍ 𝑥𝑜 [2𝑙 + 1] 𝑊𝑁/2
𝑚=0 𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
𝑘𝑙 𝑘 𝑘𝑙
= ෍ 𝑥𝑜 [2𝑙] 𝑊𝑁/4 + 𝑊𝑁/2 ෍ 𝑥𝑜 [2𝑙 + 1] 𝑊𝑁/4
𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
𝑘𝑙 𝑘𝑙
= ෍ 𝑥[4𝑙 + 1] 𝑊𝑁/4 + 𝑊𝑁2𝑘 ෍ 𝑥[4𝑙 + 3] 𝑊𝑁/4 = 𝐻1 [𝑘] + 𝑊𝑁2𝑘 𝐻2[𝑘]
𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
(𝑘+𝑁/4)𝑙 𝑘𝑙 𝑁𝑙/4
𝐻1 [𝑘 + 𝑁/4] = ෍ 𝑥[4𝑙 + 1] 𝑊𝑁/4 = ෍ 𝑥[4𝑙 + 1] 𝑊𝑁/4 𝑊𝑁/4 = 𝐻1 [𝑘]
𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
(𝑘+𝑁/4)𝑙 𝑘𝑙 𝑁𝑙/4
𝐻2 [𝑘 + 𝑁/4] = ෍ 𝑥[4𝑙 + 3] 𝑊𝑁/4 = ෍ 𝑥[4𝑙 + 3] 𝑊𝑁/4 𝑊𝑁/4 = 𝐻2 [𝑘]
𝑙=0 𝑙=0
∴ for 𝐻1 [𝑘], 𝐻2 [𝑘], 𝑘 = 0,1, . . . , 𝑁/4 − 1 2024/6/4

6
13 Decimation in Time FFT (𝑁 = 8) (1/3)

2024/6/4

14 Decimation in Time FFT (𝑁 = 8) (2/3)


G1[0] G[0]

G1[1] G[1]

G2[0] G[2]

G2[1]
G[3]
H1[0] H[0]

H1[1] H[1]

H2[0] H[2]

H2[1] H[3] 2024/6/4

7
15 Decimation in Time FFT (𝑁 = 8) (3/3)
𝑁/4−point DFT: 𝑘 = 0, 𝑁/4 − 1
𝑁/4−1 𝑁/4−1
𝑘𝑙 𝑘𝑙 𝑘
𝐺1 [𝑘] = ෍ 𝑥𝑒 [2𝑙] 𝑊𝑁/4 = ෍ 𝑥[4𝑙] 𝑊𝑁/4 = 𝑥[0] + 𝑊𝑁/4 𝑥[4] = 𝑥[0] + 𝑊𝑁4𝑘 𝑥[4]
𝑙=0 𝑙=0
𝑁/4−1 𝑁/4−1
𝑘𝑙 𝑘𝑙 𝑘
𝐻1 [𝑘] = ෍ 𝑥𝑜 [2𝑙] 𝑊𝑁/4 = ෍ 𝑥[4𝑙 + 1] 𝑊𝑁/4 = 𝑥[1] + 𝑊𝑁/4 𝑥[5] = 𝑥[1] + 𝑊𝑁4𝑘 𝑥[5]
𝑙=0 𝑙=0

G1[0]

G1[1]
2024/6/4

16 Complete 8-point DIT FFT

2024/6/4

8
17 Basic & Simplified Butterfly in DIT FFT
Flow graph of simplified
Flow graph of basic butterfly butterfly computation
computation in complete 8-pt FFT requiring only one
complex multiplication

Note: 1 complex MUL


2 complex MUL 𝑁/2
𝑊𝑁 = 𝑒 𝑗𝜋 = −1 1 complex Add
2 complex Add
𝑊𝑁
𝑟+𝑁/2
= −𝑊𝑁𝑟 1 complex Sub
2024/6/4

𝑊𝑁0 = 1

18 Complete 8-point DIT FFT

Complex MUL: 4 × log28 = 12 2024/6/4

Complex Add: 4 × log28 = 12, Complex Sub: 4 × log28 = 12

9
19 In-Place Computations
For DIT FFT, if the first stage is bit-reverse ordered first as
stage 0, successive stages can use normal order & the
same storage to store their intermediate values.
𝑚 = 1,2, . . log2𝑁
𝑟 = 2𝑁−𝑚(0, . . , 2𝑚−1−1)
p
q

2024/6/4

20 Tree Diagram for Normal-Order Sorting

2024/6/4

10
21 Tree Diagram for Bit-Reversed Sorting

2024/6/4

22 Alternative Forms
One can rearrange nodes of original DIT FFT:

2024/6/4

𝑋0[ ] 𝑋1[ ] 𝑋2[ ] 𝑋3[ ]

11
23
Input in Normal Order and Output in Bit-Reversed
Order (Cooley & Tukey 1965)

2024/6/4

In place computation but need RAM storage!

24 Both Input and Output in Normal Order

2024/6/4

No in place computation & need RAM storage!

12
25 Same Geometry for Each Stage (DIT)

No in place computation but sequential data storage and access!

26 Radix-2 FFT Algorithm


Complex Complex
multiplications in multiplications in
For 𝑁 = 2𝜐 , this decimation can N Direct Computation FFT algorithm,
be performed 𝜐 = log2𝑁 times. N2 (N/2)log2N
4 16 4

The total number of 8 64 12

16 256 32
Complex multiplications : (𝑁/2)  log2𝑁
32 1,024 80
Complex additions : 𝑁  log2𝑁
64 4,096 192

128 16,384 448

256 65,535 1,024

512 262,144 2,304

1,024 1,048,576 5,120


2024/6/4

13
27 Decimation in Frequency (DIF) FFT (1/5)
𝑁−1 𝑁−1

𝑋[𝑘] = ෍ 𝑥[𝑛]𝑒 −𝑗2𝜋𝑘𝑛/𝑁 = ෍ 𝑥[𝑛]𝑊𝑁𝑘𝑛 , 𝑘 = 0, 1, . . , 𝑁 − 1 , 𝑘 = 2𝑟 or 2𝑟 + 1


𝑛=0 𝑛=0
𝑁−1 𝑁/2−1 𝑁−1
𝑛(2𝑟) 𝑛(2𝑟) 𝑛(2𝑟)
𝑋[2𝑟] = ෍ 𝑥[𝑛] 𝑊𝑁 = ෍ 𝑥[𝑛] 𝑊𝑁 + ෍ 𝑥[𝑛] 𝑊𝑁 , 𝑟 = 0, 1, . . , 𝑁/2 − 1
𝑛=0 𝑛=0 𝑛=𝑁/2
𝑁/2−1 𝑁/2−1 𝑁/2−1
2𝑟(𝑛+𝑁/2)
= ෍ 𝑥[𝑛] 𝑊𝑁2𝑟𝑛 + ෍ 𝑥[𝑛 + 𝑁/2] 𝑊𝑁 = ෍ (𝑥[𝑛] + 𝑥[𝑛 + 𝑁/2]) 𝑊𝑁2𝑟𝑛
𝑛=0 𝑛=0 𝑛=0
𝑁/2−1
𝑟𝑛
= ෍ 𝑔[𝑛] 𝑊𝑁/2 𝑔[𝑛] = (𝑥[𝑛] + 𝑥[𝑛 + 𝑁/2]), 𝑛 = 0,1, . . , 𝑁/2 − 1
𝑛=0
𝑁−1 𝑁/2−1 𝑁−1
𝑛(2𝑟+1) 𝑛(2𝑟+1) 𝑛(2𝑟+1)
𝑋[2𝑟 + 1] = ෍ 𝑥[𝑛] 𝑊𝑁 = ෍ 𝑥[𝑛] 𝑊𝑁 + ෍ 𝑥[𝑛] 𝑊𝑁 , 𝑟 = 0, 1, . . , 𝑁/2 − 1
𝑛=0 𝑛=0 𝑛=𝑁/2
𝑁/2−1 𝑁/2−1
𝑛(2𝑟+1) (𝑛+𝑁/2)(2𝑟+1) (𝑁/2)(2𝑟+1) (𝑁/2)
= ෍ 𝑥[𝑛] 𝑊𝑁 + ෍ 𝑥[𝑛 + 𝑁/2] 𝑊𝑁 (𝑊𝑁 = 𝑊𝑁 𝑊𝑁𝑁𝑟 = −1)
𝑛=0 𝑛=0
𝑁/2−1 𝑁/2−1
𝑛(2𝑟+1) 𝑟𝑛
= ෍ (𝑥[𝑛] − 𝑥[𝑛 + 𝑁/2]) 𝑊𝑁 = ෍ 𝑊𝑁𝑛 (𝑥[𝑛] − 𝑥[𝑛 + 𝑁/2]) 𝑊𝑁/2
𝑛=0 𝑛=0
𝑁/2−1
2024/6/4
𝑟𝑛
= ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2 ℎ[𝑛] = (𝑥[𝑛] − 𝑥[𝑛 + 𝑁/2]), 𝑛 = 0,1, . . , 𝑁/2 − 1
𝑛=0

28 Decimation in Frequency (DIF) FFT (2/5)

𝑋[2𝑟]

𝑋[2𝑟 + 1]

𝑔[𝑛] = 𝑥[𝑛] + 𝑥[𝑛 + 𝑁/2] 2024/6/4

ℎ[𝑛] = 𝑥[𝑛] – 𝑥[𝑛 + 𝑁/2]

14
29 Decimation in Frequency (DIF) FFT (3/5)
𝑁/2−1
𝑟𝑛
𝑟 = 2𝑠 or 2𝑠 + 1: 𝑋[2𝑟] = ෍ 𝑔[𝑛]𝑊𝑁/2 , 𝑟 = 0,1, . . , 𝑁/2 − 1, 𝑠 = 0,1, . . , 𝑁/4 − 1
𝑛=0
𝑁/2−1 𝑁/4−1 𝑁/4−1
(2𝑠)𝑛 (2𝑠)𝑛 (2𝑠)(𝑛+𝑁/4)
𝑋[2(2𝑠)] = ෍ 𝑔[𝑛]𝑊𝑁/2 = ෍ 𝑔[𝑛]𝑊𝑁/2 + ෍ 𝑔[𝑛 + 𝑁/4]𝑊𝑁/2
𝑛=0 𝑛=0 𝑛=0
𝑁/4−1 𝑁/4−1
2𝑠𝑛 𝑟𝑛
= ෍ 𝑔[𝑛] + 𝑔[𝑛 + 𝑁/4] 𝑊𝑁/2 = ෍ 𝑔1 [𝑛] 𝑊𝑁/4 𝑔1 [𝑛] = 𝑔[𝑛] + 𝑔[𝑛 + 𝑁/4]
𝑛=0 𝑛=0
𝑁/2−1 𝑁/4−1 𝑁/4−1
(2𝑠+1)𝑛 (2𝑠+1)𝑛 (2𝑠+1)(𝑛+𝑁/4)
𝑋[2(2𝑠 + 1)] = ෍ 𝑔[𝑛]𝑊𝑁/2 = ෍ 𝑔[𝑛]𝑊𝑁/2 + ෍ 𝑔[𝑛 + 𝑁/4]𝑊𝑁/2
𝑛=0 𝑛=0 𝑛=0
𝑁/4−1
𝑛 (2𝑠+1)𝑁/4 2𝑠𝑛
= ෍ 𝑊𝑁/2 𝑔[𝑛] + 𝑊𝑁/2 𝑔[𝑛 + 𝑁/4] 𝑊𝑁/2
𝑛=0
𝑁/4−1
𝑛 2𝑠𝑛
= ෍ 𝑊𝑁/2 𝑔[𝑛] − 𝑔[𝑛 + 𝑁/4] 𝑊𝑁/2
𝑛=0
𝑁/4−1
𝑟𝑛
= ෍ 𝑊𝑁2𝑛 𝑔2 [𝑛]𝑊𝑁/4 𝑔2 [𝑛] = 𝑔[𝑛] − 𝑔[𝑛 + 𝑁/4] 2024/6/4
𝑛=0

30 Decimation in Frequency (DIF) FFT (4/5)


𝑁/2−1
𝑟𝑛
𝑟 = 2𝑠 or 2𝑠 + 1: 𝑋[2𝑟 + 1] = ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2
𝑛=0
𝑁/2−1 𝑁/4−1 𝑁/4−1
(2𝑠)𝑛 (2𝑠)𝑛 𝑛+𝑁/4 (2𝑠)(𝑛+𝑁/4)
𝑋[2(2𝑠) + 1] = ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2 = ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2 + ෍ 𝑊𝑁 ℎ[𝑛 + 𝑁/4] 𝑊𝑁/2
𝑛=0 𝑛=0 𝑛=0
𝑁/4−1
𝑛+𝑁/4 𝑠𝑛
= ෍ (𝑊𝑁𝑛 ℎ[𝑛] + 𝑊𝑁 ℎ[𝑛 + 𝑁/4]) 𝑊𝑁/4
𝑛=0
𝑁/4−1
𝑠𝑛 𝑛+𝑁/4
= ෍ ℎ1 [𝑛] 𝑊𝑁/4 ℎ1 [𝑛] = 𝑊𝑁𝑛 ℎ[𝑛] + 𝑊𝑁 ℎ[𝑛 + 𝑁/4]
𝑛=0
𝑁/2−1 𝑁/4−1 𝑁/4−1
(2𝑠+1)𝑛 𝑛(2𝑠+1) (𝑛+𝑁/4) (𝑛+𝑁/4)(2𝑠+1)
𝑋[2(2𝑠 + 1) + 1] = ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2 = ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2 + ෍ 𝑊𝑁 ℎ[𝑛 + 𝑁/4] 𝑊𝑁/2
𝑛=0 𝑛=0 𝑛=0
𝑁/4−1 𝑁/4−1
𝑛(2𝑠+1) (𝑛+𝑁/4) 𝑛(2𝑠+1)+𝑁/4(2𝑠+1) (𝑁/4)(2𝑠+1) (𝑁/2)
= ෍ 𝑊𝑁𝑛 ℎ[𝑛] 𝑊𝑁/2 + ෍ 𝑊𝑁 ℎ[𝑛 + 𝑁/2] 𝑊𝑁/2 (𝑊𝑁/2 = 𝑊𝑁 𝑊𝑁𝑁𝑠 = −1)
𝑛=0 𝑛=0
𝑁/4−1 𝑁/2−1
(𝑛+𝑁/4) 𝑛(2𝑟+1) (𝑛+𝑁/4) 𝑠𝑛
= ෍ (𝑊𝑁𝑛 ℎ[𝑛] − 𝑊𝑁 ℎ[𝑛 + 𝑁/2]) 𝑊𝑁/2 = ෍ 𝑊𝑁2𝑛 (𝑊𝑁𝑛 ℎ[𝑛] − 𝑊𝑁 ℎ[𝑛 + 𝑁/2]) 𝑊𝑁/4
𝑛=0 𝑛=0
𝑁/4−1
𝑟𝑛 (𝑛+𝑁/4) 2024/6/4
= ෍ 𝑊𝑁2𝑛 ℎ2 [𝑛] 𝑊𝑁/2 ℎ2 [𝑛] = 𝑊𝑁𝑛 ℎ[𝑛] − 𝑊𝑁 ℎ[𝑛 + 𝑁/2]
𝑛=0

15
31 Decimation in Frequency (DIF) FFT (5/5)

𝑋[2(2𝑠)]

𝑋[2(2𝑠 + 1)]

𝑋[2(2𝑠) + 1]

𝑋[2(2𝑠 + 1) + 1]

𝑟 = 0, 1, 2, 3 𝑠 = 0, 1
2024/6/4

32 Butterfly in DIF FFT (N = 8) (1/2)

Flow graph of 2-pt (𝑁/4-pt ) DFT required in the


last stage of 𝑁-pt (8-pt) DIF FFT: a Butterfly

2024/6/4

16
33 Butterfly in DIF FFT (N = 8) (2/2)

𝑋[2(2𝑠)]

𝑋[2(2𝑠 + 1)]

𝑋[2 2𝑠 + 1]

𝑋[2(2𝑠 + 1) + 1]
2024/6/4

34 Typical Butterfly in DIF FFT

𝑋𝑚 𝑝 = 𝑋𝑚−1 𝑝 + 𝑋𝑚−1 𝑞
𝑋𝑚 𝑞 = 𝑋𝑚−1 𝑝 − 𝑋𝑚−1 𝑞 𝑊𝑁𝑟

2024/6/4

17
35
Input in Bit-Reversed Order and Output in Normal
Order in DIF FFT

2024/6/4

In place computation but need RAM storage!

36 Same Geometry for Each Stage (DIF)

2024/6/4

No in place computation but sequential data storage and access!

18
37 Indexing (1/2)
◼ In DIT FFT, the input sequence is first sorted into bit-reverse order.
◼ Bit-reverse sorting can be done in place, since samples are only
pairwise interchanged, i.e., a sample at a given index is
interchanged with the sample in the location specified by the
bit-reversed index.
◼ This is conveniently done in place by using two counters, one in
normal order and the other in bit-reverse order. The data in the
two positions specified by the two counters are simply
interchanged.
◼ Once the input is in bit-reverse order and stored in 𝑋0[], we can
proceed with the first stage.
2024/6/4

38 Indexing (2/2)
◼ In the first stage, the input to each butterflies are adjacent
elements in 𝑋0[∙].
◼ In the second stage, the inputs to the butterflies are
separated by 2.
◼ In the 𝑚-th stage, the inputs to the butterflies are separated
by 2𝑚−1 , and in this case the coefficients are powers of
𝑁/2𝑚
𝑊𝑁 and are required in normal order if the computation
of butterflies begins at the top of the flow graph.
◼ The above statements define the manner in which data must
be accessed at a given stage, which depends on the flow
graph that it is implemented.
2024/6/4

19
39 Implementation of DFT Using Convolution
◼ Winograd Fourier Transform Algorithm (WFTA)
❑ Achieves its efficiency by expressing the DFT in terms of polynomial
multiplication, or equivalently, convolution
❑ WFTA uses an indexing scheme corresponding to the

decomposition of the DFT into a multiplicity of short-length DFTs


where the lengths are relatively prime
❑ With the WFTA, the number of multiplications required for an 𝑁-

point DFT is proportional to 𝑁, not 𝑁 log 𝑁. However, the number of


additions is significantly increased in comparison with the FFT
◼ The Chirp Transform
❑ CTA uses convolution with a fixed, prespecified impulse response
to compute any samples of DTFT
2024/6/4

40 Frequency Samples for Chirp Transform Algorithm


𝑥[𝑛]: N−point sequence and
𝑋(𝑒 𝑗𝜔 ) = DTFT{ 𝑥[𝑛]}
𝜔𝑘 = 𝜔0 + 𝑘Δ𝜔, 𝑘 = 0,1,2, . . . , 𝑀 − 1
𝑁−1 𝑁−1

𝑋(𝑒 𝑗𝜔𝑘 ) = ෍ 𝑥[𝑛]𝑒 −𝑗𝜔𝑘 𝑛 = ෍ 𝑥[𝑛]𝑒 −𝑗(𝜔0 +𝑘Δ𝜔)𝑛


𝑛=0 𝑛=0
𝑁−1 𝑁−1

= ෍ 𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛 𝑒 −𝑗(Δ𝜔)𝑘𝑛 = ෍ 𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛 𝑊 𝑛𝑘


𝑛=0 𝑛=0
𝑁−1
1 2 2 2
= ෍ 𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛
𝑊 2(𝑛 +𝑘 −(𝑘−𝑛) )
𝑛=0
𝑁−1
1 2 1 2 1 2
= ෍ {𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛 𝑊 2𝑛 }𝑊 2𝑘 𝑊 −2(𝑘−𝑛)
𝑛=0
𝑁−1
1 2 1 2 1 2
𝑁−1 = 𝑊 2𝑘 ෍ {𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛 𝑊 2𝑛 }𝑊 −2(𝑘−𝑛)
2 /2 2 𝑛=0
𝑋(𝑒 𝑗𝜔𝑛 ) = 𝑊 𝑛 ෍ 𝑔[𝑘]𝑊 −(𝑛−𝑘) 1 2
𝑁−1
1 2
𝑘=0 = 𝑊 2𝑘 ෍ 𝑔[𝑛]𝑊 −2(𝑘−𝑛)
2024/6/4
𝑛 = 0,1,2, . . . , 𝑀 − 1 𝑛=0

20
41 Chirp Transform Algorithm
2 /2 2 /2
𝑔[𝑛] = 𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛 𝑊 𝑛 = 𝑥[𝑛]𝑒 −𝑗𝜔0 𝑛 (𝑒 −𝑗Δ𝜔 )𝑛 , 𝑛 = 0,1,2, . . , 𝑁 − 1
𝑁−1 𝑁−1
2 2 /2
𝑋(𝑒 𝑗𝜔𝑛 ) =෍ 𝑥[𝑘]𝑒 −𝑗𝜔𝑛 𝑘 = 𝑊 𝑛 /2 ෍ 𝑔[𝑘]𝑊 −(𝑛−𝑘)
𝑘=0 𝑘=0

2 /2 2 /2
= 𝑊𝑛 ⋅ 𝑔[𝑛] ∗ 𝑊 −𝑛 , 𝑛 = 0,1,2, . . . , 𝑀 − 1

2 /2 2 /2 2 Δ𝜔/2
𝑊 −𝑛 = (𝑒 −𝑗Δ𝜔 )−𝑛 = 𝑒 𝑗𝑛 = 𝑒 𝑗(𝑛Δ𝜔/2)𝑛 = chirp signal, infinite duration

2024/6/4

42 CTA Example (1/3)

−(N−1) (M−1)

2024/6/4

21
43 CTA Example (2/3)
2 /2 2 /2
𝑔[𝑛] = 𝑥[𝑛]𝑒 −𝑗𝑤0 𝑛 𝑊 𝑛 = 𝑥[𝑛]𝑒 −𝑗𝑤0𝑛 (𝑒 −𝑗Δ𝑤 )𝑛 , 𝑛 = 0,1, . . , 𝑁 − 1
𝑁−1
2 /2 2 /2 2 /2 2 /2
𝑋(𝑒 𝑗𝑤𝑛 ) = 𝑊 𝑛 ෍ 𝑔[𝑘]𝑊 −(𝑛−𝑘) = 𝑊𝑛 ⋅ 𝑔[𝑛] ∗ 𝑊 −𝑛 , 𝑛 = 0,1,2, . . . , 𝑀 − 1
𝑘=0
2 2 2
𝑊 −𝑛 /2 = (𝑒 −𝑗Δ𝑤 ) /2
−𝑛 = 𝑒 𝑗𝑛 Δ𝑤/2 = 𝑒 𝑗(𝑛Δ𝑤/2)𝑛 = chirp signal, infinite duration
−𝑛 2 /2
⇒𝑊 is needed from only − (𝑁 + 1) 𝑡𝑜 (𝑀 − 1) in computing 𝑋(𝑒 𝑗𝑤𝑛 )
−𝑛 /2 2
⇒ Define a noncausal FIR = ℎ[𝑛] = ቊ𝑊 −(𝑁 + 1) ≤ 𝑛 ≤ (𝑀 − 1) ,
0 otherwise
2
⇒ 𝑋(𝑒 𝑗𝑤𝑛 ) = 𝑦[𝑛] = 𝑊 𝑛 /2 ⋅ (𝑔[𝑛] ∗ ℎ[𝑛]), 𝑛 = 0,1,2, . . . , 𝑀 − 1

2024/6/4

44 CTA Example (3/3)


−𝑛2 /2
Now ℎ[𝑛] noncausal FIR: ℎ[𝑛] = ቊ𝑊 −(𝑁 − 1) ≤ 𝑛 ≤ (𝑀 − 1) ,
0 otherwise
2 2
and 𝑋(𝑒 𝑗𝜔𝑛 ) = 𝑦[𝑛] = 𝑊 𝑛 /2 ⋅ (𝑔[𝑛] ∗ ℎ[𝑛]) = 𝑊 𝑛 /2 ⋅ 𝑝[𝑛], 𝑛 = 0,1,2, . . . , 𝑀 − 1
We can obtain a causal FIR system by delaying ℎ[𝑛] (𝑁 − 1) samples:
2
−(𝑛−𝑁+1) /2 0 ≤ 𝑛 ≤ (𝑁 + 𝑀 − 2)
ℎ1 [𝑛] = ቊ𝑊
0 otherwise
then 𝑔[𝑛] ∗ ℎ1 [𝑛] = 𝑝[𝑛 − 𝑁 + 1] for 𝑛 = 𝑁 − 1, 𝑁, . . . , 𝑁 + 𝑀 − 2
2
Let 𝑦1 [𝑛] = 𝑊 (𝑛−𝑁+1) /2 𝑔[𝑛] ∗ ℎ1 [𝑛] , then
2 2
𝑦1 [𝑛 + 𝑁 − 1] = 𝑊 ((𝑛+𝑁−1)−𝑁+1) /2 ⋅ 𝑝[(𝑛 + 𝑁 − 1) − 𝑁 + 1] = 𝑊 𝑛 /2 ⋅ 𝑝[𝑛] = 𝑋(𝑒 𝑗𝜔𝑛 )

2024/6/4

22
45 CTA for Causal FIR (1/3)
◼ Involves the convolution of the input signal (modulated with
a chirp) with a fixed, causal impulse response
◼ Suitable for real-time implementation
◼ Certain technologies, such as CCD and SAW devices are
particularly useful for implementing convolution with a fixed,
prespecified impulse response

2024/6/4

46 CTA for Causal FIR (2/3)


𝜔0 = 0, 𝜔𝑛 = 2𝜋𝑛/𝑁, 𝑊 = 𝑒 −𝑗2𝜋/𝑁 = 𝑊𝑁
◼ Further simplification of the 2 /2 −𝑛2 /2 −𝑁2 /2
𝑊 −(𝑛−𝑁) = 𝑊𝑁 𝑊𝑁𝑛𝑁 𝑊𝑁
CTA results when the −𝑛2 /2
frequency samples = 𝑊𝑁 for 𝑁 even
2
𝑗𝜔 ) = 𝑦[𝑛] = 𝑊 𝑛 /2 ⋅ (𝑔[𝑛] ∗ ℎ[𝑛])
correspond to the DFT, i.e., 𝑋(𝑒 𝑛
2
when 𝜔0 = 0, and 𝑊𝑁 = = 𝑊 𝑛 /2 ⋅ 𝑝[𝑛], 𝑛 = 0,1,2, . . . , 𝑀 − 1
Apply additional delay to ℎ1 [𝑛] & for 𝑁 even:
𝑒 −𝑗2𝜋/𝑁 , so that 𝜔𝑛 = 2𝜋𝑛/𝑁 −𝑛2 /2
ℎ2 [𝑛] = ൝𝑊𝑁 1 ≤ 𝑛 ≤ (𝑁 + 𝑀 − 1)
0 otherwise
⇒ 𝑔[𝑛] ∗ ℎ2 [𝑛] = 𝑝[𝑛 − 𝑁] for 𝑛
= 𝑁, . . , 𝑁 + 𝑀 − 1
2
Let 𝑦2 [𝑛] = 𝑊 (𝑛−𝑁) /2 𝑔[𝑛] ∗ ℎ2 [𝑛] , then
2
𝑦2 [𝑛 + 𝑁] = 𝑊 ((𝑛+𝑁)−𝑁) /2 ⋅ 𝑝[(𝑛 + 𝑁) − 𝑁]
2
= 𝑊 𝑛 /2 ⋅ 𝑝[𝑛] = 𝑋(𝑒 𝑗𝜔𝑛 )
2024/6/4

23
47 CTA for Causal FIR (3/3)
◼ 𝑥 𝑛 is nonzero only for 0 ≤ n ≤ 25, and we wish to compute 16
2𝜋 2𝜋𝑘
samples of the DTFT 𝑋 𝑒 𝑗𝜔 at 𝜔𝑘 = 27 + 1024 for 𝑘 = 0, 1, ⋯ , 15

2𝜋 2π 2π
⇒ 𝑀 = 16, 𝑁 = 26, 𝜔0 = , 𝜔𝑘 = + 𝑘, 𝑘 = 0,1, . . . , 15
27 27 1024
2π 2π
⇒ Δ𝜔 = , 𝑊 = 𝑒 −𝑗Δ𝜔 = 𝑒 −𝑗1024
1024
−(𝑛−25)2 /2
2 2π
−𝑗
⇒ ℎ1 [𝑛] = ቐ𝑊 −(𝑛−𝑁+1) /2= 𝑒 1024 0 ≤ 𝑛 ≤ (𝑁 + 𝑀 − 2) = 40
0 otherwise
(𝑛−𝑁+1)2 /2
𝑦1 [𝑛] = 𝑊 𝑔[𝑛] ∗ ℎ1 [𝑛] , and
2π 2π
𝑦1 [𝑛 + 𝑁 − 1] = 𝑋(𝑒 𝑗𝜔𝑛 ) ቤ𝜔𝑛 = + 𝑛, 𝑛 = 0,1, . . . , 15
27 1024
2024/6/4

24

You might also like