8000 Merge pull request #11888 from mreineck/add_pocketfft · numpy/numpy@bae3cfa · GitHub
[go: up one dir, main page]

Skip to content

Commit bae3cfa

Browse files
authored
Merge pull request #11888 from mreineck/add_pocketfft
WIP: Add pocketfft sources to numpy for testing, benchmarks, etc.
2 parents 3abb5f8 + 0fe2a58 commit bae3cfa

12 files changed

+2514
-2179
lines changed

doc/release/1.17.0-notes.rst

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@ NumPy 1.17.0 Release Notes
66
Highlights
77
==========
88

9+
* NumPy's FFT implementation has switched to pocketfft
910

1011
New functions
1112
=============
@@ -38,6 +39,15 @@ New Features
3839
Improvements
3940
============
4041

42+
replacement of the `fftpack`-based FFT module by the `pocketfft` library
43+
------------------------------------------------------------------------
44+
Both implementations have the same ancestor (Fortran77 `FFTPACK` by Paul N.
45+
Swarztrauber), but `pocketfft` contains additional modifications which
46+
improve both accuracy and performance in some circumstances. For FFT lengths
47+
containing large prime factors, `pocketfft` uses Bluestein's algorithm, which
48+
maintains `O(N log N)` run time complexity instead of deteriorating towards
49+
`O(N*N)` for prime lengths. Also, accuracy for real-valued FFTs with near-prime
50+
lengths has improved and is on par with complex-valued FFTs.
4151

4252
Changes
4353
=======

numpy/fft/README.md

Lines changed: 53 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,53 @@
1+
PocketFFT
2+
---------
3+
4+
This is a heavily modified implementation of FFTPack [1,2], with the following
5+
advantages:
6+
7+
- strictly C99 compliant
8+
- more accurate twiddle factor computation
9+
- very fast plan generation
10+
- worst case complexity for transform sizes with large prime factors is
11+
`N*log(N)`, because Bluestein's algorithm [3] is used for these cases.
12+
13+
License
14+
-------
15+
16+
3-clause BSD (see LICENSE.md)
17+
18+
19+
Some code details
20+
-----------------
21+
22+
Twiddle factor computation:
23+
24+
- making use of symmetries to reduce number of sin/cos evaluations
25+
- all angles are reduced to the range `[0; pi/4]` for higher accuracy
26+
- an adapted implementation of `sincospi()` is used, which actually computes
27+
`sin(x)` and `(cos(x)-1)`.
28+
- if `n` sin/cos pairs are required, the adjusted `sincospi()` is only called
29+
`2*sqrt(n)` times; the remaining values are obtained by evaluating the
30+
angle addition theorems in a numerically accurate way.
31+
32+
Parallel invocation:
33+
34+
- Plans only contain read-only data; all temporary arrays are allocated and
35+
deallocated during an individual FFT execution. This means that a single plan
36+
can be used in several threads at the same time.
37+
38+
Efficient codelets are available for the factors:
39+
40+
- 2, 3, 4, 5, 7, 11 for complex-valued FFTs
41+
- 2, 3, 4, 5 for real-valued FFTs
42+
43+
Larger prime factors are handled by somewhat less efficient, generic routines.
44+
45+
For lengths with very large prime factors, Bluestein's algorithm is used, and
46+
instead of an FFT of length `n`, a convolution of length `n2 >= 2*n-1`
47+
is performed, where `n2` is chosen to be highly composite.
48+
49+
50+
[1] Swarztrauber, P. 1982, Vectorizing the Fast Fourier Transforms
51+
(New York: Academic Press), 51
52+
[2] https://www.netlib.org/fftpack/
53+
[3] https://en.wikipedia.org/wiki/Chirp_Z-transform

numpy/fft/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
# To get sub-modules
44
from .info import __doc__
55

6-
from .fftpack import *
6+
from .pocketfft import *
77
from .helper import *
88

99
from numpy._pytesttester import PytestTester

0 commit comments

Comments
 (0)
0