Problems with multiprocessing and numpy.fft · Issue #8140 · numpy/numpy · GitHub
Closed
@tillahoffmann

Description

I would like to compute a set of FFTs in parallel using numpy.fft.fft and multiprocessing. Unfortunately, running the FFTs in parallel results in a large kernel (sys) load.

Here is a minimal example that reproduces the problem:

import numpy as np
import scipy
import scipy.fftpack
import multiprocessing
from argparse import ArgumentParser


SIZE = 10000000


def f_numpy(i):
    # Uninitialized array; the values are irrelevant for timing purposes.
    x = np.empty(SIZE)
    np.fft.fft(x)
    return i


def f_scipy(i):
    x = np.empty(SIZE)
    # scipy.fft was a callable (an alias for scipy.fftpack.fft) at the time;
    # in modern SciPy, scipy.fft is a module and this would be scipy.fft.fft(x).
    scipy.fft(x)
    return i


def f_scipy_rfft(i):
    x = np.empty(SIZE)
    scipy.fftpack.rfft(x)
    return i


functions = {
    'numpy': f_numpy,
    'scipy': f_scipy,
    'scipy_rfft': f_scipy_rfft,
}


def __main__():
    ap = ArgumentParser('fft_test')
    ap.add_argument('--function', '-f', help='method used to calculate the fft', choices=functions, default='numpy')
    ap.add_argument('--single_core', '-s', action='store_true', help='use only a single core')
    ap.add_argument('--method', '-m', help='start method', choices=['fork', 'spawn'], default='fork')
    args = ap.parse_args()

    multiprocessing.set_start_method(args.method)

    # Show the configuration
    print("number of cores: %d" % multiprocessing.cpu_count())
    np.__config__.show()

    # Get the method
    f = functions[args.function]

    # Execute using a single core
    if args.single_core:
        for i in range(multiprocessing.cpu_count()):
            f(i)
            print(i, end=' ')
    # Execute using all cores
    else:
        pool = multiprocessing.Pool()
        for i in pool.map(f, range(multiprocessing.cpu_count())):
            print(i, end=' ')


if __name__ == '__main__':
    __main__()

Running time python fft_test.py gives me the following results:

number of cores: 48
openblas_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
openblas_lapack_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
blas_opt_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c
blas_mkl_info:
  NOT AVAILABLE
lapack_opt_info:
    library_dirs = ['/home/till/anaconda2/envs/sonalytic/lib']
    define_macros = [('HAVE_CBLAS', None)]
    libraries = ['openblas', 'openblas']
    language = c

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 

real    0m7.422s
user    0m9.830s
sys 1m26.603s

Running with a single core, i.e. python fft_test.py -s, gives

real    1m0.345s
user    0m56.558s
sys 0m2.959s
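To see where the kernel time accrues, one option (not part of the original report) is to have each worker account for its own user/system CPU time via os.times(); a minimal sketch with a reduced problem size:

```python
import multiprocessing
import os

import numpy as np

SIZE = 1_000_000  # reduced from the original 10_000_000 so the sketch runs quickly


def timed_fft(i):
    """Run one FFT and report this process's user/system CPU time for it."""
    before = os.times()
    np.fft.fft(np.empty(SIZE))
    after = os.times()
    # os.times() exposes .user and .system CPU time for the current process.
    return i, after.user - before.user, after.system - before.system


if __name__ == '__main__':
    with multiprocessing.Pool(2) as pool:
        for i, user, system in pool.map(timed_fft, range(2)):
            print('worker %d: user=%.2fs sys=%.2fs' % (i, user, system))
```

If the per-worker sys figures add up to the sys total reported by time, the kernel time is genuinely being spent inside the workers rather than in the pool machinery.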

I thought that using spawn rather than fork might resolve the problem, but I had no luck. Any idea what might cause the large amount of time spent in the kernel?
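One common first check for this symptom (a suggestion, not something from the original report) is to rule out thread oversubscription by capping the thread pools of the linked BLAS/OpenMP runtimes before numpy is imported. numpy's own FFT is single-threaded, so this mainly tests whether the linked OpenBLAS is spinning in the forked workers:

```python
# These environment variable names are the conventional ones for
# OpenBLAS/MKL/OpenMP; they must be set *before* numpy is imported.
import os

os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'

import numpy as np  # noqa: E402  (import deliberately after the env vars)

x = np.empty(1_000_000)
np.fft.fft(x)  # runs with any BLAS/OpenMP thread pools capped at one thread
```

Whether this helps depends on which library is actually responsible for the kernel time; if the sys load is unchanged, thread oversubscription can be excluded as the cause.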

I originally posted this issue on Stack Overflow but realised this may be a more appropriate place.
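For anyone reproducing this: sweeping the pool size is a quick way to check whether the overhead grows with the number of workers; a minimal sketch (reduced problem size, not from the original report):

```python
import multiprocessing
import time

import numpy as np

SIZE = 1_000_000  # reduced from the original 10_000_000 to keep the sweep short


def f(i):
    np.fft.fft(np.empty(SIZE))
    return i


if __name__ == '__main__':
    # Time the same per-worker workload at a few pool sizes; if the sys time
    # comes from contention, wall time should degrade as workers are added.
    for workers in (1, 2, 4):
        start = time.perf_counter()
        with multiprocessing.Pool(workers) as pool:
            pool.map(f, range(workers))
        print('%d workers: %.2fs' % (workers, time.perf_counter() - start))
```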
