NumPy "dot" hangs when used with multiprocessing (potentially Apple Accelerate related?) #5752
Yes, you basically can't use multiprocessing in fork mode if you're using Accelerate. Workarounds:
My guess is you could get a reliable reproducer by first dot'ing some largish matrices in the parent process (so Accelerate has already spun up its internal state) and then calling dot again inside the forked workers.
|
I'm guessing that you upgraded numpy recently, using wheels? The Numpy 1.9.0 wheel used ATLAS, and was safe for multiprocessing (but slower in general). 1.9.1 and 1.9.2 wheels use Accelerate again. |
@matthew-brett I don't think I remember the reason for that switch, can you remind me? |
I know that we're still in the shake down period on the wheel builds, but I
|
Accelerate |
Thanks for the insight, all. We'll have to do some thinking about a workaround. (Or wait for our institute to switch to Python 3.X...) I'm fine recompiling NumPy with a different linear algebra lib for my own use, but the project I'm working on is intended to be a 'pip install'-able package for anyone to use. We might have to detect the platform and fail with an exception when multiprocessing && Apple Accelerate are both in use. |
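A rough sketch of that kind of guard, for anyone wanting to fail early: it relies on the blas_opt_info attribute that NumPy builds of this era expose in numpy.__config__, and the helper names are made up for illustration.

```python
import platform
import numpy as np

def _numpy_uses_accelerate():
    # Heuristic: Accelerate-linked builds mention the framework in their
    # recorded link flags; fall back to an empty dict on newer NumPy.
    info = getattr(np.__config__, "blas_opt_info", {})
    blob = " ".join(str(v) for v in info.values()).lower()
    return "accelerate" in blob or "veclib" in blob

def check_fork_safety():
    if platform.system() == "Darwin" and _numpy_uses_accelerate():
        raise RuntimeError(
            "NumPy is linked against Apple Accelerate, which is not fork-safe; "
            "refusing to use fork-based multiprocessing (see numpy/numpy#5752)."
        )
```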
@matthew-brett Is speed the main reason not to use ATLAS for the main public wheel builds? I'm not sure how the needs of NumPy users break down, but I'd like to put in one vote for a more conservative / compatible wheel. |
I tend to agree. Just sent an email to the list to see if we can get
|
@ogrisel - any opinion on this one for sklearn? |
By the way - when I said it wasn't that hard to build ATLAS wheels, I meant that we (numpy) could build some ATLAS wheels and supply them somewhere as an option. |
By default, Python multiprocessing does a fork without exec. I issued a patch to OpenBLAS in the past to make its non-OpenMP thread pool fork-safe. It would be interesting to retry whether this fix still works with the latest version of OpenBLAS. As OpenBLAS is quite fast, we could use it for the wheels (if all numpy + scipy tests pass under OSX with OpenBLAS). ATLAS is robust by default too. In Python 3.4+, multiprocessing has a forkserver start method that avoids this problem. |
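For reference, a minimal sketch of opting into the forkserver start method on Python 3.4+ (the worker function and array sizes are arbitrary):

```python
import multiprocessing as mp
import numpy as np

def work(seed):
    # Each task does a BLAS-backed matrix product.
    rng = np.random.RandomState(seed)
    a = rng.rand(200, 200)
    return float(np.dot(a, a).sum())

if __name__ == "__main__":
    # forkserver (Python 3.4+) starts workers from a clean helper process,
    # so they do not inherit Accelerate's state from a forked parent.
    ctx = mp.get_context("forkserver")
    with ctx.Pool(4) as pool:
        print(pool.map(work, range(8)))
```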
As a status update to @josePhoenix's original problem, I've just updated our code to use the forkserver start method when available. (And as a bonus this now gives us a concrete reason to tell the users of our library "hey, you should update to Python 3.4" :-) |
Can this be closed now? |
From our (@josePhoenix and my) perspective, I think yes it can be closed. The discussion above made it sound like there were broader interests in having ATLAS wheels as well as ones with Accelerate, but that probably ought to be its own separate issue if it's going to be pursued. That's up to you all! Thanks again. |
@mperrin OK, thanks for the feedback. |
Hi all, sorry, but is the only solution here to switch to python3? I am mid-project and I really need both np.dot() and multiprocessing. A bit concerned about switching to python3 mid-project; any thoughts would be a huge help! |
From our experience with this bug: adding support for Python 3.4+ (with the 'forkserver' method) was relatively painless, and resolved this specific issue. Our codebase wasn't specifically written to be forward-compatible, but it was more or less best-practices Python 2.7 code which ported rather easily. If you try that, and it proves to be a morass, you could always do the poor man's multiprocessing: make your individual work units things that you can invoke with a command-line call and run them as separate processes (e.g. via subprocess). |
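A sketch of that poor man's multiprocessing, assuming a hypothetical compute_one.py worker script:

```python
import subprocess

# Each invocation computes one unit of work in a fresh interpreter (and thus
# fresh Accelerate state) and writes its result to disk for later collection.
jobs = [["python", "compute_one.py", "--index", str(i)] for i in range(8)]
procs = [subprocess.Popen(cmd) for cmd in jobs]
for p in procs:
    p.wait()
```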
So the recommendation is python3. We haven't done anything terribly exotic so hopefully we will have decent forward compatibility. Thanks |
The other option would be to build numpy against a different blas library. E.g., openblas (what we use by default on Windows) doesn't have this problem. |
Err, use by default on Linux I mean. And hopefully Windows soon too. |
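For anyone rebuilding NumPy from source against OpenBLAS, a sketch of the relevant site.cfg section (place it next to numpy's setup.py before building; the install paths are assumptions, adjust them to your OpenBLAS location):

```
[openblas]
libraries = openblas
library_dirs = /opt/OpenBLAS/lib
include_dirs = /opt/OpenBLAS/include
```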
Jumping on late -- but if you're comfortable with dummy (thread-backed) processes, switching my Pool to a multiprocessing.dummy Pool got around this issue for me. Found here:
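A sketch of that workaround: multiprocessing.dummy exposes the same Pool API but runs the workers as threads in one process, so there is no fork for Accelerate to trip over. Since the heavy lifting happens inside the BLAS call, which releases the GIL, a thread pool can still give a real speedup for dot-heavy work.

```python
from multiprocessing.dummy import Pool  # thread-backed drop-in for Pool
import numpy as np

def work(n):
    a = np.random.rand(n, n)
    return float(np.dot(a, a).sum())

pool = Pool(4)
print(pool.map(work, [100] * 8))
pool.close()
pool.join()
```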
BTW, if you want to use a thread pool I would recommend using the more modern concurrent.futures API (ThreadPoolExecutor). |
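The same idea expressed with that API, as a sketch (standard library in Python 3.2+):

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def work(n):
    a = np.random.rand(n, n)
    return float(np.dot(a, a).sum())

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(work, [100] * 8))
print(results)
```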
@ogrisel yeah totally! Definitely don't think it's a solution for all cases, but with mine (specifically just trying to get a "cancellable" python process) it worked well enough. Thanks for the tip! I'll look at switching over |
is this fixed? |
@zerodrift we switched our code to use the forkserver start method (Python 3.4+). |
mine outright crashes... is it related?

Process: Python [92939]
Date/Time: 2017-01-24 11:25:40.057 -0600
Sleep/Wake UUID: 12153DD9-1ECD-4124-A226-A26A3DC3F874
Time Awake Since Boot: 410000 seconds
System Integrity Protection: enabled

Crashed Thread: 0  Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Termination Signal: Segmentation fault: 11
VM Regions Near 0x110:
Application Specific Information: |
It can be related. For Python 2.7 users (and Python 3 users as well), we started an alternative module to safely work with sub-processes without fearing those kinds of crashes and hangs: https://github.com/tomMoral/loky It's still in beta, but we would be glad to get your feedback as GitHub issues. |
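A sketch based on loky's reusable-executor entry point as advertised by the project (check the loky docs for the current interface):

```python
from loky import get_reusable_executor  # pip install loky
import numpy as np

def work(n):
    a = np.random.rand(n, n)
    return float(np.dot(a, a).sum())

executor = get_reusable_executor(max_workers=4)
print(list(executor.map(work, [100] * 8)))
```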
…Still falls over from RBF_SVC and Ridge due to: numpy/numpy#5752
setting:
fixed the issue for me |
I'm having a devil of a time making a minimal test case, but this seems to be my issue: http://stackoverflow.com/questions/23963997/python-child-process-crashes-on-numpy-dot-if-pyside-is-imported
In my case, I have code that farms out a bunch of calculations, including matrix products, to a multiprocessing.Pool with pool.map. The computation hangs partway through, and some hacky print-based debugging shows it hanging on a call to np.dot down in the guts of the program. Replacing pool.map with the built-in (serial) map makes everything work.

I used not to have this issue, then something changed (lunar eclipse?) and now my computation hangs consistently whenever multiprocessing is used. A minimal test case continues to elude me. (It's not enough to simply generate 10 random NxN arrays and dot them in a multiprocessing-based way.)
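For readers looking for the shape of the failing pattern, here is an illustrative sketch, not a confirmed reproducer; it combines the pattern above with the earlier suggestion to call dot in the parent before forking, and the sizes and worker counts are arbitrary:

```python
import multiprocessing as mp
import numpy as np

def work(seed):
    rng = np.random.RandomState(seed)
    a = rng.rand(500, 500)
    # On an Accelerate-linked NumPy this dot may hang inside a forked worker.
    return float(np.dot(a, a).sum())

if __name__ == "__main__":
    warm = np.random.rand(500, 500)
    np.dot(warm, warm)      # exercise Accelerate in the parent before forking
    pool = mp.Pool(4)       # fork is the default start method on Unix
    print(pool.map(work, range(8)))
    pool.close()
    pool.join()
```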