Isolation forest final stage very slow and single threaded #13295
Comments
Perhaps related: #13260?
@ngoix is it possible to run without calling score_samples? For my purposes I actually just need the decision function, not the full |
Unfortunately it is not.
I phrased it incorrectly, I'm just wondering if it's possible during |
You can achieve this by setting |
Perfect. Thanks!
I'll close this given the solutions presented, the work merged since, and the work under way. Let me know if that is a mistake.
Description
Isolation forest final stage very slow and single threaded.
This is an issue I hit quite frequently. I'll train an isolation forest on a fairly large data set (on the order of 1M to 100M records, around 50 features), and it will run rapidly and in parallel with nearly 100% CPU utilization. I'll get output like the following:
And then it will run for a very long time (10x as long? more?) on a single core, and eventually finalize. Often I'll get progress statements all printed simultaneously at the end when the task completes:
I presume that's from parallel processes or threads printing to stdout without flushing.
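To narrow down where the time goes, it can help to time tree building and scoring separately. Below is a minimal sketch, not the code from this report: the synthetic data, the helper name `time_fit_and_score`, and all parameter values are assumptions chosen for illustration. Depending on the scikit-learn version and the `contamination` setting, `fit` may itself score the training data, so the two timings can overlap.

```python
import time

import numpy as np
from sklearn.ensemble import IsolationForest


def time_fit_and_score(n_samples=20000, n_features=50, n_estimators=100,
                       n_jobs=-1, seed=0):
    """Time fit() and score_samples() separately on synthetic data.

    Hypothetical helper: the data shape and parameters are placeholders,
    not the values used in the original report.
    """
    rng = np.random.RandomState(seed)
    X = rng.normal(size=(n_samples, n_features))

    model = IsolationForest(n_estimators=n_estimators, n_jobs=n_jobs,
                            random_state=seed)

    # Phase 1: fitting (builds the trees; may also score X internally).
    start = time.perf_counter()
    model.fit(X)
    fit_seconds = time.perf_counter() - start

    # Phase 2: explicit scoring of every sample against every tree.
    start = time.perf_counter()
    scores = model.score_samples(X)
    score_seconds = time.perf_counter() - start

    return fit_seconds, score_seconds, scores


if __name__ == "__main__":
    fit_s, score_s, _ = time_fit_and_score()
    print("fit: %.2fs, score_samples: %.2fs" % (fit_s, score_s))
```

Watching CPU utilization while each phase runs should show whether the slow, single-core stretch lines up with scoring rather than with tree construction.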
I create the isolation forest with:
Versions
System:
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
executable: /home/ibackus/anaconda3/bin/python
machine: Linux-4.15.0-1032-aws-x86_64-with-debian-buster-sid
BLAS:
macros: SCIPY_MKL_H=None, HAVE_CBLAS=None
lib_dirs: /home/ibackus/anaconda3/lib
cblas_libs: mkl_rt, pthread
Python deps:
pip: 18.1
setuptools: 40.6.3
sklearn: 0.20.2
numpy: 1.15.4
scipy: 1.2.1
Cython: 0.29.2
pandas: 0.24.1