Description
I am trying to fit a Random Forest Regressor (sklearn.ensemble.RandomForestRegressor) 30 times on different sets of data.
To do this I use a for loop, but every time I run my script the Python kernel dies unexpectedly.
I ran this script on different machines; a more powerful machine (more RAM and CPU) only delays the moment when the kernel dies.
I wrote a minimal version of my script, without my personal data and without the other things I normally want to do. In my complete, original script, which reads the data from a CSV file and writes the predictions to another CSV file, the kernel dies even sooner.
Steps/Code to Reproduce
Example:
"""
@author: Nicolas
The goal of this script is to highlight a problem that crashes the Python
kernel on Windows and Linux machines when using RandomForestRegressor
with scikit-learn 0.18.
"""
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def main():
    NumberOfRandomForest = 30
    # We create a random forest regressor
    RFR = RandomForestRegressor(n_estimators=100, criterion='mae',
                                max_depth=None, min_samples_split=2,
                                min_samples_leaf=1)
    print("Start the for loop")
    for i in range(NumberOfRandomForest):
        print(i)
        # New random data on every iteration: 150 samples, 30 features, 10 outputs
        X = np.random.rand(150, 30)
        y = np.random.rand(150, 10)
        RFR.fit(X, y)


if __name__ == "__main__":
    main()
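Since more RAM only delays the crash, one way to check whether memory grows with each call to fit is to record the peak memory of the process after every iteration. The following is only a sketch under assumptions: the helper names `peak_rss` and `run_and_track` are hypothetical, and the standard-library `resource` module it relies on exists only on Unix, so on the Windows machine it reports `None`.

```python
try:
    import resource  # Unix only; not available on Windows
except ImportError:
    resource = None


def peak_rss():
    """Return the peak resident set size of this process, or None if unavailable.

    The unit is platform-dependent (kilobytes on Linux, bytes on macOS).
    """
    if resource is None:
        return None
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def run_and_track(step, n_iterations):
    """Call step(i) n_iterations times and record peak memory after each call."""
    history = []
    for i in range(n_iterations):
        step(i)
        history.append(peak_rss())
    return history
```

In the script above, the body of the loop (generating X and y, then calling RFR.fit) would be passed as `step`. A peak that keeps climbing from one iteration to the next would be consistent with memory not being released between fits, although this alone does not prove where it is being held.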
Versions
I tried my script on two different machines.
On the first one the script crashes after only 3 iterations, whereas on the second it reaches 25 iterations.
Setup of the first machine:
Windows-10-10.0.14393-SP0
Python 3.5.2 |Anaconda 4.2.0 (32-bit)| (default, Jul 5 2016, 11:45:57) [MSC v.1900 32 bit (Intel)]
NumPy 1.11.1
SciPy 0.18.1
Scikit-Learn 0.18
Setup of the second machine:
Linux-3.10.0-327.28.2.el7.x86_64-x86_64-with-centos-7.2.1511-Core
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 2 2016, 17:53:06)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
NumPy 1.11.1
SciPy 0.18.1
Scikit-Learn 0.18.1
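For anyone who wants to compare their own environment with the two setups above, version information of this kind can be collected with a short script. This is a minimal sketch: the helper name `report_versions` is hypothetical, and it assumes each package exposes a `__version__` attribute, which NumPy, SciPy and scikit-learn do.

```python
import importlib
import platform
import sys


def report_versions(modules=("numpy", "scipy", "sklearn")):
    """Return one line per component: platform, Python, then each module."""
    lines = [platform.platform(), "Python " + sys.version]
    for name in modules:
        try:
            module = importlib.import_module(name)
            version = getattr(module, "__version__", "unknown")
        except ImportError:
            # Report a missing package instead of aborting the whole report
            version = "not installed"
        lines.append("{} {}".format(name, version))
    return lines
```

Calling `print("\n".join(report_versions()))` prints the report, one component per line.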