MFranking changed the title from "Memory Leak with GradientBoostingClassifier, 'mae' criterion, and GridSearchCV" to "Memory Explosion with GradientBoostingClassifier, 'mae' criterion, and GridSearchCV" on Jan 27, 2017
Description
Memory use explodes and the system freezes when I call GridSearchCV.fit() on a GradientBoostingClassifier with n_jobs > 1, cv=ShuffleSplit, and criterion='mae' in the param grid. Changing the split criterion from 'mae' to the default ('friedman_mse') makes the problem go away. I am using a tiny dataset (<1000 data points, 11 numeric predictors, binary target).
Steps/Code to Reproduce
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, ShuffleSplit

# predictors, target: my own data (not included here), <1000 rows, 11 numeric predictors, binary target
ss = ShuffleSplit(n_splits=50, test_size=0.5)
gbc = GradientBoostingClassifier()
param_grid = {'n_estimators': [50], 'learning_rate': [0.05], 'criterion': ['mae'], 'max_depth': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 40]}
clf = GridSearchCV(gbc, param_grid=param_grid, cv=ss, scoring='precision', n_jobs=4, pre_dispatch=4)
clf.fit(predictors, target)
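Since the snippet above does not include the data, here is a minimal self-contained sketch of the configuration that does complete, i.e. the same kind of search with the default criterion. The synthetic data from `make_classification`, the helper name `run_search`, the fixed `random_state` values, and the reduced grid and split count are all assumptions made for illustration. They stand in for the original data and settings and are not part of the report.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, ShuffleSplit


def run_search(n_splits=3, depths=(1, 2, 3)):
    """Grid search with the default split criterion (hypothetical helper)."""
    # Synthetic stand-in for the reporter's data: 11 numeric predictors, binary target.
    predictors, target = make_classification(
        n_samples=500, n_features=11, n_informative=5, random_state=0
    )
    ss = ShuffleSplit(n_splits=n_splits, test_size=0.5, random_state=0)
    gbc = GradientBoostingClassifier(random_state=0)
    # 'criterion' is left out of the grid, so the estimator's default is used.
    param_grid = {
        'n_estimators': [50],
        'learning_rate': [0.05],
        'max_depth': list(depths),
    }
    # n_jobs=1 keeps this sketch single-process; the report used n_jobs=4.
    clf = GridSearchCV(gbc, param_grid=param_grid, cv=ss,
                       scoring='precision', n_jobs=1)
    clf.fit(predictors, target)
    return clf


if __name__ == '__main__':
    clf = run_search()
    print(clf.best_params_)
```

To approximate the reported setup, one would raise `n_splits` to 50, use the full `max_depth` list, and set `n_jobs=4`. Adding `'criterion': ['mae']` to the grid is what triggered the problem on scikit-learn 0.18.1.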
Expected Results
The fit should complete quickly and without trouble given my tiny data set (<100KB).
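For a sense of scale, the amount of work requested can be counted directly from the param grid and the number of splits. This is only a sketch using the standard library. The variable names are illustrative, and the extra final fit assumes GridSearchCV's default refit=True.

```python
from itertools import product

# Same grid and split count as in the reproduction code.
param_grid = {
    'n_estimators': [50],
    'learning_rate': [0.05],
    'criterion': ['mae'],
    'max_depth': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 40],
}
n_splits = 50

# Every combination of parameter values is one candidate.
candidates = list(product(*param_grid.values()))
n_candidates = len(candidates)

# Each candidate is fit once per cross-validation split.
n_cv_fits = n_candidates * n_splits

# With the default refit=True there is one more fit on the full data.
n_total_fits = n_cv_fits + 1

print(n_candidates, n_cv_fits, n_total_fits)
```

So the search amounts to 12 candidates and 600 cross-validation fits, each on roughly half of a data set with fewer than 1000 rows.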
Actual Results
Instead, the parallel workers rapidly grow in memory until they consume all 24GB of RAM on my machine.
Versions
Windows-7-6.1.7601-SP1
('Python', '2.7.11 |Anaconda 2.5.0 (64 bit)| (default, Jan 29 2016, 14:26:21) [MSC v.1500 64 bit (AMD64)]')
('NumPy', '1.11.1')
('SciPy','0.17.0')
('Scikit-Learn','0.18.1')