ENH Improve speed plot_adaboost_multiclass.py · Pull Request #21651 · scikit-learn/scikit-learn · GitHub

ENH Improve speed plot_adaboost_multiclass.py #21651

Merged
TomDLT merged 2 commits into scikit-learn:main from the speed_increased_example_adaboost branch on Nov 16, 2021

Conversation

@ghost ghost commented Nov 13, 2021

#21598 @TomDLT @adrinjalali

Reduced the number of estimators a bit and also decreased the number of splits.
[Figures: adaboost_before, adaboost_after]

@adrinjalali changed the title from "Changed the estimator numbers to make example quicker in execution" to "Changed the estimator numbers to make plot_adaboost_multiclass.py quicker in execution" Nov 13, 2021
@adrinjalali adrinjalali mentioned this pull request Nov 13, 2021
Member
@adrinjalali adrinjalali left a comment

Does this now reproduce the original chart (i.e. Figure 1 of Zhu et al.)? Please check what the text refers to and what the original chart is supposed to look like, and adapt the text accordingly if necessary.

Comment on lines 40 to 43
n_samples=13000, n_features=10, n_classes=3, random_state=1
)

-n_split = 3000
+n_split = 2000
Member

This does not reduce the number of splits; `n_split` is the size of the training set. You can instead also make the training set smaller, and use `train_test_split` to keep about 50% of the data for training.
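
The suggestion above could be sketched roughly as follows. This is only an illustration, not the code that was merged: it assumes the example builds its data with `make_gaussian_quantiles`, and the reduced `n_samples=2000` is a hypothetical value chosen here for speed.

```python
from sklearn.datasets import make_gaussian_quantiles
from sklearn.model_selection import train_test_split

# Assumption: a smaller dataset than the 13000 samples in the diff above;
# 2000 is a hypothetical value, not the one used in the PR.
X, y = make_gaussian_quantiles(
    n_samples=2000, n_features=10, n_classes=3, random_state=1
)

# Keep about 50% of the samples for training instead of slicing at n_split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.5, random_state=1
)

print(X_train.shape, X_test.shape)
```

With this approach the training size follows from the dataset size and the split ratio, so there is no separate `n_split` constant to keep in sync.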

Author

How does this affect the execution time, then, if it only changes the ratio of training to test data?

Author

I changed it back from 2000 to the original 3000. Even with only the n_estimators parameters changed, the execution time drops from 16 sec to 7 sec.
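
A rough way to check how `n_estimators` drives fit time is sketched below. It is not the example script itself: the dataset size, tree depth, estimator counts, and the helper `time_fit` are all assumptions made for illustration, and the timings will differ from the 16 sec and 7 sec reported above.

```python
import time

from sklearn.datasets import make_gaussian_quantiles
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: a small dataset so the sketch runs quickly.
X, y = make_gaussian_quantiles(
    n_samples=1000, n_features=10, n_classes=3, random_state=1
)


def time_fit(n_estimators):
    """Hypothetical helper: fit AdaBoost and return (seconds, trees fitted)."""
    clf = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=2),  # weak learner, passed positionally
        n_estimators=n_estimators,
        random_state=1,
    )
    start = time.perf_counter()
    clf.fit(X, y)
    elapsed = time.perf_counter() - start
    # Boosting can stop early, so fewer than n_estimators trees may be fitted.
    return elapsed, len(clf.estimators_)


for n in (10, 50):
    elapsed, n_fitted = time_fit(n)
    print(f"n_estimators={n}: {n_fitted} trees fitted in {elapsed:.3f} s")
```

Each boosting round fits one more weak learner, so fit time grows roughly with the number of rounds actually run.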

@adrinjalali
Member

Is it still reproducing the aforementioned figure from the original article? In terms of required input for instance.

@ghost
Author
ghost commented Nov 15, 2021

@adrinjalali Yes, it still looks extremely similar.
[Figures: bf (before), af (after)]

@TomDLT changed the title from "Changed the estimator numbers to make plot_adaboost_multiclass.py quicker in execution" to "ENH Improve speed plot_adaboost_multiclass.py" Nov 16, 2021
@TomDLT TomDLT merged commit 90a202d into scikit-learn:main Nov 16, 2021
@TomDLT
Member
TomDLT commented Nov 16, 2021

Thanks @sveneschlbeck!

@ghost ghost deleted the speed_increased_example_adaboost branch November 16, 2021 17:33
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 22, 2021
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 29, 2021
samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021