[MRG] Accelerate example plot_kernel_ridge_regression.py by lisacsn · Pull Request #21794 · scikit-learn/scikit-learn · GitHub
[MRG] Accelerate example plot_kernel_ridge_regression.py #21794

Closed

lisacsn
Contributor
@lisacsn lisacsn commented Nov 26, 2021

Reference Issues/PRs

References #21598

What does this implement/fix? Explain your changes.

Speed up ../examples/miscellaneous/plot_kernel_ridge_regression.py by reducing the number of samples from 10000 to 7000 for X, and 100000 to 70000 for X_plot.

Output before the changes:

And after:

Any other comments?

/

@ogrisel
Member
ogrisel commented Nov 26, 2021

The text of the analysis of the prediction time is wrong. Currently it reads:

However, prediction of 100000 target values is more than tree times faster with SVR since it has learned a sparse model using only approx. 1/3 of the 100 training datapoints as support vectors.

But both on the main branch with 100k samples and in your PR with 70k samples, the KRR model predicts faster. So we could fix the text to be something like:

The speed of prediction of SVR could in theory be 3x faster than KRR because SVR uses approximately 1/3 of the 100 training datapoints as support vectors. However, here we observe that this is not the case, probably because of implementation details (the SVR prediction code does not seem to be as well optimized as the KRR prediction code).

and reduce the prediction set to 10k samples instead (large enough to measure a timing that is not too noisy but small enough to make this example run significantly faster).
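The comparison discussed above can be reproduced in isolation with a small sketch. It is not the example script itself: the hyperparameters (`C`, `alpha`, `gamma`), the random seed, and the helper `time_predict` are assumptions chosen for illustration, whereas the example selects its parameters by search. It fits both models on 100 points, reports the fraction of training points kept as support vectors, and times prediction on a 10k-sample grid.

```python
import time

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import SVR

rng = np.random.RandomState(42)  # seed is an assumption

# Noisy sine data, 100 training points and a 10k-point prediction grid
X = 5 * rng.rand(100, 1)
y = np.sin(X).ravel()
y[::5] += 3 * (0.5 - rng.rand(X.shape[0] // 5))
X_plot = np.linspace(0, 5, 10000)[:, None]

# Fixed, illustrative hyperparameters (not tuned)
svr = SVR(kernel="rbf", C=1.0, gamma=0.1).fit(X, y)
kr = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1).fit(X, y)

# Fraction of training points retained as support vectors
sv_ratio = len(svr.support_) / X.shape[0]


def time_predict(model, X_new):
    """Hypothetical helper: return predictions and wall-clock time."""
    t0 = time.perf_counter()
    pred = model.predict(X_new)
    return pred, time.perf_counter() - t0


svr_pred, svr_time = time_predict(svr, X_plot)
kr_pred, kr_time = time_predict(kr, X_plot)

print(f"Support vector ratio: {sv_ratio:.3f}")
print(f"SVR prediction for {X_plot.shape[0]} inputs in {svr_time:.3f} s")
print(f"KRR prediction for {X_plot.shape[0]} inputs in {kr_time:.3f} s")
```

Timings depend on the machine and library versions, so a single run is only indicative; repeating the measurement and taking the median gives a steadier number.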

@lisacsn lisacsn force-pushed the accelerate_kernel_ridge_regression branch from 15f3d96 to e863ce1 Compare November 28, 2021 10:19
@lisacsn
Contributor Author
lisacsn commented Nov 28, 2021

Thank you for your comments.

If we reduce the prediction set to 10k samples, we now get this output:

SVR complexity and bandwidth selected and model fitted in 0.801 s
KRR complexity and bandwidth selected and model fitted in 0.428 s
Support vector ratio: 0.290
SVR prediction for 10000 inputs in 0.029 s
KRR prediction for 10000 inputs in 0.069 s
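As a quick check on the numbers reported above, the observed prediction speed-up can be set against the rough upper bound suggested by the support vector ratio. This is plain arithmetic on the logged values; the variable names are illustrative, and treating `1 / ratio` as the theoretical speed-up is the simplifying assumption from the earlier review comment.

```python
# Values copied from the output above
svr_fit, krr_fit = 0.801, 0.428
svr_predict, krr_predict = 0.029, 0.069
support_vector_ratio = 0.290

# How much faster SVR predicts than KRR in this run
observed_speedup = krr_predict / svr_predict

# Rough bound if prediction cost scaled only with the number of support vectors
theoretical_speedup = 1 / support_vector_ratio

# How much slower SVR is to fit than KRR in this run
fit_slowdown = svr_fit / krr_fit

print(f"Observed SVR prediction speed-up: {observed_speedup:.2f}x")
print(f"Bound from support vector ratio:  {theoretical_speedup:.2f}x")
print(f"SVR fit time relative to KRR:     {fit_slowdown:.2f}x")
```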

@adrinjalali adrinjalali mentioned this pull request Nov 29, 2021
41 tasks
@adrinjalali adrinjalali changed the title [MRG] Accelerate example plot_kernel_ridge_regression [MRG] Accelerate example plot_kernel_ridge_regression.py Nov 29, 2021
Member
@thomasjpfan thomasjpfan left a comment

Thanks for the PR!

@@ -125,7 +127,7 @@
 plt.figure()

 # Generate sample data
-X = 5 * rng.rand(10000, 1)
+X = 5 * rng.rand(7000, 1)
The final point on the graph is 10**4 because of:

sizes = np.logspace(1, 4, 7).astype(int)

below. If we want the final point to end with 10**4, then I think we need to keep this at 10000.
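The concern can be illustrated with a short sketch. It assumes each entry of `sizes` is used as a slice bound on `X`; the seed and the names `X_full`, `X_small`, `rows_full`, and `rows_small` are hypothetical. NumPy slicing past the end of an array does not raise, so with 7000 rows the last point would silently use fewer samples than its label implies.

```python
import numpy as np

rng = np.random.RandomState(42)  # seed is an assumption

# Training-set sizes, log-spaced between 10**1 and 10**4
sizes = np.logspace(1, 4, 7).astype(int)

X_full = 5 * rng.rand(10000, 1)   # size on the main branch
X_small = 5 * rng.rand(7000, 1)   # size proposed in this PR

# Rows actually available when each size is used as a slice bound
rows_full = [X_full[:n].shape[0] for n in sizes]
rows_small = [X_small[:n].shape[0] for n in sizes]

print("requested sizes:  ", sizes.tolist())
print("rows with 10000 X:", rows_full)
print("rows with 7000 X: ", rows_small)
```

With the smaller array, the last entry of `rows_small` is capped at 7000 even though the requested size is larger, which is why the upper bound of `sizes` and the length of `X` need to stay in step.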

@cmarmo
Contributor
cmarmo commented Aug 2, 2022

plot_kernel_ridge_regression.py has been accelerated in #21791.
I'm closing this pull request.

@cmarmo cmarmo closed this Aug 2, 2022