-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
DOC fix random_state in several example for reproducibility #27153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC fix random_state in several example for reproducibility #27153
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM.
noisy_moons = datasets.make_moons(n_samples=n_samples, noise=0.05) | ||
blobs = datasets.make_blobs(n_samples=n_samples, random_state=8) | ||
no_structure = np.random.rand(n_samples, 2), None | ||
seed = 170 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it's a number, I'd just use the number in multiple places rather than having a variable (and I'd use 42 to feel better about it 😁 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried 42 immediately, but it has a radically different visual output then (on the image below). You are right about the number vs. variable, I just package it like this for these seed explorations, to get a similar output as the original as discussed in #26976. I'll revert them when I know the number again. 170 was originally used, so I thought it's better to keep it as the easiest solution. What do you think when looking at the image below compared to the original and the one with 170 as the seed value?

There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, ok then, we can keep the 170.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, variable removed in 5e4f187, 170 kept as the number. It is a tricky balance with the seed values it seems :)
Thanks @TamaraAtanasoska LGTM |
Reference Issues/PRs
Fixes a part of #17568.
What does this implement/fix? Explain your changes.
This PR introduces minor changes in three files:
examples/cluster/plot_linkage_comparison.py
examples/preprocessing/plot_all_scaling.py
examples/preprocessing/plot_discretization_classification.py
In the later two files, just one algorithm in each was missing a
random_state
parameter. The changes are minorAny other comments?
An updated task list of images/files to address is found at the bottom of #17568, see: #17568 (comment). Some files are newly marked as done, but they aren't part of this PR. This is because the
random_state
was already implemented in all the relevant places.@glemaitre @adrinjalali please take a look 👋