-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
ENH Add eigh
solver to FastICA
#22527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Implementation was changed to account for the fact that now `X` is transposed and accessed via `XT`
Doing some more thorough benchmarks right now. Current insights:
In the end, I think it's fine to support
8000
this, but the default should definitely be To reproduce the benchmarks yourself, install Raw markdown table (100 rows) of runtime ratios (sorted)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's best to include the raw numbers in a gist, so others can look and make graphs themselves to analyze the results. You can also include benchmark scripts in a gist, etc.
runtime ratios (sorted)
Assuming the ratio is eigh runtime / svd runtime, I think absolute values matter here. 0.1 sec vs 0.2 sec is very different from 10 secs vs 20 seconds.
What does the runtime performance look like for eigh when svd spends >= 1 seconds for the same data?
sklearn/decomposition/_fastica.py
Outdated
@@ -547,9 +561,22 @@ def g(x, fun_args): | |||
XT -= X_mean[:, np.newaxis] | |||
|
|||
# Whitening and preprocessing by PCA | |||
u, d, _ = linalg.svd(XT, full_matrices=False, check_finite=False) | |||
if self.svd_solver == "eigh": | |||
D, u = linalg.eigh(X.T.dot(X)) # Faster when n < p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The overhead of eigh
comes from the matrix multiply and calling the eigh
solver. From your benchmarks, it may only be better for n_samples >> n_features.
Also there should be memory overhead for storing the result of the matrix multiply.
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…into change_svd
Here's the gist for the benchmarks -- it includes the generating script and CSV of the results.
Info on runs where svd time >=1 second
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the benchmarks, it looks like the only advantage of eigh is for n_samples >> n_features
. For future reference, it's good to do order of magnetite jumps when bench marking. AKA 100, 1_000, 10_000, etc. Comparing 10 and 20 is not as interesting.
What is the memory usage like for (10000, 1000)
between the solvers?
I tried to stay consistent with how it's calculated in More coarse benchmark results can be found here |
eigh
solver to FastICA
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI failures look related
sklearn/decomposition/_fastica.py
Outdated
# Give consistent eigenvectors for both svd solvers | ||
u *= np.sign(u[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it seem a bit too much to give control to this flipping through a parameter?
This is feels like a major change to me, so I am okay with a parameter to control this.
Although, the sign flipping logic is a little different from svd_flip. This implementation forces the first row to the positive, which works for all seeds in the test_fastica_simple_different_solvers
test. We tried the sign flipping logic in svd_flip, but it did not work for some seeds in test_fastica_simple_different_solvers
.
@Micky774 Can you investigate why svd_flip like logic did not work? (I have a snippet here: #22527 (comment))
OK let's go for adding a new parameter. We can always revisit this choice in the future if we wish to change the behaviour and deprecate this parameter. |
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
…into change_svd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM.
LGTM. Thanks @Micky774 |
Reference Issues/PRs
Picks up stalled PR #11860
What does this implement/fix? Explain your changes.
PR #11860: Provides an alternative implementation that avoids SVD's extraneous calculations. Especially effective when
num_samples >> num_features
This PR:
fastica
Any other comments?
Ongoing problems/consideration:
eigh
8000 solver is failingNeeds follow-up PRs for adding
whiten_solver="auto"
and beginning deprecation to mark it as default.May need a follow-up PR for changing the default value of
sign_flip
toTrue
.