8000 Float32 support factor analysis by thibsej · Pull Request #13303 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

Float32 support factor analysis #13303

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

thibsej
Copy link
Contributor
@thibsej thibsej commented Feb 27, 2019

Reference Issues/PRs

this PR works on #11000 by preserving the dtype float32 in Factor Analysis.

cross reference: #8769 (comment)

What does this implement/fix? Explain your changes.

Any other comments?

clf_64.fit(X.astype(np.float64))

# Check value consistency between types
rtol = 1e-5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test breaks for rtol=1e-6, even though atol~1e-8 in the test. Should I change the numerical consistency criteria to atol=1e-6 ?

for method in ['randomized', 'lapack']:
clf = FactorAnalysis(n_components=n_components, svd_method=method)
clf.fit(X.astype(data_type))
assert clf.components_.dtype == expected_type
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dtype is tested only for components_, but should any numpy attribute preserve the dtype float32 ? If so, should the test be modified ?



@pytest.mark.parametrize("method", ['lapack', 'randomized'])
def test_lda_numeric_consistency_float32_float64(method):
Copy link
Contributor Author
@thibsej thibsej Feb 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering #13309 I tried running SVD in np.float64 by doing in factor_analysis.py at line216

s, V, unexp_var = my_svd((X / (sqrt_psi * nsqrt)).astype(np.float64))
s, V, unexp_var = s.astype(X.dtype), V.astype(X.dtype), unexp_var.astype(X.dtype)

Instead of

s, V, unexp_var = my_svd((X / (sqrt_psi * nsqrt)))

But it does not pass the numerical consistency test. There might other sources of numerical stability between types.

@thomasjpfan
Copy link
Member

I am closing this PR because it has been superseded by #24321. Thank you for working on this issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:decomposition Superseded PR has been replace by a newer PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0