
ENH Array API support for PCA #26315

Merged · 79 commits · Jul 13, 2023

Conversation

@mtsokol (Contributor) commented May 1, 2023

Reference Issues/PRs

Based on #25956

What does this implement/fix? Explain your changes.

This PR adds PyTorch support (via array_api_compat) for PCA. It routes heavy operations (e.g. SVD) to the proper backend, based on the type of the array passed in. A unit test is also added to assert that the PyTorch output matches the NumPy one.

Solver support:

  Solver      NumPy   PyTorch
  full        yes     yes
  randomized  yes     needs refactoring to adapt randomized_svd to the Array API
  arpack      yes     no

The arpack solver uses svds, which PyTorch does not provide (the closest method I found is torch.svd_lowrank, which is meant for sparse matrices but only computes an approximation). Should I just raise an exception with a proper description, stating that arpack is not supported for PyTorch tensors?

A similar problem occurs for the randomized solver. randomized_svd uses an lu decomposition whose API differs between SciPy and PyTorch: the implementation passes permute_l=True, which makes SciPy return (PL, U) instead of (P, L, U). The PyTorch implementation does not support this parameter, and numpy.linalg does not provide lu at all. Supporting randomized would therefore require explicitly checking whether the PyTorch backend is in use.
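For concreteness, a minimal sketch of the API mismatch (assuming SciPy is available; the torch behaviour is described in comments rather than executed):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# SciPy: permute_l=True folds the permutation into L, returning only (PL, U),
# which is the form randomized_svd relies on.
pl, u = lu(A, permute_l=True)
assert np.allclose(pl @ u, A)

# Without permute_l, SciPy returns all three factors (P, L, U).
p, l_factor, u2 = lu(A)
assert np.allclose(p @ l_factor @ u2, A)

# torch.linalg.lu, by contrast, always returns the (P, L, U) triple and has
# no permute_l equivalent, so a caller would have to compute P @ L itself.
```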

Any other comments?

Please share your feedback!

TODO

  • Improve test coverage for error messages for the unsupported cases
  • Manually run the cupy and torch tests on a machine with cuda
  • Measure performance impact
    • on CPU with torch
    • on GPU with torch
    • on GPU with cupy
  • Rework the tooling to make it possible to test estimator specific methods (e.g. the get_covariance / get_precision methods)

Benchmark results

Data shape (500000, 1000) and dtype float32 of size 2000.0 MB
PCA(n_components=5, svd_solver='randomized', power_iteration_normalizer='QR')
Fitting PCA(n_components=5) with numpy took 42.591s
Fitting PCA(n_components=5) with numpy and n_threads=1 took 18.953s
Fitting PCA(n_components=5) with numpy and n_threads=4 took 44.034s
Fitting PCA(n_components=5) with torch on CPU took 4.163s
Fitting PCA(n_components=5) with torch on GPU took 0.888s
Fitting PCA(n_components=5) with cupy on GPU took 0.934s

=> numpy with MKL has a thread-related performance problem with float32 data!

EDIT: I tried with OpenBLAS and the numpy code runs in 6 to 8s (not exactly the same machine though). So there is definitely a problem between numpy and MKL on float32 data for this workload.

Data shape (500000, 1000) and dtype float64 of size 4000.0 MB
PCA(n_components=5, svd_solver='randomized', power_iteration_normalizer='QR')
Fitting PCA(n_components=5) with numpy took 6.847s
Fitting PCA(n_components=5) with numpy and n_threads=1 took 31.415s
Fitting PCA(n_components=5) with numpy and n_threads=4 took 12.627s
Fitting PCA(n_components=5) with torch on CPU took 4.229s
Fitting PCA(n_components=5) with torch on GPU took 0.912s
Fitting PCA(n_components=5) with cupy on GPU took 0.412s
Data shape (500000, 1000) and dtype float32 of size 2000.0 MB
PCA(n_components=5, svd_solver='full')
Fitting PCA(n_components=5) with numpy took 24.863s
Fitting PCA(n_components=5) with torch on CPU took 8.832s
Fitting PCA(n_components=5) with torch on GPU took 1.513s
Fitting PCA(n_components=5) with cupy on GPU took 4.109s
Fitting PCA(n_components=5) with cupy with cuML on GPU took 0.683s

Environment:

[{'filepath': '/data/parietal/store3/work/ogrisel/mambaforge/envs/py310/lib/libomp.so',
  'internal_api': 'openmp',
  'num_threads': 48,
  'prefix': 'libomp',
  'user_api': 'openmp',
  'version': None},
 {'filepath': '/data/parietal/store3/work/ogrisel/mambaforge/envs/py310/lib/libmkl_rt.so.2',
  'internal_api': 'mkl',
  'num_threads': 48,
  'prefix': 'libmkl_rt',
  'threading_layer': 'gnu',
  'user_api': 'blas',
  'version': '2022.1-Product'}]

This machine has a 48 physical core CPU and a NVIDIA A100 GPU.

Benchmark script:

@mtsokol force-pushed the feature/array_api_compat_pca branch from 44d5d75 to ceb10e3 on May 3, 2023 11:43
@mtsokol mtsokol marked this pull request as ready for review May 3, 2023 11:57
@mtsokol (Contributor, Author) commented May 3, 2023

One more question, regarding the np.flat attribute. In the get_covariance method, .flat is used to modify the array's diagonal. PyTorch offers torch.Tensor.view for that purpose, but NumPy's view is completely different (it only reinterprets the data under another dtype).

I used xp.reshape to access the diagonal, as it conforms to both libraries, but it might copy the array (as stated explicitly in the docs).

Should I use np.flat and torch.view and check the array backend explicitly in the code? (That is not a desirable approach.) Or do you know another way to access the array's diagonal?
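To make the options concrete, here is a small NumPy-only sketch of both idioms (the variable names are illustrative, not the ones in get_covariance):

```python
import numpy as np

n = 4
noise = 0.5

# Current NumPy idiom: write to the diagonal through the .flat iterator,
# stepping n + 1 elements to hit positions 0, n + 1, 2 * (n + 1), ...
cov = np.zeros((n, n))
cov.flat[:: n + 1] = noise

# Array-API-friendly alternative: reshape to 1-D and use the same strided
# slice. For a C-contiguous array np.reshape returns a view, so the
# assignment is reflected in cov2; a generic xp.reshape is allowed to copy,
# which is exactly the concern raised above.
cov2 = np.zeros((n, n))
flat = np.reshape(cov2, (-1,))
flat[:: n + 1] = noise

assert np.array_equal(cov, cov2)
```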

@ogrisel (Member) left a comment

Thanks for the PR, here is a first batch of feedback.

Out of curiosity did you run some benchmarks both on a CPU and on a GPU?

Also, please avoid assuming that get_namespace will only be used to add torch support; let's strive to add generic Array API support as much as possible by default. In particular, it would be great to make this PR generic enough to also work with CuPy (and to test it accordingly).

I used xp.reshape, as it conforms to both libraries to access the diagonal, but it might cause array copying (stated explicitly in docs).

Should I use np.flat and torch.view and check array type backend explicitly in code? (But it's not desired approach). Or do you know another way for accessing array's diagonal?

It's probably acceptable to have namespace-specific code paths for performance reasons, with a fallback to the default xp.reshape. However, we should try to factor those namespace-specific hacks into a common private module reused by all scikit-learn estimators that need them.

(review comments on sklearn/decomposition/_pca.py and sklearn/decomposition/tests/test_pca.py, resolved)
@ogrisel (Member) commented May 5, 2023

/cc @thomasjpfan who might be able to provide more precise guidance.

@ogrisel (Member) commented May 5, 2023

Also @mtsokol be aware that there are related concurrent discussions in other issues that might be of interest for this PR. I tried to label all those issues and PRs with the "Array API" label:

Since Array API support is still quite new and experimental we should try to coordinate between the work of different PRs to make consistent design decisions across estimators.

@mtsokol (Contributor, Author) commented May 5, 2023

Hi @ogrisel,

Thank you for your feedback! All comments are clear to me; let me follow up on the connected PRs/work to avoid duplicating changes, and then I will continue this PR. I think a separate module to store all the tensor-library-specific hacks should help here.

Unfortunately, I don't have access to GPU, but I will perform CPU benchmark comparison across libraries.

@ogrisel (Member) commented May 5, 2023

About np.flat: maybe you can implement a private _flat helper, as done for _nanmin in #26243.

@thomasjpfan (Member) left a comment

Thank you for the PR @mtsokol !

I recommend testing with numpy.array_api to catch places where the Array API specification is not being followed. Here is an example:

```python
@skip_if_array_api_compat_not_configured
@pytest.mark.parametrize("array_namespace", ["numpy.array_api", "cupy.array_api"])
def test_lda_array_api(array_namespace):
    """Check that the array_api Array gives the same results as ndarrays."""
    xp = pytest.importorskip(array_namespace)
    ...
```
(review comment on sklearn/decomposition/_base.py, resolved)
@betatim (Member) commented May 8, 2023

I created #26348 to discuss the topic of common tests for Array API support; hopefully we can converge on something and make a separate PR for this topic.

@ogrisel (Member) commented Jun 14, 2023

@mtsokol #26372 was merged in main. Feel free to update this PR accordingly: in particular there is a new estimator tag to add to the estimator to have the new common tests run on it.

Also, let us know if you want someone else to take over the PR to address all the comments in the discussion above, if you don't have the time to do it yourself.

@ogrisel (Member) commented Jun 16, 2023

For reference, I commented on an older thread about the randomized svd case here: #26315 (comment)

@mtsokol (Contributor, Author) commented Jun 16, 2023

@mtsokol #26372 was merged in main. Feel free to update this PR accordingly: in particular there is a new estimator tag to add to the estimator to have the new common tests run on it.

Also let us know if you want someone else to takeover the PR to address all the comments in the above discussion if you don't have the time to do it yourself.

Hi @ogrisel,
I updated the PR, so array_api compatibility is now tested in test_check_estimator_clones instead of a dedicated PCA test. At this point only the randomized solver still needs to be adapted to array_api (for example, sklearn/utils/extmath.py line 808 uses advanced integer-list indexing, which is not allowed by the Array API standard).
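For illustration, the kind of rewrite involved (a NumPy-only sketch; xp here just stands in for the namespace returned by get_namespace):

```python
import numpy as np

xp = np  # stand-in for the Array API namespace
A = np.arange(12, dtype=np.float64).reshape(3, 4)
rows = [2, 0]

# Advanced integer-list indexing: convenient, but not part of the Array API
# standard, so it breaks under strict namespaces like numpy.array_api.
selected = A[rows, :]

# Portable alternative: xp.take along an explicit axis, which the standard
# does specify.
selected_xp = xp.take(A, xp.asarray(rows), axis=0)

assert np.array_equal(selected, selected_xp)
```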

If it's OK, maybe someone can take this PR from here? I'm sorry for not communicating that earlier.

@ogrisel (Member) commented Jun 17, 2023

I tried to push a fix in the previous commit but apparently it's wrong (it broke many other tests). I will investigate later.

EDIT: should now be fixed in c84e4ef.

(review comment on sklearn/utils/extmath.py, resolved)
@ogrisel (Member) commented Jul 12, 2023

@thomasjpfan actually I still need to protect against missing device attributes even when is_array_api_compliant is true, because of the xp.__namespace__ == "numpy" case (with the NumPy wrapper).

I realized that we did not have tests for this case. So I enabled this case in e30bfa8 (and updated the code accordingly). I now have all 92 tests pass (for pytest -k array_api sklearn) on a cuda machine with all the soft dependencies.

@ogrisel (Member) commented Jul 12, 2023

Note that array_api_compat provides a device helper function that returns "cpu" for numpy arrays and array.device otherwise. We could do something similar instead of using getattr(array, "device", None) if you prefer.

@betatim (Member) commented Jul 12, 2023

Note that array_api_compat provides a device helper function that returns "cpu" for numpy arrays and array.device otherwise. We could do something similar instead of using getattr(array, "device", None) if you prefer.

Can't we use device from array_api_compat?

@ogrisel (Member) commented Jul 12, 2023

Can't we use device from array_api_compat?

We cannot import from array_api_compat at the top of the file (because we need to protect the import behind sklearn.get_config("array_api_dispatch"), or equivalently is_array_api_compliant). Hence it would be too ugly/verbose to use array_api_compat.device with a protected lazy import each time.

@thomasjpfan (Member) left a comment

I'm trying to push for us not to need to_device and to use xp.asarray for device transfer instead. The rest of the PR looks good to go.

I ran the array_api test locally with all the dependencies and everything passes.

(review comment on sklearn/utils/extmath.py, resolved)
@thomasjpfan (Member) left a comment

LGTM, thank you everyone for working on this!

@thomasjpfan thomasjpfan merged commit 702316c into scikit-learn:main Jul 13, 2023
@mtsokol mtsokol deleted the feature/array_api_compat_pca branch July 14, 2023 08:18
@betatim (Member) commented Jul 14, 2023

Whoop whoop! Nice work everyone on yet another PR that turns out to involve a lot more work than you originally thought :D

punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Tim Head <betatim@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>