-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
ENH: linalg: implement higher-order SVD (HOSVD) for M-D tensors #14284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
0145b15
to
f6f1b10
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be a great addition, thanks!
I wonder if it would not be better to just have it under SVD
itself so that we just have one function. We would just have to check the dimension and redirect to the corresponding private method.
The problem is that numpy has already defined for its svd method what happens if the argument has dimension > 2, cf. [1]. I would not suggest to implement a different behavior in scipy to avoid confusion. [1] https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html |
I see, I will let the other SVD residents decide. Thanks for the info. |
+1 for a separate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a pass on the structure, it's a good start. I will let the other comment on the maths as I am not a specialist on the subject and not really a user either.
Also optional but nice to have, consider adding some typing. We slowly want to add these across the library. |
I would suggest a name like tensor_svd or something. It's too close to hankel sv's and also difficult to relate to the abbreviation SVD. |
After another coffee, to be perfectly honest, tensor word starts to sound like jargon after a while since NumPy takes pride in calling everything "ndarray', I'm not even sure I believe in mentioning tensors here. ML people use it to sound fancy but they really mean nDarrays (I'm looking at you tensorflow). Before anyone starts complaining that (a tensor) != (a matrix), I'd like to remind that also (a matrix) != (2d array of numbers) either if we are pedantic at that level before we go down that rabbit hole. Not to mention that this is strictly a numerical context. |
That's why I suggested to not talk about these at all and use the current |
This operation actually does treat the input as a proper tensor, I believe. |
That is nice, I really like the idea to have a typed library at some point :) About the function name: I have no strong opinion about this. Personally I would go with something like |
Do we actually? This is something I really want to be discussed extensively. So far it has been coming from NumPy side and a bit one-sided. I for one don't see the need for the extra complication. In fact for many users, not bothering with types is the reason why we started using Python and why so far it gained quite some traction. |
For the naming, I think the ship has sailed to do anything about tensor usage so probably I don't have any objection to with or without underscore to be honest. |
dd6f8c7
to
11ef523
Compare
Except for the discussion on the function name, I think I implemented all suggestions and would be fine from my side for another review :) |
@ilayn Yes we do, cf the last 2 community meetings minutes and the several recent PRs adding it. Also we have introduced mypy in the CI. The whole Python community is actually adding types all around (cf all the other big libraries and the last pools from SO, PyCharm and talks at PyCon). I agree that we should not complicate the story for the end users. But here, we add a bit of complexity on our side for so much convenience when a final user is using SciPy in a modern IDE. In PyCharm it's day and night. So to me, it's really worth the small trouble. I mean, we ask extensive validation on the inputs, adding types would be the easiest part of the whole job. We should have soon some guidelines (cc @BvB93) on how to properly add types. There are already common bricks in |
I don't agree. The python community is definitely not going towards the typing direction. There is a huge resistance from the regular users. The IDE argument doesn't make a strong argument as most scientific audience doesn't use an IDE. To be perfectly honest, to me this is pretty much a few core users advocating and adding without any discussion or nothing I am aware of. Other than complicating the signatures beyond recognition I have absolutely no use case for types in Python and mind you I am working with professionals. For most of them, absence of types is what makes them come over to Python from other languages. The type annotations are at most wishful arguments for a bit sophisticated functions and callbacks and primarily geared towards IDE users who wants better coding experience. If you folks want to go ahead without discussion fine but at least say it so, such that we don't get surprised by all the typing. Adding mypy is one thing, converting all function signatures to ones with type annotations is another. I don't need to go David Baezley all the way, but if you want typed language benefits you should code in one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JanLuca When adding tests I found that full_tensor
did not typically (i.e. almost never) have an effect. Was the intent for full_tensor=False
to implement the compact HOSVD? If so, the full_matrices
argument of svd
does not do what you wanted it to, I think.
Assuming so, I've replaced argument full_tensor=True
with compact=False
. I've tried to adjust the documentation, implementation, and tests accordingly; however, I admit that I have been doing this according to linear algebra intuition for how I think it should work without very careful research. For instance, I'm assuming the idea of multilinear rank corresponds with the number of non-zero singular values you get in each of the SVDs. Can you check the changes?
scipy/linalg/_decomp_svd.py
<
A3E2
span title="Label: Outdated" data-view-component="true" class="Label Label--warning">
Outdated
# todo: replace this with appropriate tolerance for vanishing singular values | ||
tol = 1e-12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ilayn Can you remind me of the preferred recipe for determining whether a singular value is zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of numerical experience vanishing singular values are typical in the area of 1e-15 times the largest singular value. But I am not sure if the SVD methods have a more deterministic way to determine that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there is precedent somewhere that I wanted to follow.
with pytest.raises(ValueError, match=message): | ||
higher_order_svd(A, check_finite=True) | ||
|
||
@pytest.mark.parametrize('dtype', DTYPES) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, meant to actually use dtype here.
There is quite a bit of discussion above so please don't just go ahead and merge all open PRs. This one is actually nDarray version of SVD so it is not nearly ready to be included in linalg. |
? All I've done so far is to review it, improve the tests, and start correcting an apparent bug. I pinged you because I thought it needed your input before merging. It did look like it was close given the maintainer approvals, lack of unresolved issues, and no other problems mentioned in the past few years, but I found some new things.
Yup, read it. It looked to me like the bulk of the comments are about whether we should add typing information in SciPy. That probably shouldn't hold up this PR, since it's a much bigger discussion that didn't really fit here. There was also a conversation about the function name, but that appears to have settled down. An email went out to the mailing list, you and Robert seemed to support having such a function, and Pamphile actually approved the PR. It seemed to be the typing controversy, rather than a technical or interface problem related to the PR, that stopped this from merging already. Tyler has been moving the milestone for the past three years, so I thought it would be worth working on.
No, it's not ready to merge, but I don't understand how being an nDarray version of SVD necessarily leads to the conclusion that it's not ready. Doesn't it at least partly depend on the quality of the PR rather than being inherent to the subject of the PR? Or do you mean that something about the subject makes the standards much higher than usual? I don't see any remaining technical or interface reservations above besides the ones I had. If there is a new problem, please explain. |
@JanLuca can you reply to #14284 (review) and review the changes? Thanks! |
@JanLuca are you interested in returning to this PR? Branching for the next release is planned for December 6th. |
@lucascolley Thanks for the ping. I am sorry that I missed the work since my last comment and forgot to get on this myself. @mdhaber Thanks for the wonderful improvement of the PR. Looks very good to me and I would be fine with merging it! |
Thanks @JanLuca. Can you reply to the first part of #14284 (review) about |
You are right about that. I intended to implement something like the compact SVD but in this case, the |
Thanks @JanLuca. We just need to figure out the details here, though.
|
I wrote up my math argument why the compact version is valid the way you implemented it: I have to recheck the HOSVD paper, will do so later. But I would prefer your idea that the user has to supply a (relative) tolerance to cut off the singular values. I can easily imagine different settings where different thresholds are needed. On a different note: I just realized that it may be better in many cases to construct |
Ok, I checked the paper and did not find some indication for the tolerance calculation. So I would still vote for a user supplied one. About the |
I'd suggest getting this in as-is (with a tolerance to define compact SVD) and submitting a follow-up PR with any improvements. I think I can add that shortly. @lucascolley were you interested in reviewing, or just checking in? |
Yea that works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't interrogated the maths closely, but looks good to me!
@@ -299,6 +303,101 @@ def diagsvd(s, M, N): | |||
raise ValueError("Length of s must be M or N.") | |||
|
|||
|
|||
def higher_order_svd( | |||
a: npt.ArrayLike, *, compact_rtol: float | None = None, check_finite: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: perhaps more consistent with the referenced paper to make this an upper case A?
a: npt.ArrayLike, *, compact_rtol: float | None = None, check_finite: bool = True | |
A: npt.ArrayLike, *, compact_rtol: float | None = None, check_finite: bool = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the docs should then also be updated accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the same style as in the normal svd
routine to keep it systematic.
---------- | ||
a : array_like | ||
Tensor to decompose | ||
compact_rtol : float, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to stick with compact_rtol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine for me, too! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a compromise with #14284 (comment), what do you think of renaming this to just rtol
? Note that pinv
and pinvh
use atol
and rtol
for the threshold for significance of singular values, so maybe it is better to just roll with it rather than trying to be more specific. That said, orth
and null_space
, which are also closely related, use rcond
. Perhaps we just call this rtol
now, and after this is merged, open an issue about standardizing the name? We can use a decorator that will make it easy to change the name of rcond
in a backward compatible way, if that's what we decide.
#14284 (comment) also gives the answer to my question from August. So one possibility is that we leave rtol
required for now, then open an enhancement issue about adding a compact
or similar argument: when compact=True
, rtol
would no longer be required, and the default could be computed according to the suggestion. We could add atol
at that time, too. (The other option is doing both of these things now.)
With that in mind, new thoughts?
Compact typically refers to incomplete basis for rectangular SVD but here you are truncating so probably needs a different name. We should also use the standard atol rtol keywords since we know the largest singular value. Then the default can be selected as 100max((a.shape))*eps for rtol and 0.0 for atol that will multiply the largest singular value. Other than that looks ok to me but I couldn't have a chance to look at it throughly yet due to travels. But you seem to be on top of it so I don't have any other remarks. |
Do folks mind if we merge this right after branching instead of before? There's definitely some momentum with recent changes and checking the primary literature here, but I think I'd slightly prefer to merge it right after we branch. If you're really confident enough to merge you can restore the original milestone. Branching is imminent though, a few hours away I think. |
After branch is fine with me. I won't be merging it; I've written too much. I am willing to make remaining changes if we can agree on the "compact" behavior. |
|
||
Examples | ||
-------- | ||
>>> import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All examples now require this import
""" | ||
a = _asarray_validated(a, check_finite=check_finite) | ||
if compact_rtol is not None and compact_rtol < 0: | ||
raise ValueError("`compact_rtol` must be a positive floating point number.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error says compact_rtol
must be positive, but the if condition only checks that it is non-negative
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-reading the comments, my impression is that
- there's a general agreement that this is a good addition
- there's an RM request to merge this just after the release split not just before it (the request was made before the previous release split but I suppose the spirit of the request fast-forwards)
- there is a request to rename
compat_rtol
to justrtol
(#14284 (comment)) and a suggestion for a path forward after the rename (#14284 (comment))
Does this sound like a right summary? If so, I suggest that we
- rebase to solve conflicts
- optionally address a couple of small suggestions below
- press the green button right after scipy 1.16 branches off main
- iterate on suggested improvements on
compact=True
?
"""Higher-order SVD (HOSVD) | ||
|
||
Factorizes the M-D tensor `a` into a list of M matrices ``U_k`` containing | ||
the left singular vectors of the unfolded tensor with respect to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have to admit I don't know off the the cuff what an unfolded tensor is.
Googling around gives https://jeankossaifi.com/blog/unfolding.html (which references "Tensor Decompositions and Applications"in SIAM REVIEW, 2009)
) and https://docs.pytorch.org/docs/stable/generated/torch.nn.Unfold.html
Are these the same as here? If so, it'd be best to add a reference and give a definition, if possible. My first knee-jerk reaction was can we have a math-y definition in the Einstein tensor notation. If that's difficult, can we please have some explanation, a code example, and a link to the literature.
"A Multilinear Singular Value Decomposition" | ||
SIAM Journal on Matrix Analysis and Applications, | ||
21 (4), pp. 1253-1278. ISSN 1095-7162 | ||
https://doi.org/10.1137/S0895479896305696 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, trivial:
https://doi.org/10.1137/S0895479896305696 | |
:doi:`10.1137/S0895479896305696` |
LinAlgError | ||
If SVD computation does not converge. | ||
|
||
.. versionadded:: 1.15.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.. versionadded:: 1.15.0 | |
.. versionadded:: 1.17.0 |
Reference issue
Since the pull request #12592 is open for quite some time due to license questions, I took the time to reimplement the HOSVD algorithm myself so it can be included in scipy.
What does this implement/fix?
Based on the publication by Lieven De Lathauwer et al. [1] the higher-order SVD is implemented.
[1] https://doi.org/10.1137/S0895479896305696