ENH: linalg: implement higher-order SVD (HOSVD) for M-D tensors by JanLuca · Pull Request #14284 · scipy/scipy

ENH: linalg: implement higher-order SVD (HOSVD) for M-D tensors #14284


Open · wants to merge 14 commits into main from the hosvd branch

Conversation

@JanLuca (Author) commented Jun 23, 2021

Reference issue

Since pull request #12592 has been open for quite some time due to license questions, I took the time to reimplement the HOSVD algorithm myself so that it can be included in scipy.

What does this implement/fix?

The higher-order SVD is implemented based on the publication by Lieven De Lathauwer et al. [1].

[1] https://doi.org/10.1137/S0895479896305696
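For concreteness, a minimal sketch of the classical HOSVD recipe from [1] (illustrative only; the function name is made up and this is not the code added by the PR): take the SVD of each mode-k unfolding of the tensor to obtain the factor matrices, then contract their conjugate transposes with the tensor to obtain the core.

import numpy as np

def hosvd_sketch(A):
    # For each mode k: unfold A along that mode, keep the left singular
    # vectors as the factor matrix U_k, and contract U_k^H into the core.
    U, S = [], A
    for k in range(A.ndim):
        unfolded = np.moveaxis(A, k, 0).reshape(A.shape[k], -1)
        U_k = np.linalg.svd(unfolded, full_matrices=False)[0]
        U.append(U_k)
        S = np.moveaxis(np.tensordot(U_k.conj().T, S, axes=(1, k)), 0, k)
    return U, S

A = np.random.default_rng(0).standard_normal((3, 4, 5))
U, S = hosvd_sketch(A)
B = S
for k, U_k in enumerate(U):
    # Multiplying the core by each factor matrix reconstructs the tensor.
    B = np.moveaxis(np.tensordot(U_k, B, axes=(1, k)), 0, k)
print(np.allclose(A, B))  # True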

@JanLuca JanLuca requested review from ilayn and larsoner as code owners June 23, 2021 13:49
@JanLuca JanLuca marked this pull request as draft June 23, 2021 14:08
@JanLuca JanLuca force-pushed the hosvd branch 2 times, most recently from 0145b15 to f6f1b10 on June 23, 2021 15:31
@JanLuca JanLuca marked this pull request as ready for review June 23, 2021 15:32
@JanLuca JanLuca changed the title from "WIP: Implement higher-order SVD (HOSVD) for M-D tensors" to "Implement higher-order SVD (HOSVD) for M-D tensors" Jun 23, 2021
@tupui (Member) left a comment

Would be a great addition, thanks!

I wonder if it would not be better to just have it under SVD itself so that we just have one function. We would just have to check the dimension and redirect to the corresponding private method.

@JanLuca (Author) commented Jun 23, 2021

The problem is that numpy has already defined, for its svd method, what happens if the argument has dimension > 2, cf. [1]. I would not suggest implementing a different behavior in scipy, to avoid confusion.

[1] https://numpy.org/doc/stable/reference/generated/numpy.linalg.svd.html
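To make the point concrete (my own illustration of the behavior documented in [1]): for inputs with more than two dimensions, numpy's svd broadcasts over the leading axes and decomposes each trailing 2-D slice independently; it does not compute any tensor decomposition.

import numpy as np

A = np.random.default_rng(0).standard_normal((3, 4, 5))
U, s, Vh = np.linalg.svd(A)   # one ordinary SVD per 4x5 slice
print(U.shape, s.shape, Vh.shape)  # (3, 4, 4) (3, 4) (3, 5, 5)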

@tupui (Member) commented Jun 23, 2021

I see, I will let the other SVD residents decide. Thanks for the info.

@tupui tupui added the scipy.linalg and enhancement labels Jun 23, 2021
@rkern (Member) commented Jun 23, 2021

+1 for a separate hosvd() function.

@tupui (Member) left a comment

I did a pass on the structure; it's a good start. I will let the others comment on the maths, as I am not a specialist on the subject and not really a user either.

@tupui (Member) commented Jun 23, 2021

Also optional but nice to have, consider adding some typing. We slowly want to add these across the library.

@ilayn (Member) commented Jun 24, 2021

I would suggest a name like tensor_svd or something. It's too close to Hankel SVs and also difficult to relate to the abbreviation SVD.

@ilayn (Member) commented Jun 24, 2021

After another coffee, to be perfectly honest, the tensor word starts to sound like jargon after a while; since NumPy takes pride in calling everything "ndarray", I'm not even sure I believe in mentioning tensors here. ML people use it to sound fancy but they really mean nDarrays (I'm looking at you, tensorflow).

Before anyone starts complaining that (a tensor) != (a matrix), I'd like to remind everyone that (a matrix) != (2d array of numbers) either, if we are pedantic at that level, before we go down that rabbit hole. Not to mention that this is strictly a numerical context.

@tupui (Member) commented Jun 24, 2021

> After another coffee, to be perfectly honest, the tensor word starts to sound like jargon after a while; since NumPy takes pride in calling everything "ndarray", I'm not even sure I believe in mentioning tensors here. ML people use it to sound fancy but they really mean nDarrays (I'm looking at you, tensorflow).

> Before anyone starts complaining that (a tensor) != (a matrix), I'd like to remind everyone that (a matrix) != (2d array of numbers) either, if we are pedantic at that level, before we go down that rabbit hole. Not to mention that this is strictly a numerical context.

That's why I suggested not talking about these at all and using the current svd (we could have a param to trigger the higher-order behavior).
Otherwise, what about spelling it out: high_order_svd? Because I am not a fan of mangling things like hosvd. Also, who knows what this shortening could mean in the future 😉

@rkern (Member) commented Jun 24, 2021

This operation actually does treat the input as a proper tensor, I believe.

@JanLuca (Author) commented Jun 25, 2021

> Also optional but nice to have, consider adding some typing. We slowly want to add these across the library.

That is nice, I really like the idea of having a typed library at some point :)

About the function name: I have no strong opinion about this. Personally I would go with something like higher_order_svd.

@ilayn (Member) commented Jun 25, 2021

> Also optional but nice to have, consider adding some typing. We slowly want to add these across the library.

Do we actually? This is something I really want to be discussed extensively. So far it has been coming from the NumPy side and is a bit one-sided. I for one don't see the need for the extra complication. In fact, for many users, not bothering with types is the reason why we started using Python and why it has gained quite some traction so far.

@ilayn (Member) commented Jun 25, 2021

For the naming, I think the ship has sailed on doing anything about tensor usage, so tensor_svd would probably be the most recognizable thing, following suit with tensordot.

I don't have any objection to either with or without an underscore, to be honest.

@JanLuca JanLuca force-pushed the hosvd branch 3 times, most recently from dd6f8c7 to 11ef523 on June 25, 2021 13:25
@JanLuca (Author) commented Jun 25, 2021

Except for the discussion on the function name, I think I implemented all suggestions and would be fine from my side for another review :)

@tupui (Member) commented Jun 27, 2021

> Also optional but nice to have, consider adding some typing. We slowly want to add these across the library.

> Do we actually? This is something I really want to be discussed extensively. So far it has been coming from the NumPy side and is a bit one-sided. I for one don't see the need for the extra complication. In fact, for many users, not bothering with types is the reason why we started using Python and why it has gained quite some traction so far.

@ilayn Yes we do, cf. the minutes of the last two community meetings and the several recent PRs adding it. Also, we have introduced mypy in the CI.

The whole Python community is actually adding types all around (cf. all the other big libraries, the latest polls from SO and PyCharm, and talks at PyCon). I agree that we should not complicate the story for the end users. But here, we add a bit of complexity on our side for a lot of convenience when a final user is using SciPy in a modern IDE. In PyCharm it's night and day. So to me, it's really worth the small trouble. I mean, we ask for extensive validation of the inputs; adding types would be the easiest part of the whole job.

We should have soon some guidelines (cc @BvB93) on how to properly add types. There are already common bricks in scipy.lib._util.

@ilayn (Member) commented Jun 27, 2021

I don't agree. The Python community is definitely not going in the typing direction. There is huge resistance from regular users. The IDE argument isn't a strong one, as most of the scientific audience doesn't use an IDE. To be perfectly honest, to me this is pretty much a few core users advocating and adding it without any discussion, or none that I am aware of. Other than complicating the signatures beyond recognition, I have absolutely no use case for types in Python, and mind you, I am working with professionals. For most of them, the absence of types is what makes them come over to Python from other languages. The type annotations are at most wishful arguments for somewhat sophisticated functions and callbacks, and primarily geared towards IDE users who want a better coding experience. If you folks want to go ahead without discussion, fine, but at least say so, so that we don't get surprised by all the typing.

Adding mypy is one thing; converting all function signatures to ones with type annotations is another. I don't need to go David Beazley all the way, but if you want typed-language benefits you should code in one.

@mdhaber (Contributor) left a comment

@JanLuca When adding tests I found that full_tensor did not typically (i.e. almost never) have an effect. Was the intent for full_tensor=False to implement the compact HOSVD? If so, the full_matrices argument of svd does not do what you wanted it to, I think.

Assuming so, I've replaced argument full_tensor=True with compact=False. I've tried to adjust the documentation, implementation, and tests accordingly; however, I admit that I have been doing this according to linear algebra intuition for how I think it should work without very careful research. For instance, I'm assuming the idea of multilinear rank corresponds with the number of non-zero singular values you get in each of the SVDs. Can you check the changes?
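For context, an illustration of the distinction (mine, not from the PR): full_matrices only controls the shapes of the returned factors; it never drops the singular vectors belonging to numerically zero singular values, which is what a compact decomposition would do.

import numpy as np
from scipy.linalg import svd

A = np.outer([1., 2., 3., 4.], [1., 0., 2.])     # a rank-1 4x3 matrix
U_full, s, _ = svd(A, full_matrices=True)
U_econ, _, _ = svd(A, full_matrices=False)
print(U_full.shape, U_econ.shape)   # (4, 4) (4, 3)
print(np.sum(s > 1e-12 * s[0]))     # 1 -- a compact SVD would keep one column of U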

scipy/linalg/_decomp_svd.py (Outdated)
Comment on lines 386 to 387
# todo: replace this with appropriate tolerance for vanishing singular values
tol = 1e-12
Contributor:
@ilayn Can you remind me of the preferred recipe for determining whether a singular value is zero?

@JanLuca (Author) commented Dec 2, 2024
In my numerical experience, vanishing singular values are typically in the area of 1e-15 times the largest singular value. But I am not sure whether the SVD methods have a more deterministic way to determine that.

Contributor:
Yeah, there is precedent somewhere that I wanted to follow.
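For reference, one widely used precedent is the default threshold of numpy.linalg.matrix_rank (a sketch; not necessarily the recipe this PR ends up using):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4)) @ rng.standard_normal((4, 8))   # numerical rank 4
s = np.linalg.svd(A, compute_uv=False)
# Treat singular values below max(shape) * eps * largest singular value as zero.
tol = max(A.shape) * np.finfo(A.dtype).eps * s[0]
print(np.sum(s > tol))  # 4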

with pytest.raises(ValueError, match=message):
    higher_order_svd(A, check_finite=True)

@pytest.mark.parametrize('dtype', DTYPES)
Contributor:
Oops, meant to actually use dtype here.

@ilayn (Member) commented Aug 28, 2024

There is quite a bit of discussion above, so please don't just go ahead and merge all open PRs. This one is actually an nDarray version of SVD, so it is not nearly ready to be included in linalg.

@mdhaber (Contributor) commented Aug 28, 2024

> so please don't just go ahead and merge all open PRs

? All I've done so far is to review it, improve the tests, and start correcting an apparent bug. I pinged you because I thought it needed your input before merging. It did look like it was close given the maintainer approvals, lack of unresolved issues, and no other problems mentioned in the past few years, but I found some new things.

> There is quite a bit of discussion above

Yup, read it. It looked to me like the bulk of the comments are about whether we should add typing information in SciPy. That probably shouldn't hold up this PR, since it's a much bigger discussion that didn't really fit here. There was also a conversation about the function name, but that appears to have settled down. An email went out to the mailing list, you and Robert seemed to support having such a function, and Pamphile actually approved the PR.

It seemed to be the typing controversy, rather than a technical or interface problem related to the PR, that stopped this from merging already. Tyler has been moving the milestone for the past three years, so I thought it would be worth working on.

> This one is actually an nDarray version of SVD, so it is not nearly ready to be included in linalg.

No, it's not ready to merge, but I don't understand how being an nDarray version of SVD necessarily leads to the conclusion that it's not ready. Doesn't it at least partly depend on the quality of the PR rather than being inherent to the subject of the PR? Or do you mean that something about the subject makes the standards much higher than usual?

I don't see any remaining technical or interface reservations above besides the ones I had. If there is a new problem, please explain.

@mdhaber (Contributor) commented Sep 18, 2024

@JanLuca can you reply to #14284 (review) and review the changes? Thanks!

@lucascolley (Member) commented:

@JanLuca are you interested in returning to this PR? Branching for the next release is planned for December 6th.

@JanLuca (Author) commented Dec 2, 2024

@lucascolley Thanks for the ping. I am sorry that I missed the work done since my last comment and forgot to get back to this myself.

@mdhaber Thanks for the wonderful improvement of the PR. Looks very good to me and I would be fine with merging it!

@mdhaber (Contributor) commented Dec 2, 2024

Thanks @JanLuca. Can you reply to the first part of #14284 (review) about full_tensor/full_matrices? I don't think that works as expected.

@JanLuca (Author) commented Dec 2, 2024

> Thanks @JanLuca. Can you reply to the first part of #14284 (review) about full_tensor/full_matrices? I don't think that works as expected.

You are right about that. I intended to implement something like the compact SVD, but the full_tensor argument does not provide what I intended here. The solution to support something like a tolerance check is a good idea!

@mdhaber (Contributor) commented Dec 2, 2024

> The solution to support something like a tolerance check is a good idea!

Thanks @JanLuca. We just need to figure out the details here, though.

  1. Assuming tol is replaced by an appropriate calculation (which probably depends on the spectrum and the floating point precision), is the math in the implementation correct for the "compact" HOSVD?
  2. Are there any suggestions in a HOSVD paper or other HOSVD implementation for the tolerance threshold?
  3. If not:
  • Would this calculation be appropriate here?
  • Should we expose a relative tolerance (similar to rcond) as an optional parameter?
  • Or instead of a boolean compact, should we require the user to provide a tolerance if they want a compact HOSVD, and this tolerance would be used to make the HOSVD compact? (The default could be 0 or even a negative number, so no singular values would fall below the threshold, and therefore the HOSVD would not be compact by default.)

@JanLuca (Author) commented Dec 4, 2024

I wrote up my math argument for why the compact version is valid the way you implemented it:
hosvd.pdf

I have to recheck the HOSVD paper, will do so later.

But I would prefer your idea that the user has to supply a (relative) tolerance to cut off the singular values. I can easily imagine different settings where different thresholds are needed.

On a different note: I just realized that it may be better in many cases to construct $A A^\dagger$ and use the eigh routine to calculate the singular values and vectors. This would avoid the overhead of calculating the right singular vectors at the cost of an additional matrix multiplication. I will run a benchmark later to see whether this reduces the computational cost.
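A sketch of that idea (mine, assuming a short-and-wide unfolding so that $A A^\dagger$ is small): the left singular vectors of A are the eigenvectors of A @ A^H, and the singular values are the square roots of its eigenvalues. One caveat is that forming the product squares the condition number, so tiny singular values come out less accurately than from a direct SVD.

import numpy as np
from scipy.linalg import eigh, svd

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 50))            # a typical short-and-wide unfolding
w, V = eigh(A @ A.conj().T)                 # eigenvalues in ascending order
U_eig = V[:, ::-1]                          # reorder to descending, as in svd
s_eig = np.sqrt(np.clip(w[::-1], 0, None))
U_svd, s_svd, _ = svd(A, full_matrices=False)
print(np.allclose(s_eig, s_svd))            # True
print(np.allclose(np.abs(U_eig.conj().T @ U_svd), np.eye(4)))  # columns match up to sign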

@JanLuca (Author) commented Dec 6, 2024

Ok, I checked the paper and did not find any indication for the tolerance calculation. So I would still vote for a user-supplied one.

About the eigh idea: in my (surely not statistically significant) benchmark, it was advantageous to use eigh instead of svd as the method to calculate the vectors and values. Should I adapt the code, or do you want to do it, @mdhaber?

@mdhaber (Contributor) commented Dec 6, 2024

> Should I adapt the code, or do you want to do it, @mdhaber?

I'd suggest getting this in as-is (with a tolerance to define compact SVD) and submitting a follow-up PR with any improvements. I think I can add that shortly. @lucascolley were you interested in reviewing, or just checking in?

@mdhaber (Contributor) commented Dec 6, 2024

Implement compact_rtol. @JanLuca is that what you had in mind?
Perhaps just rtol would be fine? I would prefer that, but I'll let the reviewer decide.
@jorenham If types are going to be added by scipy-stubs anyway, can we remove all the typing stuff from this PR to avoid the mypy complaints?

@jorenham (Member) commented Dec 6, 2024

> If types are going to be added by scipy-stubs anyway, can we remove all the typing stuff from this PR to avoid the mypy complaints?

Yea that works.
But you could also solve these issues by passing the missing type-arguments to np.ndarray for the shape and dtype, e.g. as np.ndarray[tuple[int, int], np.dtype[np.float64]] for a 2-d float64 array.
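For illustration, a hypothetical signature annotated in that style (the alias and function are placeholders, not part of the PR):

import numpy as np

# Shape- and dtype-parametrized ndarray annotation; np.ndarray is
# subscriptable at runtime since numpy 1.22.
Matrix = np.ndarray[tuple[int, int], np.dtype[np.float64]]

def mode_factor(u: Matrix) -> Matrix:
    # Placeholder body, for illustration only.
    return u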

@lucascolley (Member) left a comment

I haven't interrogated the maths closely, but looks good to me!

@@ -299,6 +303,101 @@ def diagsvd(s, M, N):
raise ValueError("Length of s must be M or N.")


def higher_order_svd(
a: npt.ArrayLike, *, compact_rtol: float | None = None, check_finite: bool = True
Member:
Nit: perhaps more consistent with the referenced paper to make this an upper case A?

Suggested change
a: npt.ArrayLike, *, compact_rtol: float | None = None, check_finite: bool = True
A: npt.ArrayLike, *, compact_rtol: float | None = None, check_finite: bool = True

Member:

the docs should then also be updated accordingly

Author:

I used the same style as in the normal svd routine to keep it systematic.

Parameters
----------
a : array_like
    Tensor to decompose
compact_rtol : float, optional
Member:

I'm happy to stick with compact_rtol

Author:

Fine for me, too! :)

Contributor:

As a compromise with #14284 (comment), what do you think of renaming this to just rtol? Note that pinv and pinvh use atol and rtol for the threshold for significance of singular values, so maybe it is better to just roll with it rather than trying to be more specific. That said, orth and null_space, which are also closely related, use rcond. Perhaps we just call this rtol now, and after this is merged, open an issue about standardizing the name? We can use a decorator that will make it easy to change the name of rcond in a backward compatible way, if that's what we decide.

#14284 (comment) also gives the answer to my question from August. So one possibility is that we leave rtol required for now, then open an enhancement issue about adding a compact or similar argument: when compact=True, rtol would no longer be required, and the default could be computed according to the suggestion. We could add atol at that time, too. (The other option is doing both of these things now.)

With that in mind, new thoughts?

@ilayn (Member) commented Dec 7, 2024

Compact typically refers to the incomplete basis for a rectangular SVD, but here you are truncating, so it probably needs a different name.

We should also use the standard atol/rtol keywords since we know the largest singular value. Then the default can be selected as 100*max(a.shape)*eps for rtol and 0.0 for atol, which will multiply the largest singular value.

Other than that it looks ok to me, but I haven't had a chance to look at it thoroughly yet due to travels. But you seem to be on top of it, so I don't have any other remarks.
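A small sketch of the thresholding described above (my reading of the suggestion; the names are illustrative): a mode-k singular value is kept if it exceeds max(atol, rtol * s[0]), with defaults rtol = 100 * max(a.shape) * eps and atol = 0.0.

import numpy as np

a_shape = (3, 4, 5)                      # shape of a hypothetical input tensor
s = np.array([2.5, 1.0e-3, 3.0e-15])     # singular values of one unfolding
rtol = 100 * max(a_shape) * np.finfo(np.float64).eps
atol = 0.0
keep = s > max(atol, rtol * s[0])
print(keep)  # [ True  True False]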

@tylerjereddy (Contributor) commented:

Do folks mind if we merge this right after branching instead of before? There's definitely some momentum with recent changes and checking the primary literature here, but I think I'd slightly prefer to merge it right after we branch.

If you're really confident enough to merge you can restore the original milestone. Branching is imminent though, a few hours away I think.

@tylerjereddy tylerjereddy modified the milestones: 1.15.0, 1.16.0 Dec 7, 2024
@mdhaber (Contributor) commented Dec 7, 2024

After branch is fine with me. I won't be merging it; I've written too much. I am willing to make remaining changes if we can agree on the "compact" behavior.


Examples
--------
>>> import numpy as np


@tupui made a comment about not needing to import numpy as np in examples. The comment is marked as outdated and resolved, but looks like this line is still an issue?

Member:

All examples now require this import

"""
a = _asarray_validated(a, check_finite=check_finite)
if compact_rtol is not None and compact_rtol < 0:
    raise ValueError("`compact_rtol` must be a positive floating point number.")


The error says compact_rtol must be positive, but the if condition only checks that it is non-negative
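One way to make the check and the message agree, assuming zero should stay allowed (a sketch with a hypothetical helper name, not a committed change):

def _validate_compact_rtol(compact_rtol):
    # Zero is allowed; only negative values are rejected, matching the message.
    if compact_rtol is not None and compact_rtol < 0:
        raise ValueError("`compact_rtol` must be a non-negative "
                         "floating point number.")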

@ev-br (Member) left a comment

Re-reading the comments, my impression is that

  • there's a general agreement that this is a good addition
  • there's an RM request to merge this just after the release split not just before it (the request was made before the previous release split but I suppose the spirit of the request fast-forwards)
  • there is a request to rename compact_rtol to just rtol (#14284 (comment)) and a suggestion for a path forward after the rename (#14284 (comment))

Does this sound like the right summary? If so, I suggest that we

  • rebase to solve conflicts
  • optionally address a couple of small suggestions below
  • press the green button right after scipy 1.16 branches off main
  • iterate on suggested improvements on compact=True?

"""Higher-order SVD (HOSVD)

Factorizes the M-D tensor `a` into a list of M matrices ``U_k`` containing
the left singular vectors of the unfolded tensor with respect to the
@ev-br (Member) commented May 16, 2025

Have to admit I don't know off the cuff what an unfolded tensor is.

Googling around gives https://jeankossaifi.com/blog/unfolding.html (which references "Tensor Decompositions and Applications" in SIAM REVIEW, 2009) and https://docs.pytorch.org/docs/stable/generated/torch.nn.Unfold.html

Are these the same as here? If so, it'd be best to add a reference and give a definition, if possible. My first knee-jerk reaction was: can we have a math-y definition in Einstein tensor notation? If that's difficult, can we please have some explanation, a code example, and a link to the literature.
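For what it's worth, a small illustration of a mode-k unfolding (matricization) in the sense of those references, with the caveat that different papers order the columns differently:

import numpy as np

A = np.arange(24).reshape(2, 3, 4)
k = 1
# Move axis k to the front and flatten the remaining axes into columns.
unfolded = np.moveaxis(A, k, 0).reshape(A.shape[k], -1)
print(unfolded.shape)  # (3, 8): the mode-1 unfolding of a 2x3x4 tensor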

"A Multilinear Singular Value Decomposition"
SIAM Journal on Matrix Analysis and Applications,
21 (4), pp. 1253-1278. ISSN 1095-7162
https://doi.org/10.1137/S0895479896305696
Member:

Minor, trivial:

Suggested change
https://doi.org/10.1137/S0895479896305696
:doi:`10.1137/S0895479896305696`

LinAlgError
    If SVD computation does not converge.

.. versionadded:: 1.15.0
Member:

Suggested change
.. versionadded:: 1.15.0
.. versionadded:: 1.17.0

@ev-br ev-br modified the milestones: 1.16.0, 1.17.0 May 16, 2025
Labels: enhancement, scipy.linalg