-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
ENH: stats: Add matrix variate t distribution #22925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… that frozen instance checks for singularity of spread matrices.
- Completed `test_bad_input` - Stubs of other tests
- Adds unit tests for bad input and default inputs and covariance expansion to matrix variate t test class. - Fixes bugs in matrix variate t distribution which were exposed by those tests.
- Adds unit tests for array input to matrix variate t test class. - Fixes bugs in matrix variate t distribution which were exposed by those tests. _logpdf did not expect to receive an array input with > 3 dimensions and its Einstein sum is defined based on that expectation. So, if there are > 3 dimensions then _logpdf will flatten out the extra ones, perform the calculation, then reshape before returning.
Covers matrix normal, matrix t, and inverse Wishart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you made a solid effort to follow our usual practices. I added a few surface-level comments ahead of the review by stats
regulars/domain experts.
…t and remove leftover debugging lines in `_multivariate.py`
…ix_t_gen._process_parameters`
Is this PR still active? Some tests show as failed, but I’m unsure how to resolve them on my end. I’m happy to make any necessary changes to fix them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the lint failures shown at https://github.com/scipy/scipy/actions/runs/14981775469/job/42087379243?pr=22925 are an easy fix—they are just due to some lines being too long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very promising, thank you @johnmdusel . Could you add references to the paper in the docstring?
About tests: how can we evaluate that this is indeed the matrix t distribution? Can we test the PDF method against some external reference? Or is there a special case where this reduces to a distribution we already have implemented?
Yes, I will add the reference to Gupta and Nagar.
When I was developing this class, I tested the |
Motivated by their role in the `rvs` method
…terface consistency Post `0dceecb` they need to be `row_spread=1` and `col_spread=1` instead of `None`. This is how they're defined in `matrix_normal_gen.__call__`
I'm limited to using Mathematica version "14.2.1 for Linux x86 (64-bit) (March 16, 2025)". When I execute that code, I don't get the sample sample or pdf values. |
Hm, unfortunate. Can we do a minimal test with a fixed set of matrices to test the pdf values if we cannot rely on the random variates? I do not expect differences in the first few digits between different mathematica versions. @johnmdusel @mdhaber |
@dschmitz89 I could set up a minimal Julia script to generate random variates and compute PDF values. Would that be alright, as an alternative to Mathematica? I would provide a Dockerfile and a |
Oops that was a misclick. I'll see if I can just verify the PDfF values @johnmdusel generated. |
@dschmitz89 @mdhaber I added a unit test that checks PDF values against Julia. The source in the test's docstring and this small repo. All tests pass locally. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with this PR and the pdf values were also confirmed against an external source (Mathematica). There is one last stylistic nit regarding tests: we typically use relative tolerances, not absolute ones. Will wait for other reviewers before merging.
@rgommers The tests against Julia and Mathematica are now based on a handful of reference values, instead of a file. |
@dschmitz89 -- questions about the failing CI checks above
|
Let's go with the 5% tolerance for simplicity. Sry about the churn.
That one is already fixed in main. |
No problem! I made the change. |
@rgommers would you be able to approve if no further changes are required? On the other hand, I'm happy to make further changes as needed. Thank you very much! |
Reference issue
Closes issue #22870.
What does this implement/fix?
Background explanation in this post on the developer forum
Additional information
From the PR checklist:
This code can be distributed under a BSD license. I followed the existing random matrix implementation for data validation and wrote the methods from scratch.
I provided unit tests in
stats.tests.test_multivariate.TestMatrixT
. These are mostly simiar to the ones formatrix_normal
, except that the moment checking and sample checking also perform a spot-check of two basic theorems from Gupta and Nagar (2000). All unit tests pass locally.Re: style: I used
black
to format my additions tostats._multivariate
. For unit tests and benchmarking I followed the style of existing code.I provided benchmarks in
stats.MatrixSampling
. This also includes benchmarks forrvs
methods ofmatrix_normal
andinvwishart
.All public functions have docstrings with examples, and I've verified that the documentation builds locally and renders correctly.