8000 Add routines for updating QR decompositions by ewmoore · Pull Request #4249 · scipy/scipy · GitHub
[go: up one dir, main page]

Skip to content

Add routines for updating QR decompositions #4249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Apr 12, 2015
Merged

Conversation

ewmoore
Copy link
Member
@ewmoore ewmoore commented Dec 9, 2014

This pull request goes on top of #4021, accordingly only the last few commits are relevant here.

Here three new functions are added to perform various modifications of qr decompositions: qr_update, qr_insert and qr_delete for performing rank-k updates, adding row(s) or column(s) and removing row(s) or column(s).

The methods used are essentially those in [1], with some input from [2].

I've endeavored to write this such that updates can be performed in-place as much as possible. This is, for instance, why the rank-1 update and the rank-k update algorithms are slightly different. This allows the rank-1 update to support updating q and r when they are stored in C order. (In this spirit, it might be worth changing linalg.qr to return fortran ordered 'r', since at present it always returns a C ordered r, and a F ordered q. Numpy's qr returns both as C order.)

Edit: both are done (3/18/15)
Some outstanding things:

  1. I suppose I'll need to add a 'check_finite' flag to these.
  2. overwrite_qr should probably default to False instead of true.

References

  1. Golub, G. H. & Loan, C. F. van V. Matrix Computations, 3rd Ed. (Johns Hopkins University Press, 1996).
  2. Hammarling, S. & Lucas, C. Updating the QR factorization and the least squares problem. 1-73 (The University of Manchester, 2008). at http://eprints.ma.man.ac.uk/1192/

@josef-pkt
Copy link
Member

I don't have much idea about the details.

One question:
Q in scipy.linalg.qr is (M, M), or (M, K) for mode='economic'.

The update routines require (M,M) according to the docstring. Does this also work for the economic (M, K) Q-matrix?

In the stats applications M is often much larger than K (e.g. M,K = 10000, 50

Another question: Is there anything about singular matrices? How does it behave? Is it supposed to work or is it ruled out? Is it stable in the near singular case, or is it better to regularize in that case on the user side?

@josef-pkt
Copy link
Member

And, Thanks for working on this. There will be many use cases where this is useful to speed up the calculations for example in statsmodels.


p_subdiag_qr(m, n-p, q, qs, r, rs, k, p, work)

libc.stdlib.free(work)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cython docs, http://docs.cython.org/src/tutorial/memory_allocation.html, seem to recommend a try-finally block for free. For my education: is it that it's actually not needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure. It seems to me that if it does, there must be tons of wrapped code that can leak memory in certain circumstances (multiple threads etc.), given that this compiles down to a simple C function. Perhaps some one that knows for certain can weigh in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on whether something in between can raise or not. In this case,
since it's a nogil function, no exceptions can be raised.

@ewmoore
Copy link
Member Author
ewmoore commented Dec 10, 2014

Nothing works with economy decompositions right now. I don't know of any reason why it couldn't be made to work though. I stopped here mostly because this is already 2000+ lines and that seemed big enough for a first cut at this.

There is nothing here that does anything in particular with singular matrices. Do you have anything in particular in mind? Or a reference?

In [1]: import numpy as np

In [2]: from scipy import linalg

In [3]: np.set_printoptions(precision=4, suppress=True)

In [4]: # a is singular

In [5]: a = np.array([[1, -1, 2],[2,1,1],[1,1,0]], 'd')

In [6]: qa, ra = linalg.qr(a)

In [7]: # make some update vectors

In [8]: u = np.random.randn(3)

In [9]: v = np.random.randn(3)

In [10]: qau, rau = linalg.qr_update(qa, ra, u, v, False)

In [11]: qau
Out[11]:
array([[ 0.0969, -0.9629, -0.2517],
       [ 0.8605, -0.046 ,  0.5073],
       [ 0.5001,  0.2658, -0.8242]])

In [12]: rau
Out[12]:
array([[ 1.5429,  0.4398,  2.8576],
       [ 0.    ,  2.0214, -3.8075],
       [ 0.    ,  0.    , -0.1387]])

In [13]: qa
Out[13]:
array([[ 0.4082, -0.8729, -0.2673],
       [ 0.8165,  0.2182,  0.5345],
       [ 0.4082,  0.4364, -0.8018]])

In [14]: ra
Out[14]:
array([[ 2.4495,  0.8165,  1.633 ],
       [ 0.    ,  1.5275, -1.5275],
       [ 0.    ,  0.    ,  0.    ]])

In [15]: # b is non singular

In [16]: b = a - np.outer(u, v)

In [17]: b
Out[17]:
array([[ 1.8505, -0.0962,  0.0217],
       [ 2.6723,  1.7145, -0.5639],
       [ 1.2284,  1.2428, -0.5313]])

In [18]: qb, rb = linalg.qr(b)

In [19]: qb
Out[19]:
array([[ 0.5325, -0.7994, -0.2781],
       [ 0.769 ,  0.3198,  0.5535],
       [ 0.3535,  0.5086, -0.7851]])

In [20]: rb
Out[20]:
array([[ 3.4748,  1.7067, -0.6099],
       [ 0.    ,  1.2572, -0.4679],
       [ 0.    ,  0.    ,  0.099 ]])

In [21]: qbu, rbu = linalg.qr_update(qb, rb, u, v, False)

In [22]: qbu
Out[22]:
array([[ 0.4082, -0.8729, -0.2673],
       [ 0.8165,  0.2182,  0.5345],
       [ 0.4082,  0.4364, -0.8018]])

In [23]: rbu
Out[23]:
array([[ 2.4495,  0.8165,  1.633 ],
       [ 0.    ,  1.5275, -1.5275],
       [ 0.    ,  0.    , -0.    ]])

In [24]:

@josef-pkt
Copy link
Member

Nothing works with economy decompositions right now. I don't know of any reason why it couldn't be made to work though. I stopped here mostly because this is already 2000+ lines and that seemed big enough for a first cut at this.

Ok, understandable. The increased memory handling might eat away all advantages for efficient updatting for most of our (statsmodels) application.

There is nothing here that does anything in particular with singular matrices. Do you have anything in particular in mind? Or a reference?

This was mainly a usage question. 2 issues when I was reading around (skimming many articles) for this.
Downdating is numerically less stable than updating, the references mentioned a special rotation for downdating that I didn't try to figure out. My main take-away was to be careful when using the qr if the downdating results in a singular matrix that has however numerical noise.
The second issue is, I think, a pure application issue. IIRC (?) there were some problems with updating singular matrices and I've seen it handled by adding a small value to the diagonal (a tiny Ridge regularization).

@I--P I--P mentioned this pull request Dec 19, 2014
@ewmoore
Copy link
Member Author
ewmoore commented Dec 19, 2014

I've pushed support for rank-k updates to economic mode decompositions.

Right now, the update is A + uv**T, but it would only be a tiny change to support A + b*uv**T where b is a scalar. Ultimately these are the same, since b can be folded into u or v, but I'm explicitly using a 1 for b right now and could easily expose this.

@ev-br
Copy link
Member
ev-br commented Dec 19, 2014

Can these be generalized to handle A + buv**T with a diagonal matrix b?

@ewmoore
Copy link
Member Author
ewmoore commented Dec 19, 2014

I suppose it could be. In that case I'd end up scaling the rows of u by the diagonal entries of b just about exactly as you would do by hand.

The general approach to performing updates is to reduce q.T.dot(u) to a scalar using a unitary matrix then add v times this scalar to the first row of r. This give an easy place to put a scalar b, but there isn't a similar place for a diagonal matrix b. That being said, If you have a good use case it's straightforward and might be worth doing.

@ev-br
Copy link
Member
ev-br commented Dec 19, 2014

Yup, you're right of course. No point complicating the API IMO for something which is easy to by hand.

@rgommers rgommers added enhancement A new feature or improvement scipy.linalg labels Dec 20, 2014
@rgommers
Copy link
Member

Failing test seems to need atol=1e-7.

@argriffing
Copy link
Contributor

Is this PR still waiting for the changes suggested in #4021 (comment)?

By the way, I noticed that the code includes a workaround for a numpy 1.5.1 limitation. Now that support for numpy versions < 1.6 have been dropped, this workaround could be removed.

@ewmoore
Copy link
Member Author
ewmoore commented Jan 7, 2015

Mostly. It still needs a little work on updating economic decomposition for
column additions. But since as it is now, it depends on #4021, it can't be
merged until that is finished. It does not have to work that way. Lapack
is always available when building scipy so this could just link directly if
desired.

Some code review would also be good. Even though there haven't been any
comments, I'll bet there are at least some style issues if not actual code
issues.

On Wednesday, January 7, 2015, argriffing notifications@github.com wrote:

Is this PR still waiting for the changes suggested in #4021 (comment)
#4021 (comment)?

By the way, I noticed that the code includes a workaround for a numpy
1.5.1 limitation. Now that support for numpy versions < 1.6 have been
dropped, this workaround could be removed.


Reply to this email directly or view it on GitHub
#4249 (comment).

for 1D arrays these are equivalent. This is Numpy's gh-2287, fixed in
9b8ff38.

FIXME: Is it worth only applying this for numpy 1.5.1?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That commit is in numpy 1.6.2, which is now the lowest supported version. So this can go.

@ewmoore ewmoore force-pushed the updateqr2 branch 2 times, most recently from 7381603 to da0616f Compare April 6, 2015 15:23
@ewmoore
Copy link
Member Author
ewmoore commented Apr 6, 2015

Rebased.

@rgommers
Copy link
Member

Okay, 6 days since the message on the mailing list that this is about to be merged. Time to green-button it I think, then it gets to sit in master for a while before branching for the next release.

rgommers pushed a commit that referenced this pull request Apr 12, 2015
ENH: add routines for updating QR decompositions
@rgommers rgommers merged commit 02ca7d2 into scipy:master Apr 12, 2015
@rgommers
Copy link
Member

Or better, it could be used inside signal.place_poles already. @I--P, this is now available.

@I--P
Copy link
Contributor
I--P commented Apr 12, 2015

@rgommers Nice ! I'm not sure I'll have time to update place-poles with it before 0.16 though.

@rgommers
Copy link
Member

No worries. If you may not be able to do that for a long time, it may be an idea to open an enhancement issue that explains what needs to be done.

@argriffing
Copy link
Contributor

Should _decomp_update.pyx be added to .gitignore? I think the input file is named _decomp_update.pyx.in, and _decomp_update.c is already ignored.

@rgommers
Copy link
Member

Indeed. Want to fix that, or should I?

@argriffing
Copy link
Contributor

You can fix it while I let my scipy build recover from the numpy deprecation flag incompatibility :)

@rgommers
Copy link
Member

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0