Inverse of singular matrix. Should be error but in fact not. · Issue #10471 · numpy/numpy · GitHub

Inverse of singular matrix. Should be error but in fact not. #10471

Closed

noob-saibot opened this issue Jan 25, 2018 · 11 comments

Comments

@noob-saibot commented Jan 25, 2018

Hello.
I've got strange behavior in my experiments. I'm working with a matrix (for example b) for which the product b.T * b should be a singular matrix, so the inverse method should raise an error like numpy.linalg.linalg.LinAlgError: Singular matrix. But instead the result contains huge values.
Code below:

>>> import numpy as np
>>> b = np.matrix([[1,1,0], [1,0,1], [1,1,0]])
>>> b
matrix([[1, 1, 0],
        [1, 0, 1],
        [1, 1, 0]])
>>> np.linalg.inv(b.T * b)
matrix([[ 4.50359963e+15, -4.50359963e+15, -4.50359963e+15],
        [-4.50359963e+15,  4.50359963e+15,  4.50359963e+15],
        [-4.50359963e+15,  4.50359963e+15,  4.50359963e+15]])

How can this behavior be avoided?
Tested on:
win10, Python 3.5.4, numpy version '1.14.0'.
ubuntu 16.04, Python 3.5.2, numpy versions '1.13.3' and '1.14.0'.

PS. I've checked via Wolfram and R; it really is a singular matrix.

@eric-wieser (Member) commented Jan 27, 2018

For what it's worth:

>>> b.T * b
matrix([[3, 2, 1],
        [2, 2, 0],
        [1, 0, 1]])

which, as you correctly observe, is clearly singular.

On my machine with lapack_lite (3.2.2), I get:

>>> np.linalg.inv(b.T * b)
matrix([[  5.40431955e+15,  -5.40431955e+15,  -5.40431955e+15],
        [ -5.40431955e+15,   5.40431955e+15,   5.40431955e+15],
        [ -5.40431955e+15,   5.40431955e+15,   5.40431955e+15]])

In general I wouldn't expect precise results from a floating point inversion algorithm though. You can always check the determinant before trying to invert:

>>> np.linalg.det((b.T * b))
3.7007434154171906e-16
>>> np.finfo(float).eps  # for comparison
2.2204460492503131e-16

(edit: my linear algebra is out of practice, cond is far better)

If you use a float matrix rather than an integer matrix in R, does it fare any worse?

@njsmith (Member) commented Jan 27, 2018

The condition number (np.linalg.cond) might be a more appropriate tool than the determinant here.
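
A minimal sketch of that check (the printed value depends on the BLAS/LAPACK build, but for this matrix it comes out astronomically large, on the order of 1e16):

import numpy as np

b = np.array([[1, 1, 0], [1, 0, 1], [1, 1, 0]])
A = b.T @ b
# For an exactly singular matrix the smallest computed singular value is
# only rounding noise, so the 2-norm condition number comes out around
# 1e16 instead of the mathematically correct value of infinity.
print(np.linalg.cond(A))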

@pv (Member) commented Jan 27, 2018

duplicate of gh-6027, gh-6178 (and several closed issues)

@noob-saibot (Author) commented Jan 29, 2018

If you use a float matrix rather than an integer matrix in R, does it fare any worse?

R raises an error for float numbers too.

So the workaround is to check every matrix via np.linalg.cond before inverting. Am I right? That would mean every project built on a numpy back-end has to add this check itself, rather than the back-end returning a correct result?

@njsmith (Member) commented Jan 29, 2018

@noob-saibot This isn't a numpy problem, this is a general problem for anyone doing numerical linear algebra on a computer. You will see the same thing in R, depending on the exact matrices you use and depending on how your R was built. In fact in general numpy and R use the same code to perform a matrix inversion like this.

For computational purposes, there's no meaningful difference between a matrix that's not invertible (condition number is infinite), and one where the condition number is merely very large.
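
A sketch of what acting on that looks like in practice (hedged: inv_checked and the 1/eps threshold are illustrative choices, not a numpy API):

import numpy as np

def inv_checked(a, max_cond=1 / np.finfo(float).eps):
    # Treat "condition number above the threshold" the same as exactly
    # singular: the computed inverse is meaningless either way.
    if np.linalg.cond(a) > max_cond:
        raise np.linalg.LinAlgError("matrix is singular to working precision")
    return np.linalg.inv(a)

Note that np.linalg.cond computes a full SVD by default, so the check is not free.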

@noob-saibot (Author) commented Jan 30, 2018

As I can see in the code of def inv(a):, the developers check the matrix for rank and squareness. Can't they add an np.linalg.cond check, or would the performance cost be too large? If so, then no problem.

Also, for my understanding: the problem is in the LAPACK dgesv function, isn't it? The LU decomposition returns a plausible output, so the singularity is not detected.

Thanks for the answers :)

@pmli (Contributor) commented Feb 2, 2018

@noob-saibot You could use scipy.linalg.solve, which raises a warning when it detects an ill-conditioned matrix (since version 0.19.0, if I'm not mistaken):

>>> import numpy as np
>>> import scipy.linalg as spla
>>> b = np.array([[1,1,0], [1,0,1], [1,1,0]])
>>> spla.solve(b.T.dot(b), np.eye(3))

/some/long/path/scipy/linalg/basic.py:40: RuntimeWarning: scipy.linalg.solve
Ill-conditioned matrix detected. Result is not guaranteed to be accurate.
Reciprocal condition number/precision: 1.2335811384723961e-17 / 1.1102230246251565e-16
  RuntimeWarning)
array([[ 4.50359963e+15, -4.50359963e+15, -4.50359963e+15],
       [-4.50359963e+15,  4.50359963e+15,  4.50359963e+15],
       [-4.50359963e+15,  4.50359963e+15,  4.50359963e+15]])

The issue in numpy.linalg.inv is that the LU decomposition doesn't give the exact solution, which you can see from:

>>> P, L, U = spla.lu(b.T.dot(b))
>>> U
array([[ 3.00000000e+00,  2.00000000e+00,  1.00000000e+00],
       [ 0.00000000e+00,  6.66666667e-01, -6.66666667e-01],
       [ 0.00000000e+00,  0.00000000e+00,  2.22044605e-16]])

Since all diagonal elements in U are nonzero, numpy.linalg.inv continues without raising an error. If it used GESVX, as scipy.linalg.solve does, it could relatively cheaply compute an estimate of the condition number.

But do you really need inv for what you are doing? If, e.g., you need to solve a least squares problem, there are better methods than using normal equations.
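
For instance, an OLS fit can be computed directly from b with np.linalg.lstsq instead of forming b.T * b (the normal equations square the condition number). A minimal sketch with a made-up right-hand side y:

import numpy as np

b = np.array([[1., 1., 0.], [1., 0., 1.], [1., 1., 0.]])
y = np.array([1., 2., 3.])  # hypothetical observations

# lstsq uses an SVD-based LAPACK driver, so it handles the rank-deficient
# b gracefully and reports the numerical rank it detected.
x, residuals, rank, sv = np.linalg.lstsq(b, y, rcond=None)
print(x)
print(rank)  # 2, exposing the rank deficiency that inv silently ignored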

@noob-saibot (Author)

But, do you really need inv for what you are doing? If, e.g., you need to solve a least squares problem, there are better methods then using normal equations.

Thank you for the detailed answer. Yes, it's for OLS calculation. I'll read about the other methods.

@Gnoudini

noob-saibot commented on 30 Jan 2018:

As I see at code def inv(a): developers check matrix on rank and squareness. …

Actually, they don't check the rank: even though the function is called _assertRankAtLeast2(a), it only checks the dimensionality of the array. The matrix rank, by the way, would be a perfect measure to detect singularity.

@charris (Member) commented Oct 31, 2019

Different meaning of "rank". In old Numeric and NumPy, rank referred to the number of dimensions, not the number of independent rows/columns. We have removed that usage in most places and should probably rename the function. The inv function could also be improved, but it only fails if the matrix is numerically singular, which is a different thing from exact singularity. You need to check the algebraic rank for yourself using matrix_rank, or possibly use a different approach if it lends itself to your application.
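
For example, with the matrix from this issue (matrix_rank thresholds the singular values from an SVD, so it reports the numerical rank):

>>> import numpy as np
>>> b = np.array([[1, 1, 0], [1, 0, 1], [1, 1, 0]])
>>> np.linalg.matrix_rank(b.T @ b)  # 2 < 3, so the matrix is singular
2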

@eric-wieser (Member)

Let's rename these functions to not use the word rank then: #14814

eric-wieser added a commit to eric-wieser/numpy that referenced this issue Oct 31, 2019
As shown in numpy gh-10471, this naming was confusing.

Also rename all the functions of this style to use snake case