8000 FIX make __iter__ consistent between sparse arrays and matrices by glemaitre · Pull Request #19220 · scipy/scipy · GitHub
[go: up one dir, main page]

Skip to content

FIX make __iter__ consistent between sparse arrays and matrices #19220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

glemaitre
Copy link
Contributor
@glemaitre glemaitre commented Sep 11, 2023

Reference issue

None

What does this implement/fix?

This PR is a workaround the current NotImplementedError:

In [1]: from scipy import sparse

In [2]: import numpy as np

In [3]: X = sparse.dok_array([[1, 2], [0, 3], [4, 0]])

In [4]: for row in X:
   ...:     print(row)
   ...: 
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[4], line 1
----> 1 for row in X:
      2     print(row)

File ~/mambaforge/envs/dev/lib/python3.10/site-packages/scipy/sparse/_base.py:252, in _spbase.__iter__(self)
    250 def __iter__(self):
    251     for r in range(self.shape[0]):
--> 252         yield self[r, :]

File ~/mambaforge/envs/dev/lib/python3.10/site-packages/scipy/sparse/_index.py:51, in IndexMixin.__getitem__(self, key)
     49     return self._get_intXint(row, col)
     50 elif isinstance(col, slice):
---> 51     self._raise_on_1d_array_slice()
     52     return self._get_intXslice(row, col)
     53 elif col.ndim == 1:

File ~/mambaforge/envs/dev/lib/python3.10/site-packages/scipy/sparse/_index.py:38, in IndexMixin._raise_on_1d_array_slice(self)
     30 """We do not currently support 1D sparse arrays.
     31 
     32 This function is called each time that a 1D array would
   (...)
     35 Once 1D sparse arrays are implemented, it should be removed.
     36 """
     37 if self._is_array:
---> 38     raise NotImplementedError(
     39         'We have not yet implemented 1D sparse slices; '
     40         'please index using explicit indices, e.g. `x[:, [0]]`'
     41     )

NotImplementedError: We have not yet implemented 1D sparse slices; please index using explicit indices, e.g. `x[:, [0]]`

It makes sparse arrays and sparse matrices behaving consistently in this regard.

@glemaitre glemaitre changed the title Sparse array compat iter FIX make __iter__ consistent between sparse arrays and matrices Sep 11, 2023
@j-bowhay j-bowhay added defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.sparse 8000 labels Sep 11, 2023
@ogrisel
Copy link
Contributor
ogrisel commented Sep 11, 2023

The message is quite explicit that this has not yet been implemented yet (maybe for a good reason of maybe not wanting to implement this in a general because of the ambiguity as whether to return a dense or sparse array as result).

In scikit-learn we can probably always use the explicit indexing suggested in the message: X[:, [row_idx]] which would lead a 2D sparse array/matrix that we can then explicitly convert and squeeze into a dense numpy array (X[:, [row_idx]].toarray().flatten()) if we ever need to.

@glemaitre
Copy link
Contributor Author
glemaitre commented Sep 11, 2023

The message is quite explicit that this has not yet been implemented yet (maybe for a good reason of maybe not wanting to implement this in a general because of the ambiguity as whether to return a dense or sparse array as result).

Indeed, I missed the point of the 1D specification. I assume that I got sidetrack by the current behaviour of the csr_array:

In [8]: X = sparse.csr_array([[1, 2], [0, 3], [4, 0]])

In [9]: for row in X:
   ...:     print(type(row))
   ...: 
<class 'scipy.sparse._csr.csr_array'>
<class 'scipy.sparse._csr.csr_array'>
<class 'scipy.sparse._csr.csr_array'>

In [10]: for row in X:
    ...:     print(row.shape)
    ...: 
(1, 2)
(1, 2)
(1, 2)

I would also expect a NotImplementedError instead of getting a 2-D array.

@perimosocordiae
Copy link
Member

Yeah, this is a tough one. Since we're pretty close to having a 1-d COO format in #18530, I think I'd rather wait until we can return a proper 1d sparse row.

@perimosocordiae perimosocordiae removed their request for review September 11, 2023 13:29
@glemaitre
Copy link
Contributor Author

I am closing then. @perimosocordiae is there a plan to actually return a 1d for CSR/CSC arrays?

@glemaitre glemaitre closed this Sep 11, 2023
@perimosocordiae
Copy link
Member

It's not official yet, but personally I think it makes sense to return 1d COO format for CSR/CSC indexing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
0