-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
MAINT: Make the refactor suggested in prepare_index #8278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 4 commits
87e1294
bfd70c2
37e6e1f
3a0ad85
8c4e556
5d9315c
966b2c7
9832f05
68bad6a
c587963
105e0b4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -139,6 +139,156 @@ PyArray_MapIterSwapAxes(PyArrayMapIterObject *mit, PyArrayObject **ret, int getm | |
*ret = (PyArrayObject *)new; | ||
} | ||
|
||
/** | ||
* Turn an index argument into a c-array of `PyObject *`s, one for each index. | ||
* | ||
* When a scalar is passed, this is written directly to the buffer. When a | ||
* tuple is passed, the tuple elements are unpacked into the buffer. | ||
* | ||
* When some other sequence is passed, this implements the following section | ||
* from the advanced indexing docs to decide whether to unpack or just write | ||
* one element: | ||
* | ||
* > In order to remain backward compatible with a common usage in Numeric, | ||
* > basic slicing is also initiated if the selection object is any non-ndarray | ||
* > sequence (such as a list) containing slice objects, the Ellipsis object, | ||
* > or the newaxis object, but not for integer arrays or other embedded | ||
* > sequences. | ||
* | ||
* @param index The index object, which may or may not be a tuple. This is a | ||
* borrowed reference. | ||
* @param result An empty buffer of 2*NPY_MAXDIMS PyObject* to write each | ||
* index component to. The references written are borrowed. | ||
* This function will in some cases clobber the array beyond | ||
* the last item unpacked. | ||
* | ||
* @returns The number of items in `result`, or -1 if an error occured. | ||
*/ | ||
NPY_NO_EXPORT npy_intp | ||
unpack_indices(PyObject *index, PyObject *result[2*NPY_MAXDIMS]) | ||
{ | ||
npy_intp n, i; | ||
npy_bool commit_to_unpack; | ||
|
||
/* fast route for passing a tuple */ | ||
if (PyTuple_CheckExact(index)) { | ||
n = PyTuple_GET_SIZE(index); | ||
if (n > NPY_MAXDIMS * 2) { | ||
PyErr_SetString(PyExc_IndexError, | ||
"too many indices for array"); | ||
return -1; | ||
} | ||
for (i = 0; i < n; i++) { | ||
result[i] = PyTuple_GET_ITEM(index, i); | ||
} | ||
return n; | ||
} | ||
|
||
/* Obvious single-entry cases */ | ||
if (0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As instructed here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, with those There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should optimize out anyway. Could be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nah, its fine, obvious enough, just tried to style nitpick and it didn't work too well ;) |
||
#if !defined(NPY_PY3K) | ||
|| PyInt_CheckExact(index) | ||
#else | ||
|| PyLong_CheckExact(index) | ||
#endif | ||
|| index == Py_None | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. brackets? |
||
|| PySlice_Check(index) | ||
|| PyArray_Check(index) | ||
|| !PySequence_Check(index)) { | ||
|
||
result[0] = index; | ||
return 1; | ||
} | ||
|
||
/* passing a tuple subclass - needs to handle errors */ | ||
if (PyTuple_Check(index)) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The code here is subtly different to what we had before. Calling So tuple subclasses that implement those methods inconsistently now behave differently. For instance: class Plus1Tuple(tuple):
def __getitem__(self, i):
return super().__getitem__(i) + 1
# oops, we forgot `__iter__`, and inherited `tuple.__iter__`, which
# does not fall back on __getitem__ gives:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can't you just remove this whole block and replace it with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, plus a check for too many indices. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am fine with the changed behaviour I think, a tuple subclass should really be OK even with just using PyTuple_GET_ITEM to be honest, otherwise it should be considered badly broken. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps we should just bite the bullet here and call There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea, I suppose we might just as well put whatever was there before, it won't make the code any uglier and speed really does not matter. But I no need to add a test or so (or if, put a comment that it is fine to break it). This is too strange to really worry about. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, changed to call |
||
n = PySequence_Size(index); | ||
if (n < 0) { | ||
return -1; | ||
} | ||
if (n > NPY_MAXDIMS * 2) { | ||
PyErr_SetString(PyExc_IndexError, | ||
"too many indices for array"); | ||
return -1; | ||
} | ||
for (i = 0; i < n; i++) { | ||
result[i] = PySequence_GetItem(index, i); | ||
if (result[i] == NULL) { | ||
return -1; | ||
} | ||
} | ||
return n; | ||
} | ||
|
||
/* At this point, we're left with a non-tuple, non-array, sequence: | ||
* typically, a list | ||
* | ||
* Sequences < NPY_MAXDIMS with any slice objects | ||
* or newaxis, Ellipsis or other arrays or sequences | ||
* embedded, are considered equivalent to an indexing | ||
* tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`) | ||
*/ | ||
|
||
/* if len fails, treat like a scalar */ | ||
n = PySequence_Size(index); | ||
if (n < 0) { | ||
PyErr_Clear(); | ||
result[0] = index; | ||
return 1; | ||
} | ||
|
||
/* for some reason, anything that's long but not too long is turned into | ||
* a single index. The *2 is missing here for backward-compatibility. */ | ||
if (n >= NPY_MAXDIMS) { | ||
result[0] = index; | ||
return 1; | ||
} | ||
|
||
/* Some other type of short sequence - assume we should unpack it like a | ||
* tuple, until we find something that proves us wrong */ | ||
There was a problem hiding this comment. Choose a reason for hiding this D958 commentThe reason will be displayed to describe this comment to others. Learn more. Nitpick: comment should be reformatted (also all other multiline comments) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What format should I be aiming for? I was under the impression this was correct, but wasn't really paying attention There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Cheers, and done |
||
commit_to_unpack = 0; | ||
for (i = 0; i < n; i++) { | ||
PyObject *tmp_obj = result[i] = PySequence_GetItem(index, i); | ||
|
||
if (commit_to_unpack) { | ||
/* propagate errors */ | ||
if (tmp_obj == NULL) { | ||
return -1; | ||
} | ||
} | ||
else { | ||
/* if getitem fails (unusual) before we've committed, then | ||
* commit to not unpacking */ | ||
if (tmp_obj == NULL) { | ||
PyErr_Clear(); | ||
break; | ||
} | ||
|
||
/* decide if we should treat this sequence like a tuple */ | ||
if (PyArray_Check(tmp_obj) | ||
|| PySequence_Check(tmp_obj) | ||
|| PySlice_Check(tmp_obj) | ||
|| tmp_obj == Py_Ellipsis | ||
|| tmp_obj == Py_None) { | ||
There was a problem hiding this comment. Choose a reason for hiding this comm F438 entThe reason will be displayed to describe this comment to others. Learn more. Again, I think we usually put brackets, but no big deal There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Don't agree - we use brackets to make precedence of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, frankly don't care much, its not |
||
commit_to_unpack = 1; | ||
} | ||
} | ||
|
||
Py_DECREF(tmp_obj); | ||
} | ||
|
||
/* unpacking was the right thing to do, and we already did it */ | ||
if (commit_to_unpack) { | ||
return n; | ||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just what I think right now (don't worry about it), but I would remove the blank line at least to make the else stick to the if block. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done, since this was as much accidental as anything else |
||
/* got to the end, never found an indication that we should have unpacked */ | ||
else { | ||
/* we already filled result, but it doesn't matter */ | ||
result[0] = index; | ||
return 1; | ||
} | ||
} | ||
|
||
/** | ||
* Prepare an npy_index_object from the python slicing object. | ||
|
@@ -174,89 +324,17 @@ prepare_index(PyArrayObject *self, PyObject *index, | |
int i; | ||
npy_intp n; | ||
|
||
npy_bool make_tuple = 0; | ||
PyObject *obj = NULL; | ||
PyArrayObject *arr; | ||
|
||
int index_type = 0; | ||
int ellipsis_pos = -1; | ||
|
||
/* | ||
* The index might be a multi-dimensional index, but not yet a tuple | ||
* this makes it a tuple in that case. | ||
* | ||
* TODO: Refactor into its own function. | ||
*/ | ||
if (!PyTuple_CheckExact(index) | ||
/* Next three are just to avoid slow checks */ | ||
#if !defined(NPY_PY3K) | ||
&& (!PyInt_CheckExact(index)) | ||
#else | ||
&& (!PyLong_CheckExact(index)) | ||
#endif | ||
&& (index != Py_None) | ||
&& (!PySlice_Check(index)) | ||
&& (!PyArray_Check(index)) | ||
&& (PySequence_Check(index))) { | ||
/* | ||
* Sequences < NPY_MAXDIMS with any slice objects | ||
* or newaxis, Ellipsis or other arrays or sequences | ||
* embedded, are considered equivalent to an indexing | ||
* tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`) | ||
*/ | ||
|
||
if (PyTuple_Check(index)) { | ||
/* If it is already a tuple, make it an exact tuple anyway */ | ||
n = 0; | ||
make_tuple = 1; | ||
} | ||
else { | ||
n = PySequence_Size(index); | ||
} | ||
if (n < 0 || n >= NPY_MAXDIMS) { | ||
n = 0; | ||
} | ||
for (i = 0; i < n; i++) { | ||
PyObject *tmp_obj = PySequence_GetItem(index, i); | ||
/* if getitem fails (unusual) treat this as a single index */ | ||
if (tmp_obj == NULL) { | ||
PyErr_Clear(); | ||
make_tuple = 0; | ||
break; | ||
} | ||
if (PyArray_Check(tmp_obj) || PySequence_Check(tmp_obj) | ||
|| PySlice_Check(tmp_obj) || tmp_obj == Py_Ellipsis | ||
|| tmp_obj == Py_None) { | ||
make_tuple = 1; | ||
Py_DECREF(tmp_obj); | ||
break; | ||
} | ||
Py_DECREF(tmp_obj); | ||
} | ||
|
||
if (make_tuple) { | ||
/* We want to interpret it as a tuple, so make it one */ | ||
index = PySequence_Tuple(index); | ||
if (index == NULL) { | ||
return -1; | ||
} | ||
} | ||
} | ||
PyObject *raw_indices[NPY_MAXDIMS*2]; | ||
|
||
/* If the index is not a tuple, handle it the same as (index,) */ | ||
if (!PyTuple_CheckExact(index)) { | ||
obj = index; | ||
index_ndim = 1; | ||
} | ||
else { | ||
n = PyTuple_GET_SIZE(index); | ||
if (n > NPY_MAXDIMS * 2) { | ||
PyErr_SetString(PyExc_IndexError, | ||
"too many indices for array"); | ||
goto fail; | ||
} | ||
index_ndim = (int)n; | ||
obj = NULL; | ||
index_ndim = unpack_indices(index, raw_indices); | ||
if (index_ndim == -1) { | ||
return -1; | ||
} | ||
|
||
/* | ||
|
@@ -275,14 +353,7 @@ prepare_index(PyArrayObject *self, PyObject *index, | |
goto failed_building_indices; | ||
} | ||
|
||
/* Check for single index. obj is already set then. */ | ||
if ((curr_idx != 0) || (obj == NULL)) { | ||
obj = PyTuple_GET_ITEM(index, get_idx++); | ||
} | ||
else { | ||
/* only one loop */ | ||
get_idx += 1; | ||
} | ||
obj = raw_indices[get_idx++]; | ||
|
||
/**** Try the cascade of possible indices ****/ | ||
|
||
|
@@ -686,20 +757,13 @@ prepare_index(PyArrayObject *self, PyObject *index, | |
*ndim = new_ndim + fancy_ndim; | ||
*out_fancy_ndim = fancy_ndim; | ||
|
||
if (make_tuple) { | ||
Py_DECREF(index); | ||
} | ||
|
||
return index_type; | ||
|
||
failed_building_indices: | ||
for (i=0; i < curr_idx; i++) { | ||
Py_XDECREF(indices[i].object); | ||
} | ||
fail: | ||
if (make_tuple) { | ||
Py_DECREF(index); | ||
} | ||
return -1; | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments are very helpful, but here maybe be even more explicit and say "beyond the number of items returned".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do - I was struggling to phrase this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've removed that remark entirely, as it made more sense under the description of the return value: