-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
ENH: allow a duck-typed non NDArray sub-class to use the iterators #4240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Not sure what the problem is from the description here - I think you want If so then possibly the right way to do this is to add support for a
|
@njsmith yes..that's exactly what I am suggesting, doesn't seem to me that their is a way ATM. |
maybe we could try getting an indexing array from a pep3118 buffer, if Series already exposes the buffer pandas might not need changing. |
hm pandas 0.13 Series does not support the buffer interface anymore |
Yeah, the other option (which I think is a generalization of what you're On Wed, Jan 29, 2014 at 5:01 PM, Julian Taylor notifications@github.comwrote:
Nathaniel J. Smith |
@juliantaylor this was an intential change to not be a direct sub-class of ndarray any longer, only had a couple of minor incompatibilites: http://pandas.pydata.org/pandas-docs/dev/whatsnew.html#internal-refactoring. This brings the Series into line with other pandas objects (mainly DataFrame) so can share a lot of code Series supports completely the python interface (e.g. |
I'm pretty numpy's indexing code doesn't go through the higher-level On Wed, Jan 29, 2014 at 5:28 PM, jreback notifications@github.com wrote:
Nathaniel J. Smith |
@njsmith oh I meant users going thru Series higher level methods. It seems that their should be a way for numpy to say, hay you are not a sub-class (e.g. |
right, just call |
It works in my code branch ;). |
Anyway, if someone wants to attempt to fix it for a 1.8.x release or so. The fancy indexing code should be able to handle this fine (I am pretty sure it calls asarray even now), but the code deciding to trigger fancy indexing needs to be adjusted (you likely probably can use a Series, if you have another array in there somewhere ;)). I don't think it is possible to get everything quite consistent, but that probably doens't matter. |
Hmmm, spam... But doesn't isn't a Series a python sequence with a length and everything? I can plug in for example |
Actually... This is fun, because it doesn't quite work in my branch:
And we would have to find some logic for that, and I am not sure what it would be. |
@seberg Series implements |
is numpy calling any special method when trying to figure out what the indexer is? |
didn't check the code, but I would think it calls |
That said, my new code would basically just try:/except: the integer case (unless it is an array, where that would be potentially wrong currently). Though I will also double check the array at the end if it is a scalar. However, that part would not matter in how the code is currently written. Are we talking about fixing this in 1.9 or hacking something for 1.8? The bigger problem in both cases is that your Series here is boolean, but a list (and anything else that is not a plain boolean array) such as |
why would you not just try to I agree using this is subject to odd dtypes (it really should to be boolean or an int64 indexer array).
|
Because we were just going to do that after a deprecation period which would start in 1.9... |
ah..ok..np.... |
I somewhat read this issue first as "we have a serious problem here". If it is more a feature request and it is fine if it only works in 1.9 and the boolean case even later, then I think we can consider it handled for the moment. If this is creating serious problems out there, we need to think about how to best solve them, how bad the boolean side is, etc. |
its fine if it works in >= 1.9 the use case IMHO is pretty narrow; if you are already using Series you are almost always using them for most of the indexing, otherwise you can wrap as needed. that said; a Series behaves and looks like a 1-dim ndarray, so compat should be their for this case. this is one of those edge cases where a sub-class ndarray is different than a duck-typed ndarray |
My branch has been merged in master, so that So closing this. |
That said, I just realized there is a big "be careful" here:
This is because the index is interpreted as |
Or does anyone have a nice idea how to solve this ambiguity? In principle both solutions are correct after all... |
Shouldn't the rule be that we only have multiple indexes if type(key) ==
|
@njsmith, that would be nice, but even numpy tends to use lists to build these "tuples", though I suppose we could be brutal and deprecate that. Or only allow tuples and lists with this funky test. I don't like that length check anyway. Behaviour shouldn't work for large lists and suddenly fail if they get short, even if it probably doesn't matter in practice. |
@seberg thanks for all the effort on this! can you ref the commit for this change? |
If we've supported lists in the past then I guess we have to continue to do Anyway we def. shouldn't start allowing more non-tuple types to do this
|
@jreback there is no one commit, the PR is gh-3798, but it is basically all rewritten. https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/mapping.c#L186 is the start of the code branch deciding whether to treat a non-tuple as multiple indices or not (this is unchanged). Array likes will drop through all the checks until reaching https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/mapping.c#L392 at which point @njsmith, I did not change anything in that logic... But since sequences other then list or tuple subclasses which are used in such a way are probably non-existing, we could maybe deprecate at least anything but lists fast? |
beautiful...thank you much!
|
@jreback, well half beautiful for now, enable deprecation warnings to be shown... |
On Mon, Feb 17, 2014 at 6:23 PM, seberg notifications@github.com wrote:
Eh, I guess it doesn't much matter whether we deprecate lists+everything Nathaniel J. Smith |
It would require a couple of |
needs pandas 0.13 to fail
(fails with numpy 1.6 thru 1.9-dev)
Prior to pandas 0.13,
Series
WAS a sub-class ofNDArray
, so this worked as expected.Series
respects and implements virtually all of the interface to anNDArray
.This exception is coming from: https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/iterators.c#L867
I suspect that the
PyArray_Check(obj)
is not entered because Series is not a direct-sub class.If this check were relaxed a bit (or provide a way to emulate this type of check; which without hooking into the c-interface I don't think it possible ATM).
This was probably done for perf reasons I would guess.
thanks.
Jeff
related pandas issue: pandas-dev/pandas#6168
The text was updated successfully, but these errors were encountered: