-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Override column __getitem__ to return int item as ndarray not column #3095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Just a reminder about the discussion in #2790 and in separate e-mails, that @astrofrog and I argued for My tendency would be to always return a |
I don't have a strong opinion about this either way. There has to be some default behavior for |
@mhvk - just one clarification -
so by symmetry we should get the same for:
I think that just the symmetry is enough to show that we can't get a column if we slice a column, because we don't get a row if we slice a row, and we want to make sure that we get the same whether we access the rows first or columns first. |
Yes, the current behaviour is consistent in the sense you imply; the only inconsistency is columns holding multiple dimensions; my Anyway, I'll let those speak who are more likely to use columns in their present form -- I'll be moving fully to |
Just so it is clear the current behaviour is most definitely broken (see entry [5]):
|
My inclination (and intent with this patch) was to basically follow the numpy subclassing behavior, which is (I think):
This is the same as Pandas Series behavior (noting that Series can't contain multi-dim elements, but I think that single item access would return an ndarray version of the element). EDIT: by "ndarray version of that element", I meant a numpy scalar or array as relevant. |
This PR makes Column consistent with Row (as of 0.4):
|
So in terms of precedence from Numpy itself and Pandas, my vote is to mostly leave Column item access the same but put in this patch to make multi-dim elements behave like scalars. This is independent of what happens when Columns are involved in arithmetic operations or whatnot in the |
@mhvk @embray @astrofrog - any thoughts on this? Here are my thoughts on the current options that have been discussed (along with comment on precedent from other packages):
|
@taldcroft - I think I like 3 best. So just to be clear, this means that accessing a single 'cell' in the table will return a Numpy scalar or array, whereas selecting multiple rows from a column will still return a column. Is this correct? If so, then 👍 from me. |
It would be good to include a test which also includes comments for each test case to explain the behavior we expect in each case. |
7d626ce
to
fd83529
Compare
Added tests. |
@embray @astrofrog - there are 3 Travis fails here, all on Python 3.x at the point where a method docstring in astropy.time tries to import matplotlib. This is unrelated to this PR. Any ideas? I dug around a little but don't understand the test system well enough.
|
That doctest should be skipped if the optional dependencies are not installed. Not sure what changed, and how it works on 2.7 though... |
Exactly. |
@taldcroft - OK with option 3, at least for now. We probably will have to revisit a little when we do the |
I restarted one of the failing test cases and it passed this time, so I've restarted the other two as well. Hopefully it'll work 🙏. |
@taldcroft I've seen that error before--it seems to happen non-deterministically. I really want to get to the bottom of it at some point because it is annoying an bizarre, but incredibly difficult to debug. As @astrofrog points out, that test should always be skipped. |
AH, I think I figured it out. See #3169. |
…ormat classes to the TIME_FORMAT and TIME_DELTA_FORMAT dicts at class creation time instead of after the fact. This should also reduce the effort required to define new formats external to Astropy. This may understandably look overengineered, but it does have the advantage of fixing the annoyance mentioned here: astropy#3095 (comment) by not doing any module-level hackery.
Tests passing and now ready for final review. |
👍 |
Override column __getitem__ to return int item as ndarray not column
…ormat classes to the TIME_FORMAT and TIME_DELTA_FORMAT dicts at class creation time instead of after the fact. This should also reduce the effort required to define new formats external to Astropy. This may understandably look overengineered, but it does have the advantage of fixing the annoyance mentioned here: astropy#3095 (comment) by not doing any module-level hackery.
This was pulled from #2790 and is pending further discussion of when table columns should be dropped to ndarray.