Towards "pandas 1.0" · Issue #10000 · pandas-dev/pandas · GitHub

Towards "pandas 1.0" #10000

Closed
jorisvandenbossche opened this issue Apr 27, 2015 · 41 comments

Comments

@jorisvandenbossche
Member

Here's our roadmap document: https://docs.google.com/document/d/151ct8jcZWwh7XStptjbLsda6h2b3C0IuiH_hfZnUA58/edit#

Just because it is a nice round number :-)

Or maybe we can use it to discuss how we imagine a possible pandas 1.0 ..


Some clarification (from @shoyer): This is not the place to make new feature requests -- please continue to make separate GitHub issues for those. Almost every new feature can be added without a 1.0 release. If there is a change you think would be necessary to do in pandas 1.0, feel free to reference issues where it is described in more detail.

@shoyer
Member
shoyer commented Apr 27, 2015

My wish list for pandas 1.0:

  1. Fix []/__getitem__ (Overview of [] (__getitem__) API #9595)
  2. Make the index/column distinction less painful (ENH/API: clarify groupby by to handle columns/index names #5677, Allowing the index to be referenced by name, like a column #8162)

I also have a fantasy world where the pandas Index becomes entirely optional, but that might be too big of a break even for pandas 1.0.
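
To illustrate the first wish-list item, here is a minimal sketch of how `[]` on a DataFrame switches meaning depending on the key type. The frame and its column names `a` and `b` are made up for the example; it only shows existing behavior and is not a proposal for how it should change.

```python
import pandas as pd

# Hypothetical example data; the column names are arbitrary.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

col = df["a"]            # a single label selects a column (a Series)
cols = df[["a", "b"]]    # a list of labels selects several columns (a DataFrame)
rows = df[0:2]           # a slice selects rows, not columns
mask = df[df["a"] > 1]   # a boolean Series also selects rows
```

The same operator therefore works on columns for some keys and on rows for others, which is the inconsistency #9595 catalogs.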

@jorisvandenbossche
Member Author

I want to add:

  1. Clean up the Index vs MultiIndex API (Unify index and multindex (and possibly others) API #3268)

@jnmclarty
Contributor

What if every Panel, DataFrame, and Series had a mode that changed its slicing/__getitem__ behavior? One could set the default in the options and change it on a per-object basis when necessary. That could make the transition from old to new behavior smoother, and allow more creative behavior where desired.
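
A rough sketch of what such a per-object mode could look like, in plain Python. Everything here is hypothetical: `OPTIONS`, the `Frame` class, and the mode names `"legacy"` and `"strict"` are invented for illustration and do not correspond to any pandas API.

```python
# Hypothetical global default, standing in for something like a pandas option.
OPTIONS = {"getitem_mode": "legacy"}


class Frame:
    """Toy column store used for illustration; not a real pandas class."""

    def __init__(self, columns, mode=None):
        self._columns = dict(columns)
        self.mode = mode  # None means "follow the global default"

    def effective_mode(self):
        return self.mode if self.mode is not None else OPTIONS["getitem_mode"]

    def __getitem__(self, key):
        if isinstance(key, slice):
            if self.effective_mode() == "strict":
                raise TypeError("row slicing with [] is disabled in strict mode")
            # Legacy mode: a slice selects rows from every column.
            return Frame({k: v[key] for k, v in self._columns.items()}, self.mode)
        # Any other key selects a single column in both modes.
        return self._columns[key]


legacy = Frame({"a": [1, 2, 3]})                 # follows the global default
strict = Frame({"a": [1, 2, 3]}, mode="strict")  # per-object override
```

With this sketch, `legacy[0:2]` returns a two-row `Frame`, while `strict[0:2]` raises `TypeError`; both still accept `["a"]`.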

@shoyer
Member
shoyer commented Apr 27, 2015

@jnmclarty A better option would be some sort of flag that could be set per module, similar to a future statement -- changing the way in which a specific DataFrame is queried is just begging for someone to pass it off to an incompatible function. In fact, I just asked if this is possible on StackOverflow: http://stackoverflow.com/questions/29905278/using-future-style-imports-for-module-specific-features-in-python/
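
For what it's worth, one way a per-module flag could be approximated is to have `__getitem__` inspect the globals of the calling module. This is only a sketch under several assumptions: the flag name `__future_getitem__` and the `Table` class are hypothetical, `sys._getframe` is a CPython implementation detail, and real future statements are handled by the compiler rather than looked up at runtime. The two dictionaries passed to `exec` stand in for two separate modules.

```python
import sys

FLAG = "__future_getitem__"  # hypothetical flag name, not a pandas feature


class Table:
    """Toy container standing in for a DataFrame."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def __getitem__(self, key):
        # Look at the globals of the module that performed the indexing.
        caller_globals = sys._getframe(1).f_globals
        if isinstance(key, slice):
            if caller_globals.get(FLAG, False):
                # Opted-in modules: [] never selects rows.
                raise TypeError("row slicing via [] is disabled in this module")
            # Other modules keep the old behavior: a slice selects rows.
            return Table({k: v[key] for k, v in self._columns.items()})
        return self._columns[key]


t = Table({"a": [1, 2, 3]})
legacy_module = {"t": t}              # stands in for a module without the flag
future_module = {"t": t, FLAG: True}  # stands in for a module that opted in

exec("rows = t[0:2]", legacy_module)  # allowed: the caller has not opted in
exec("col = t['a']", future_module)   # column selection works in both
```

The same object behaves differently depending on who indexes it, so a function in another module never sees a behavior it did not ask for.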

@djchou
djchou commented Apr 30, 2015

It would be nice if there were an option to make a boxplot's x-axis labels match a line plot's x-axis labels.

@shoyer
Member
shoyer commented Apr 30, 2015

@djchou is there an existing issue for that? If not, please make one :).

@sinhrks
Member
sinhrks commented May 1, 2015

Congrats on the great package :D

My wish is:

@datnamer
datnamer commented May 1, 2015

dplyr-like macros: https://github.com/dalejung/naginpy

A guy can wish...

@TomAugspurger
Contributor

I've been working on problems recently where having groupbys run in parallel would have been great (I think). Also maps / applys.
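
To make the idea concrete, here is a small split-apply-combine sketch in plain Python using only the standard library. It is not pandas code and not a proposed API: `parallel_group_apply` and the sample records are invented for illustration. It also assumes threads are acceptable; under CPython they only give a speedup when the per-group function releases the GIL, so a process pool may be the better fit for CPU-bound work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def parallel_group_apply(records, key, func, max_workers=4):
    """Group `records` by `key(record)` and apply `func` to each group concurrently."""
    # Split: bucket the records by their group key.
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Apply: submit one task per group.
        futures = {name: pool.submit(func, rows) for name, rows in groups.items()}
        # Combine: collect the results, keyed by group.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical sample data: (group, value) pairs.
records = [("x", 1), ("y", 10), ("x", 2), ("y", 20)]
totals = parallel_group_apply(
    records,
    key=lambda r: r[0],
    func=lambda rows: sum(value for _, value in rows),
)
```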

@jorisvandenbossche changed the title from 'Our 10,000th issue!' to 'Towards "pandas 1.0"' on May 29, 2015
@lexual
Contributor
lexual commented Jun 6, 2015

ref #1907

@jorisvandenbossche added this to the 1.0 milestone on Jun 8, 2015
@toddrjen
Contributor
toddrjen commented Jun 9, 2015

These may be too small, but since this is a wishlist I would like to see some improvements in the consistency of the API. Some examples:

  • More consistent usage of singular vs. plural, for example index/indexes, column/columns, and level/levels. This includes both the names and whether they accept single values, multiple values, or both.
  • Make sure the axis argument is available wherever operations are applied along an axis.
  • Go through related functions and make sure they have the same arguments in the same order. For example, for DataFrame, cumsum has a skipna argument, while diff doesn't.
  • If an argument does the same thing as a method, it should have the same name as the method. So for example fill_value should be fillna.
  • Try to get the use of underscores more consistent. For example, in DataFrame we have sort_index and sortlevel, and is_copy, isin, and isnull.
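
A quick way to check this kind of argument mismatch is to inspect the method signatures. The sketch below assumes the keyword is spelled `skipna` (the spelling pandas uses) and that the installed pandas exposes ordinary Python signatures for these methods; the helper name `accepts` is made up for the example.

```python
import inspect

import pandas as pd


def accepts(method, name):
    """Return True if `method` declares a parameter called `name`."""
    return name in inspect.signature(method).parameters


# Compare two related DataFrame methods for the same keyword.
methods = {"cumsum": pd.DataFrame.cumsum, "diff": pd.DataFrame.diff}
report = {label: accepts(method, "skipna") for label, method in methods.items()}
```

Looping the same check over a list of related methods and keywords would give a rough inventory of where the arguments diverge.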

@shoyer
Member
shoyer commented Jun 9, 2015

For the record, I'm strongly -1 on @toddrjen's suggestion to rename methods to make the use of underscores more consistent. Even Python 3 didn't clean things up like that.