PDEP-10: Add pyarrow as a required dependency by mroeschke · Pull Request #52711 · pandas-dev/pandas · GitHub
PDEP-10: Add pyarrow as a required dependency #52711

Merged 40 commits on Jul 30, 2023
Showing changes from 1 commit

Commits
89a3a3b
Start pdep 10
mroeschke Apr 14, 2023
cf88b43
Merge remote-tracking branch 'upstream/main' into pdep/pyarrow
mroeschke Apr 17, 2023
dafa709
finish drawbacks, fix other sections
mroeschke Apr 17, 2023
5e1fbd1
Add number
mroeschke Apr 17, 2023
44a3321
our current version is 7 not 6
mroeschke Apr 17, 2023
ea9f5e3
Merge remote-tracking branch 'upstream/main' into pdep/pyarrow
mroeschke Apr 18, 2023
fbd1aa0
Clarify and fix typo
mroeschke Apr 18, 2023
6d667b4
Update web/pandas/pdeps/0010-required-pyarrow-dependency.md
phofl Apr 21, 2023
bed5f0b
Update web/pandas/pdeps/0010-required-pyarrow-dependency.md
phofl Apr 21, 2023
12622bb
Update web/pandas/pdeps/0010-required-pyarrow-dependency.md
phofl Apr 21, 2023
864b8d1
Add string as a preferential pyarrow type
mroeschke Apr 21, 2023
2d4f4fd
Add metric about number of pyarrow import checks
mroeschke Apr 21, 2023
bb332ca
Clarify with actual call
mroeschke Apr 21, 2023
a8275fa
Clarify with actual call
mroeschke Apr 21, 2023
1148007
Merge remote-tracking branch 'upstream/main' into pdep/pyarrow
mroeschke Apr 28, 2023
b406dc1
Address some comments
mroeschke Apr 28, 2023
ecc4d5b
Update 0010-required-pyarrow-dependency.md
phofl Apr 28, 2023
ec1c0e3
Update 0010-required-pyarrow-dependency.md
phofl Apr 28, 2023
23eb251
add Patrick as an author, remove constraint on only bumping during ma…
mroeschke Apr 28, 2023
dd7c62a
Merge remote-tracking branch 'upstream/main' into pdep/pyarrow
mroeschke May 9, 2023
2ddd82a
Change required proposal for 3.0 to be version requiring pyarrow & st…
mroeschke May 9, 2023
3c54d22
Merge remote-tracking branch 'upstream/main' into pdep/pyarrow
mroeschke May 9, 2023
1b60fbb
Address typos
mroeschke May 9, 2023
70cdf74
Merge branch 'main' into pdep/pyarrow
mroeschke May 24, 2023
14602a6
Merge branch 'main' into pdep/pyarrow
mroeschke Jun 1, 2023
2cfb92f
Merge branch 'main' into pdep/pyarrow
mroeschke Jun 9, 2023
e0e406c
Merge branch 'main' into pdep/pyarrow
mroeschke Jun 20, 2023
f047032
Update 0010-required-pyarrow-dependency.md
phofl Jul 2, 2023
ed28c04
Update web/pandas/pdeps/0010-required-pyarrow-dependency.md
phofl Jul 3, 2023
99de932
Update 0010-required-pyarrow-dependency.md
phofl Jul 4, 2023
99fd739
Update 0010-required-pyarrow-dependency.md
phofl Jul 4, 2023
9384bc7
Update 0010-required-pyarrow-dependency.md
phofl Jul 4, 2023
c3beeb3
Update 0010-required-pyarrow-dependency.md
phofl Jul 4, 2023
8347e83
improve structure, list user benefits more clearly, add faq
MarcoGorelli Jul 5, 2023
d740403
restore little demo
MarcoGorelli Jul 5, 2023
959873e
remove masked part, note that pyarrow dtyeps will likely be ready by 3
MarcoGorelli Jul 5, 2023
f936280
Merge pull request #26 from MarcoGorelli/pdep10-amendments
mroeschke Jul 6, 2023
2db0037
Update 0010-required-pyarrow-dependency.md
phofl Jul 13, 2023
c2b8cfe
Merge branch 'main' into pdep/pyarrow
mroeschke Jul 25, 2023
4e05151
Update 0010-required-pyarrow-dependency.md
phofl Jul 30, 2023
Update 0010-required-pyarrow-dependency.md
phofl authored Jul 2, 2023
commit f047032598bbdb2c84acc4d51e70bde0643ea1a8
7 changes: 5 additions & 2 deletions web/pandas/pdeps/0010-required-pyarrow-dependency.md
@@ -16,8 +16,11 @@ This PDEP proposes that:
 - The minimum version of PyArrow supported starting with pandas 3.0 is version 7 of PyArrow.
 - When the minimum version of PyArrow is bumped, PyArrow will be bumped to the highest version that has
   been released for at least 2 years.
-- Starting in pandas 2.1, pandas raises a ``FutureWarning`` when needing to infer string data that the future
-  data type result will be `ArrowDtype` with `pyarrow.string` instead of object
+- The pandas 2.1 release notes will have a big warning that PyArrow will become a required dependency starting
+  with pandas 3.0.
+- Starting in pandas 2.2, pandas raises a ``FutureWarning`` when PyArrow is not installed in the users
+  environment when pandas is imported. This will ensure that only one warning is raised and users can
+  easily silence it if necessary.
Contributor:

I recall some discussion we had on this PDEP about having the warning point to a GitHub issue where we could collect feedback on this requirement. If we agree on this concept, I think it should be mentioned here.

Member:

Added

Contributor:

Do you mind adding that the warning will point to the feedback issue?

wirable23 (Jul 4, 2023):

Apologies for the naive question, but I thought the purpose of this PDEP was to collect feedback on pyarrow as a required dependency? I understand visibility may be higher if there's a link to a new issue in a warning in a future version of pandas, but to me, doing it that way (to collect feedback all over again) just seems like we're going to be having this discussion again in 6 months and will end up kicking the can down the road.

I think that issue would only get traction from people who strongly don't want pyarrow. There could be millions of users happy or neutral with the requirement, and we'd only see the 10 people unhappy enough to voice their concerns.

 - Starting in pandas 3.0, the default type inferred for string data will be `ArrowDtype` with `pyarrow.string`
   instead of `object`
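
The import-time warning proposed for pandas 2.2 can be sketched roughly as follows. This is not pandas' actual implementation: the function name `warn_if_missing`, the message text, and the `FEEDBACK_NOTE` placeholder are all assumptions made for illustration, and only the standard library is used.

```python
import importlib.util
import warnings

# Hypothetical placeholder: the PDEP text shown here does not give the issue URL.
FEEDBACK_NOTE = "see the project's feedback issue for details"


def warn_if_missing(module_name: str = "pyarrow") -> bool:
    """Emit one FutureWarning if `module_name` cannot be found.

    Returns True when the warning was emitted, False otherwise.
    """
    if importlib.util.find_spec(module_name) is None:
        warnings.warn(
            f"{module_name} will become a required dependency in a future "
            f"release; {FEEDBACK_NOTE}.",
            FutureWarning,
            stacklevel=2,
        )
        return True
    return False
```

Because the check runs in a single place, a user who wants to silence it needs only one filter, for example `warnings.filterwarnings("ignore", message=".*required dependency.*", category=FutureWarning)` before the import that triggers it.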

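The minimum-version rule in the list above ("the highest version that has been released for at least 2 years") can be read as a small selection over release dates. The sketch below is one possible reading, assuming "2 years" means two calendar years before the day of the bump; the helper names are hypothetical, and the caller supplies the release dates.

```python
from datetime import date
from typing import Dict, Optional


def two_years_before(day: date) -> date:
    """Return the same calendar day two years earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - 2)
    except ValueError:
        # Only reachable for Feb 29 when the target year is not a leap year.
        return day.replace(year=day.year - 2, day=28)


def minimum_supported(releases: Dict[int, date], today: date) -> Optional[int]:
    """Pick the highest major version released on or before the two-year cutoff.

    `releases` maps a major version number to its release date.
    Returns None when no version is old enough.
    """
    cutoff = two_years_before(today)
    eligible = [version for version, released in releases.items() if released <= cutoff]
    return max(eligible) if eligible else None
```

With made-up data (not real PyArrow release dates) such as `{1: date(2020, 1, 1), 2: date(2021, 6, 1), 3: date(2022, 6, 1)}` and a bump on `date(2023, 7, 1)`, the cutoff is 2021-07-01 and the function selects version 2.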