8000 ENH: Improve support for pathlib.Path objects in load functions by paulmueller · Pull Request #11348 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: Improve support for pathlib.Path objects in load functions #11348

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Nov 1, 2018

Conversation

paulmueller
Copy link
Contributor

Add support for pathlib.Path when "mmap_mode" is specified
in np.load "Closes #11342".
Add support for pathlib.Path in np.core.records.fromfile.

Add support for pathlib.Path when "mmap_mode" is specified
in np.load "Closes numpy#11342".
Add support for pathlib.Path in np.core.records.fromfile.
@eric-wieser
Copy link
Member
eric-wieser commented Jun 16, 2018

Worth noting that this only matters on python < 3.6

edit: Not true, apologies. Would be good to use os.fspath or a backport going forward though

@paulmueller
Copy link
Contributor Author

@eric-wieser
I have very bad experience with converting pathlib.Path to str consistently across all OSes and Python versions due to UnicodeErrors. Adding a backport of os.fspath might not be such a good idea. I think the "Path.open" functionality of pathlib should be used whenever possible.

@@ -767,6 +767,9 @@ def fromfile(fd, dtype=None, shape=None, offset=0, formats=None,
if isinstance(fd, str):
name = 1
fd = open(fd, 'rb')
elif is_pathlib_path(fd):
8000 name = 1
fd = fd.open('rb')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious, what happens when fd is a unicode string on python2?

@mattip
Copy link
Member
mattip commented Aug 8, 2018

Not sure how using os.fspath would play out when we accept str, pathlib.Path and file for fd. Something like this?

if sys.version_info < (3, 6):
    # code in the current PR
else:
    if <some condition>:
        fd = open(os.fspath(fd), mode)

Backporting os.fspath sounds like a maintenance burden, we would need to update it whenever python fixes bugs.

On another topic, we don't accept unicode (pre-python3) or bytes (post-python3)?

@paulmueller
Copy link
Contributor Author
paulmueller commented Aug 8, 2018

Yes, it seems like this is the current situation. I would suggest using numpy.compat (instead of checking sys.version), e.g.

from numpy.compat import isfileobj, is_pathlib_path

if isfileobj(fd):
   pass  # the file is already open
elif is_pathlib_path(fd):
    # new pathlib code
else:
    # previous code

If there are any unicode errors caused by previous code, then at least numpy is not responsible for resolving them.

[EDIT]: This also applies to the memmap code https://github.com/numpy/numpy/pull/11348/files/c4e5aa10ab630a26019af621ee77d532c99cb076#diff-177e218db0a99e24a32dbb379623d744R755.

@mattip
Copy link
Member
mattip commented Oct 12, 2018

There are conflicts and the release note change is for 1.15, not 1.16

@mattip mattip added component: numpy._core 55 - Needs work 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes labels Oct 12, 2018
@paulmueller
Copy link
Contributor Author

@mattip Could you please give me your opinion on my suggestion in my previous comment? I will have a look at the conflicts once it is clear how to proceed.

@mattip
Copy link
Member
mattip commented Oct 12, 2018

Your suggestion looks good to me. Please go ahead with it.

@eric-wieser
Copy link
Member
eric-wieser commented Oct 12, 2018

Backporting os.fspath sounds like a maintenance burden

The implementation of fspath is pretty straightfward: https://github.com/python/cpython/blob/3.7/Lib/os.py#L1031-L1058. I doubt it will have any bugs that need fixing.

Cutting out this comment in place of a PR at #12157

@paulmueller
Copy link
Contributor Author

I suppose I will wait for #12157 then. At least the test functions might be useful.

@@ -2268,6 +2268,7 @@ def test_save_load_memmap(self):
np.save(path, a)
data = np.load(path, mmap_mode='r')
assert_array_equal(data, a)
del data
Copy link
Member
8000

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for this line? A comment might be nice

Copy link
Contributor Author
@paulmueller paulmueller Oct 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required such that numpy closes the mem-mapped file. Otherwise you will get weird errors on Windows.

@mattip
Copy link
Member
mattip commented Oct 31, 2018

#12157 has been merged

Add support for pathlib.Path when "mmap_mode" is specified
in np.load "Closes numpy#11342".
Add support for pathlib.Path in np.core.records.fromfile.
@paulmueller
Copy link
Contributor Author

It seems like it screwed up and used "pull" instead of "rebase" somewhere along the way of getting the changes from #12157 into this feature branch.
Unless anyone knows how to resolve this mess, I will create a new feature branch and copy-paste the tests and the other changes into a new pull request. Sorry.

@eric-wieser eric-wieser changed the base branch from master to maintenance/1.0.3.x October 31, 2018 23:25
@eric-wieser eric-wieser changed the base branch from maintenance/1.0.3.x to master October 31, 2018 23:26
@eric-wieser
Copy link
Member

Looks like that did the trick. I guess that's a github bug?

if is_pathlib_path(filename):
fp = filename.open(mode+'b')
else:
fp = open(filename, mode+'b')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better as

from numpy.compat import os_fspath

fp = open(os_fspath(filename), mode+'b')

Since this handles PurePath and third-party pathlike objects

if is_pathlib_path(filename):
fp = filename.open('rb')
else:
fp = open(filename, 'rb')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here

@paulmueller
Copy link
Contributor Author

I think I covered everything. I am not sure though, if I edited the release notes correctly (please check). And I don't know why

numpy.core.tests.test_records.TestPathUsage.test_tofile_fromfile

fails on Shippable/Python2.7 (it works on my Ubuntu machine).

@mattip
Copy link
Member
mattip commented Nov 1, 2018

The release note should be on the upcoming 1.16.0-notes.rst, not the already released 1.15.0-notes.rst

-->

@paulmueller
Copy link
Contributor Author

OK, got it. Checks also seem to pass. Ready to merge. I would suggest a squash-merge.

@mattip mattip merged commit cc761fe into numpy:master Nov 1, 2018
@mattip
Copy link
Member
mattip commented Nov 1, 2018

Thanks @paulmueller , @eric-wieser

@mattip
Copy link
Member
mattip commented Nov 1, 2018

I misused the squash merge feature of github so the message got doubled.

E097 a[5] = (0.5,10,'abcde')
a.newbyteorder('<')
with path.open("wb") as fd:
a.tofile(fd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ndarray.tofile() discards information on endianness and most recent ARM features switchable endianness (bi-endian)

I wonder if this is contributing to #12330.

Certainly this test has just started failing stochastically on shippable / ARM for 2.7 / 3.7, so it may not have been stable in the first place given how recently this was merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't the problem -- I'm preparing the patch now in the linked PR, which is related to testing precision and np.empty() as far as I can tell.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
55 - Needs work 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes component: numpy._core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0