-
-
Notifications
You must be signed in to change notification settings - Fork 11k
Added pathlib support for several functions #6660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a117f3a
to
83c0513
Compare
@@ -425,7 +429,7 @@ def save(file, arr, allow_pickle=True, fix_imports=True): | |||
|
|||
Parameters | |||
---------- | |||
file : file or str | |||
file : file, str, or pathlib.Path | |||
File or filename to which the data is saved. If file is a file-object, | |||
then the filename is unchanged. If file is a string, a ``.npy`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could update this to say If file is a string or a pathlib.Path object, a ''.npy'' extension will be appended to the file name if it does not already...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, and done!
83c0513
to
b7ea047
Compare
@@ -86,10 +86,12 @@ def __dir__(self): | |||
return object.__getattribute__(self, '_obj').keys() | |||
|
|||
|
|||
def zipfile_factory(*args, **kwargs): | |||
def zipfile_factory(file, mode='r', *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, adding a positional argument seems like a potential compatibility problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how in this case it would be a problem. Could you be more specific?
Before, it was just passing *args
to zipfile.Zipfile()
; the first two args to zipfile.Zipfile()
are necessarily file
and mode
, and AFAICT always have been. I'm not sure how this could be a backwards-compatibility problem.
Or are you thinking this might be a future compatibility problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By and large, signatures should not be changed without good reason, there is always the possibility that you have missed something nor can you rely on the tests being good enough to catch problems. This is particularly true for public functions, which unfortunately this is. The original was perhaps unfortunate, but there you go. It is best to keep PRs focused. If you feel this change is a useful cleanup, then it should come in as a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify the procedure ;) A new private function, say _zipfile_factory
would be defined, and zipfile_factory
would be deprecated. That may seem overly picky, but all such changes carry some risk, and when several hundred PRs are present in a release the risk accumulates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that makes sense. I'll work on it - probably by changing it back, because I don't think this is important enough to warrant a deprecation cycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted the type signature, although I did add an assert
. With the previous version, a 0-length *args
would cause an error from zipfile.Zipfile
; this just changes the exception that would be raised from a TypeError
to an AssertionError
. Is that ok, or should I also preserve the assertion type?
See also #6655, which does much the same. |
The PR list tends to act like a lifo :) I just saw #6655 as I was going down the list. |
d8da42a
to
0aac04d
Compare
@@ -87,6 +87,13 @@ def __dir__(self): | |||
|
|||
|
|||
def zipfile_factory(*args, **kwargs): | |||
"""Create a zipfile.Zipfile. Allows for Zip64, as well as pathlib.Path | |||
objects.""" | |||
assert len(args) >= 1, "zipfile_factory requires a file" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can keep the file argument, it isn't optional in the ZipFile constructor and you would get an error anyway without it.
class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that assert
doesn't work when python is run in optimized mode. Just raise an error if you want to do it that way. Personally, I think a TypeError
from ZipFile
is fine, especially as it is unlikely to ever be seen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, now its zipfile_factory(file, *args, **kwargs)
, and the code is cleaner.
0aac04d
to
7622fa9
Compare
@@ -476,6 +482,11 @@ def save(file, arr, allow_pickle=True, fix_imports=True): | |||
file = file + '.npy' | |||
fid = open(file, "wb") | |||
own_fid = True | |||
elif _is_pathlib_path(file): | |||
if not file.name.endswith('.npy'): | |||
file = file.parent / (file.name + '.npy') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what does the /
do here. I haven't seen that before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, looked it up.
I think we are almost there. Could you squash the commits and take a look at the commit message guidelines in |
Thanks! Once we figure out the context manager thing, I'll squash / rebase / fix the commit messages. |
+++ to @charris we definitively need context managers for temp dirs and temp files that can be closed without removed in a larger context that tests. |
Moreover having functions accepting file descriptor and not only string or file path is a good move in the fact that they will be able to transparently accept in mem (temporary) files. |
I've committed a |
Great! I'll look into that. |
@wackywendell Could you finish this up and squash the commits into one with a rewritten commit message ala |
Hi, sorry - I've been a bit side-tracked in the past month or two; I have a PhD defense in a month, and I'm going through interview stuff now, so this is taking me longer than I planned. I'll try and get it done as soon as I can, though. |
Only concern is that it sounds like python-dev is converging on a nicer way to do this that will probably be released in the next 3.4, 3.5 point releases. Maybe we don't care and can always clean this up later once that happens -- I don't have a strong opinion either way. |
Probably we should just go for it given that the PR exists :-). I won't merge myself though b/c I'm just thinking about the high-level issue, I haven't actually read the code. |
@@ -10,6 +10,10 @@ | |||
'integer_types'] | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add Path
and is_pathlib_path
to the __all__
list, then you can import from numpy.compat
.
I think we are getting close. @njsmith Should be easy to search and fixup. What changes are being proposed? EDIT: Is pathlib going away? |
@charris: it looks like the current trend is towards adding a standard convert-to-filesystem-path protocol sorta along the lines of def some_file_function(path, ...):
# force path to be a str
path = fspath(path)
... Or something like that. Also probably pathlib support will get added to a bunch of builtins like
and it just magically works for both str and Path arguments. |
But, it's a mailing list discussion, so -- that's where things seem to be going right now, but, check back next week for our thrilling conclusion :-). |
|
||
class TestPathUsage(TestCase): | ||
# Test that pathlib.Path can be used | ||
@np.testing.dec.skipif(Path is None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, this will now fit on a single line. Same below.
Looks good modulo two style nits. |
Thanks @wackywendell . |
Sorry to not look at the code -- but is this going to work in py2 with the pypi pathlib2 object? If so -- great! |
No, it won't work with |
well, there is a new protocol being proposed -- something like a fspath dunder method. So maybe just wait 'till that gets finalized. |
@ChrisBarker-NOAA Probably a good idea. There is also |
That protocol has been finalized in PEP 519, and is live in python 3.6 |
Added support for pathlib.Path support for several functions, including
savetxt
,loadtxt
,load
,savez
,savez_compressed
,ndfromtxt
,recfromtxt
, andrecfromcsv
, along with testing. Fixes #6418.Note that #6655 also addresses the same issue, and we should coordinate on this; I was a bit slow on making my changes, and @r0fls did some good work on his branch!
This could also use a bit of code review, I imagine.
CC @rgommers, @r0fls.