10000 Added pathlib support for several functions by wackywendell · Pull Request #6660 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

Added pathlib support for several functions #6660

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 7, 2016

Conversation

wackywendell
Copy link
Contributor

Added support for pathlib.Path support for several functions, including savetxt, loadtxt, load, savez, savez_compressed, ndfromtxt, recfromtxt, and recfromcsv, along with testing. Fixes #6418.

Note that #6655 also addresses the same issue, and we should coordinate on this; I was a bit slow on making my changes, and @r0fls did some good work on his branch!

This could also use a bit of code review, I imagine.

CC @rgommers, @r0fls.

@wackywendell wackywendell force-pushed the usepath branch 2 times, most recently from a117f3a to 83c0513 Compare November 10, 2015 02:41
@@ -425,7 +429,7 @@ def save(file, arr, allow_pickle=True, fix_imports=True):

Parameters
----------
file : file or str
file : file, str, or pathlib.Path
File or filename to which the data is saved. If file is a file-object,
then the filename is unchanged. If file is a string, a ``.npy``
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could update this to say If file is a string or a pathlib.Path object, a ''.npy'' extension will be appended to the file name if it does not already...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, and done!

@@ -86,10 +86,12 @@ def __dir__(self):
return object.__getattribute__(self, '_obj').keys()


def zipfile_factory(*args, **kwargs):
def zipfile_factory(file, mode='r', *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, adding a positional argument seems like a potential compatibility problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how in this case it would be a problem. Could you be more specific?

Before, it was just passing *args to zipfile.Zipfile(); the first two args to zipfile.Zipfile() are necessarily file and mode, and AFAICT always have been. I'm not sure how this could be a backwards-compatibility problem.

Or are you thinking this might be a future compatibility problem?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By and large, signatures should not be changed without good reason, there is always the possibility that you have missed something nor can you rely on the tests being good enough to catch problems. This is particularly true for public functions, which unfortunately this is. The original was perhaps unfortunate, but there you go. It is best to keep PRs focused. If you feel this change is a useful cleanup, then it should come in as a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify the procedure ;) A new private function, say _zipfile_factory would be defined, and zipfile_factory would be deprecated. That may seem overly picky, but all such changes carry some risk, and when several hundred PRs are present in a release the risk accumulates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that makes sense. I'll work on it - probably by changing it back, because I don't think this is important enough to warrant a deprecation cycle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted the type signature, although I did add an assert. With the previous version, a 0-length *args would cause an error from zipfile.Zipfile; this just changes the exception that would be raised from a TypeError to an AssertionError. Is that ok, or should I also preserve the assertion type?

@charris
Copy link
Member
charris commented Nov 15, 2015

See also #6655, which does much the same.

@wackywendell
Copy link
Contributor Author

See also #6655, which does much the same.

Yes, #6655 adds pathlib.Path support to one function; this adds it to about 7. I could rebase this onto #6655, though, if you'd like... I was waiting to see if #6655 would be merged first, though.

@charris
Copy link
Member
charris commented Nov 15, 2015

The PR list tends to act like a lifo :) I just saw #6655 as I was going down the list.

@@ -87,6 +87,13 @@ def __dir__(self):


def zipfile_factory(*args, **kwargs):
"""Create a zipfile.Zipfile. Allows for Zip64, as well as pathlib.Path
objects."""
assert len(args) >= 1, "zipfile_factory requires a file"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can keep the file argument, it isn't optional in the ZipFile constructor and you would get an error anyway without it.

class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that assert doesn't work when python is run in optimized mode. Just raise an error if you want to do it that way. Personally, I think a TypeError from ZipFile is fine, especially as it is unlikely to ever be seen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, now its zipfile_factory(file, *args, **kwargs), and the code is cleaner.

@@ -476,6 +482,11 @@ def save(file, arr, allow_pickle=True, fix_imports=True):
file = file + '.npy'
fid = open(file, "wb")
own_fid = True
elif _is_pathlib_path(file):
if not file.name.endswith('.npy'):
file = file.parent / (file.name + '.npy')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what does the / do here. I haven't seen that before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, looked it up.

@charris
Copy link
Member
charris commented Nov 16, 2015

I think we are almost there. Could you squash the commits and take a look at the commit message guidelines in doc/source/dev/gitwash/development_workflow.rst. Not sure what to do about the context manager for temporary files, as mentioned there is another PR with one. I'd like to have one available somewhere global, maybe in numpy/testing/utils.py

@wackywendell
Copy link
Contributor Author

I think we are almost there. Could you squash the commits and take a look at the commit message guidelines

Thanks! Once we figure out the context manager thing, I'll squash / rebase / fix the commit messages.

@rherault-pro
Copy link

+++ to @charris we definitively need context managers for temp dirs and temp files that can be closed without removed in a larger context that tests.

@rherault-pro
Copy link

Moreover having functions accepting file descriptor and not only string or file path is a good move in the fact that they will be able to transparently accept in mem (temporary) files.

@charris
Copy link
Member
charris commented Dec 20, 2015

I've committed a temppath context manager in #6866. If you look there, I also refactor a test to use it. You can get at it (and tempdir) by rebasing on master and importing from numpy.testing.

@wackywendell
Copy link
Contributor Author

Great! I'll look into that.

@charris
Copy link
Member
charris commented Jan 12, 2016

@wackywendell Could you finish this up and squash the commits into one with a rewritten commit message ala doc/source/dev/gitwash/development_workflow.rst?

@wackywendell
Copy link
Contributor Author

Hi, sorry - I've been a bit side-tracked in the past month or two; I have a PhD defense in a month, and I'm going through interview stuff now, so this is taking me longer than I planned. I'll try and get it done as soon as I can, though.

@njsmith
Copy link
Member
njsmith commented Apr 6, 2016

Only concern is that it sounds like python-dev is converging on a nicer way to do this that will probably be released in the next 3.4, 3.5 point releases. Maybe we don't care and can always clean this up later once that happens -- I don't have a strong opinion either way.

@njsmith
Copy link
Member
njsmith commented Apr 6, 2016

Probably we should just go for it given that the PR exists :-). I won't merge myself though b/c I'm just thinking about the high-level issue, I haven't actually read the code.

@@ -10,6 +10,10 @@
'integer_types']

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add Path and is_pathlib_path to the __all__ list, then you can import from numpy.compat.

@charris
Copy link
Member
charris commented Apr 6, 2016

I think we are getting close. @njsmith Should be easy to search and fixup. What changes are being proposed?

EDIT: Is pathlib going away?

@njsmith
Copy link
Member
njsmith commented Apr 6, 2016

@charris: it looks like the current trend is towards adding a standard convert-to-filesystem-path protocol sorta along the lines of __index__, so APIs that take paths and want strings would do from operators import fspath and then

def some_file_function(path, ...):
    # force path to be a str
    path = fspath(path)
    ...

Or something like that.

Also probably pathlib support will get added to a bunch of builtins like open, so that for most code no special handling will be needed at all -- you just write

def some_file_function(path, ...):
    with open(path) as f:
        ...

and it just magically works for both str and Path arguments.

@njsmith
Copy link
Member
njsmith commented Apr 6, 2016

But, it's a mailing list discussion, so -- that's where things seem to be going right now, but, check back next week for our thrilling conclusion :-).


class TestPathUsage(TestCase):
# Test that pathlib.Path can be used
@np.testing.dec.skipif(Path is None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, this will now fit on a single line. Same below.

@charris
Copy link
Member
charris commented Apr 7, 2016

Looks good modulo two style nits.

@charris charris merged commit 0c6aa60 into numpy:master Apr 7, 2016
@charris
Copy link
Member
charris commented Apr 7, 2016

Thanks @wackywendell .

@ChrisBarker-NOAA
Copy link
Contributor

Sorry to not look at the code -- but is this going to work in py2 with the pypi pathlib2 object? If so -- great!

@charris
Copy link
Member
charris commented Apr 8, 2016

No, it won't work with pathlib2 because the module name differs. @wackywendell That could probably be fixed.

@ChrisBarker-NOAA
Copy link
Contributor

well, there is a new protocol being proposed -- something like a fspath dunder method. So maybe just wait 'till that gets finalized.

@charris
Copy link
Member
charris commented Apr 8, 2016

@ChrisBarker-NOAA Probably a good idea. There is also tofile and fromfile to worry about, but they are at the C level and so it will be worth waiting to see what the final interface is.

@eric-wieser
Copy link
Member

That protocol has been finalized in PEP 519, and is live in python 3.6

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants
0