8000 Handle `ValueError` while reading configuration files by `configparser` · Issue #122454 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

Handle ValueError while reading configuration files by configparser #122454

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
picnixz opened this issue Jul 30, 2024 · 12 comments
Closed

Handle ValueError while reading configuration files by configparser #122454

picnixz opened this issue Jul 30, 2024 · 12 comments

Comments

@picnixz
Copy link
Member
picnixz commented Jul 30, 2024

Bug report

Bug description:

In configparser.RawConfigParser.read, when open() raises OSError, the file is skipped. Note that the OSError guard also includes the parsing phase and not just the call to open().

However, open() can raise a ValueError in the following cases:

  • On Linux: The file has embedded NUL bytes.
  • On Linux: The file has surrogate codes. On Windows, this raises an OSError instead.

Note that there is a test for checking when a file cannot be read assuming it has been opened. In particular, to retain compatibility, we should only skip the file if we cannot open it in the first place. Any issue during the parsing (such as an encoding or decoding error) should be propagated (but an OSError during the parsing should be suppressed as it is currently the case).

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Linked PRs

@picnixz picnixz added the type-bug An unexpected behavior, bug, or error label Jul 30, 2024
@picnixz picnixz added 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes labels Jul 30, 2024
@jaraco
Copy link
Member
jaraco commented Jul 30, 2024

The body of the bug description doesn't describe the bug. I had to read through the diff of the PR to understand what the problem is. Can you describe what behavior you encountered under what conditions and what behavior you expected instead?

@picnixz
Copy link
Member Author
picnixz commented Jul 30, 2024

Oups, sorry! I'm actually trying to fix the different calls to open() or os.stat() like functions that can raise a ValueError due to incorrect filenames. I did that for a bunch of modules (linecache, filecmp) after we found #122170.

I personally did not use the module so I was just looking which modules were missing possible ValueError. I think that the description of read() is misleading since it says:

Files that cannot be opened are silently ignored; this is
designed so that you can specify an iterable of potential
configuration file locations (e.g. current directory, user's
home directory, systemwide directory), and all existing
configuration files in the iterable will be read. A single
filename may also be given.

and this is done by suppressing OSError exceptions (in general, that's the error you have when you cannot open a file). Nevertheless, you can also have a ValueError when the filename has embedded NUL bytes, surrogate codes or if the name is just too long, and those files should be considered as "cannot be opened".

@jaraco
Copy link
Member
jaraco commented Jul 30, 2024

Thanks for the explanation.

Now that I think about it, I'm more inclined to think configparser shouldn't be changed.

As you point out, the user can supply invalid filenames, but I wouldn't consider that "files that cannot be opened" in the same way that a missing file or inaccessible file cannot be opened.

That is, an invalid filename is not valid for a "potential configuration file location", because it will never be a viable configuration file location.

Propagating the ValueError in this case is just fine, and preferable, IMO.

Can you think of a scenario where a user would be supplying an invalid filename and would want that error suppressed?

@picnixz
Copy link
Member Author
picnixz commented Jul 30, 2024

Can you think of a scenario where a user would be supplying an invalid filename and would want that error suppressed?

When you supply a too long path on Linux, you'll get an OSError. If you do it on Windows, I think you get a ValueError due to the AC path_t converter which raises a ValueError instead of an OSError at the early stage of the conversion (I know that it's the case for os.stat but I don't know whether it's also the case for open which is _io.open; I think so). EDIT: not the case, both raise OSError (see #122454 (comment)).

So, on Linux, a long path would be skipped but on Windows you would have an error. This behaviour is inconsistent IMO =/

@picnixz
Copy link
Member Author
picnixz commented Jul 30, 2024

Actually, let me just check with the CI/CD. I'll try to see what happens.

@jaraco
Copy link
Member
jaraco commented Jul 30, 2024

Can you think of a scenario where a user would be supplying an invalid filename and would want that error suppressed?

When you supply a too long path on Linux, you'll get an OSError. If you do it on Windows, I think you get a ValueError due to the AC path_t converter which raises a ValueError instead of an OSError at the early stage of the conversion (I know that it's the case for os.stat but I don't know whether it's also the case for open which is _io.open; I think so).

So, on Linux, a long path would be skipped but on Windows you would have an error. This behaviour is inconsistent IMO =/

I confirmed on macOS, a long filename also results in an OSError.

On my Windows system, I get an OSError: [Errno 63] File name too long for files longer than 255.

On windows, I get FileNotFoundError for the surrogate escapes.

That sounds not like a problem with configparser but a problem with open. Maybe the bug title should be "open produces different exceptions based on platform for invalid filenames". It seems to me the solution to such a problems shouldn't be "change every downstream consumer to treat ValueError like OSError", which this bug and the others you mention begins to implement. Maybe the solution should be to have open on Windows raise an OSError if an invalid filename is supplied, providing consistent behavior across all platforms and all callers.

But short of that, if the user is supplying an invalid filename for any given platform, it seems that the preferable behavior would be to raise an exception.

I still don't think I have an answer to the question - do we want configparser to treat invalid filenames as valid but missing? Are there cases where, for example, a user is specifying a very long filename, and that name will be valid on one platform, but not another, and we want that code to fail silently on the platforms where it's invalid? I'm thinking no. In that case, the caller should be adding that protection. In the case of configparser, the error suppression is there for missing or inaccessible files and not invalid ones.

@picnixz
Copy link
Member Author
picnixz commented Jul 30, 2024

Maybe the bug title should be "open produces different exceptions based on platform for invalid filenames".

In #122353 (comment), it was suggested to protect against ValueError for functions that use os calls. But I think I incorrectly assumed that open would actually raise OSError and not ValueError. But according to your experiments (and the CI), only OSError are actually raised by open (so my bad!)

It remains to decide how to handle invalid file names in general, but I think it's better to have a different issue (I'll close the PR as well, sorry for the time I took!)

@picnixz
Copy link
Member Author
picnixz commented Jul 30, 2024

Er... actually... open() does raise a ValueError but on Linux if you pass surrogate escape codes... So the problem is the other way around, invalid file names (except those that are too long) would raise a ValueError on Linux but an OSError on Windows.

For long names, the behaviour is the same on Linux and Windows (namely an OSError).

@jaraco
Copy link
Member
jaraco commented Jul 31, 2024

I'm still of the opinion that the current behavior is perfectly acceptable, that silently suppressing OSError and not suppressing ValueError is the best design for configparser and that it's the caller's responsibility to supply valid values for filenames. It seems to me we're not aware of a single instance where the current behavior fails to meet a user's expectations.

I think it makes sense to explore the possibility that "open producing different exceptions for the same input based on platform" should be considered as its own issue, but I'm strongly opposed to striving to alter all downstream users of open to treat ValueError like an OSError. And in my opinion, the distinction is useful. The former means the inputs were invalid (and would never work) whereas the latter is indicative of a runtime failure.

My recommendation, if you feel otherwise, is to either provide a compelling case that would shift my position on the matter, or rally support from another core dev to your understanding.

@picnixz
Copy link
Member Author
picnixz commented Aug 1, 2024

I'm still of the opinion that the current behavior is perfectly acceptable

Yes, after a nmore careful thinking, I also think it's the best course. I incorrectly assumed that open() behaves as os.stat() in terms of errors but that's not the case. Note that this is actually the only place where I decided to patch it due to the open call but I shouldn't have since it's not an os.* call per-se.

Closing as wontfix.

@picnixz picnixz closed this as not planned Won't fix, can't repro, duplicate, stale Aug 1, 2024
@picnixz picnixz removed type-bug An unexpected behavior, bug, or error 3.12 only security fixes 3.13 bugs and security fixes 3.14 bugs and security fixes labels Aug 1, 2024
@jaraco
Copy link
Member
jaraco commented Aug 11, 2024

Thanks picnixz for the careful consideration. I know it's not fun to put a lot of effort into a contribution only not to make any change, so thanks for being cordial and amenable.

@picnixz
Copy link
Member Author
picnixz commented Aug 12, 2024

I know it's not fun to put a lot of effort into a contribution only not to make any change,

Well, I work in academia: sometimes papers got accepted, sometimes not... So I'm not really bothered by the fact that I needed to put effort into something that is not accepted and I also understand the reasons (in this specific case, I personally feel that I was too hasty in making the change so it's not your fault at all!). I think I also learned things by doing this so it was beneficial (and thank you for pointing out the issues)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants
0