8000 datetime: pure Python implementation of `fromisoformat()` handles times with trailing spaces inconsistently with the C extension · Issue #130959 · python/cpython · GitHub
[go: up one dir, main page]

Skip to content

datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently with the C extension #130959

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mgorny opened this issue Mar 7, 2025 · 4 comments
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mgorny
Copy link
Contributor
mgorny commented Mar 7, 2025

Bug report

Bug description:

Discovered primarily because PyPy uses the Python version of datetime from the stdlib, and django.utils.dateparse.parse_datetime() supports wider range of values than the C version of datetime.datetime.fromisoformat(), and falls back to their own parser.

>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400")
datetime.datetime(2012, 4, 23, 10, 20, 30, 400000)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400 ")
datetime.datetime(2012, 4, 23, 10, 20, 30, 40000)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400  ")
datetime.datetime(2012, 4, 23, 10, 20, 30, 4000)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400   ")
datetime.datetime(2012, 4, 23, 10, 20, 30, 400)
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400    ")
Traceback (most recent call last):
  File "<python-input-9>", line 1, in <module>
    datetime.datetime.fromisoformat("2012-04-23T10:20:30.400    ")
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mgorny/git/cpython/Lib/_pydatetime.py", line 1949, in fromisoformat
    raise ValueError(
        f'Invalid isoformat string: {date_string!r}') from None
ValueError: Invalid isoformat string: '2012-04-23T10:20:30.400    '
>>> datetime.datetime.fromisoformat("2012-04-23T10:20:30.400 +02:30")
datetime.datetime(2012, 4, 23, 10, 20, 30, 40000, tzinfo=datetime.timezone(datetime.timedelta(seconds=9000)))

For comparison, the C extension rejects all variants containing spaces, causing Django to use its own parser.

PyPy bug report: pypy/pypy#5240

I'm going to try preparing a patch.

CPython versions tested on:

3.11, 3.13, CPython main branch

Operating systems tested on:

Linux

Linked PRs

@mgorny mgorny added the type-bug An unexpected behavior, bug, or error label Mar 7, 2025
@mgorny mgorny changed the title datetime: pure Python implementation of fromisoformat() mishandles times with trailing spaces datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently Mar 7, 2025
@mgorny mgorny changed the title datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently datetime: pure Python implementation of fromisoformat() handles times with trailing spaces inconsistently with the C extension Mar 7, 2025
@mgorny
Copy link
Contributor Author
mgorny commented Mar 7, 2025

Hmm, another difference is that when giving 6 digits of microseconds, C implementation accepts them but Python implementation does not:

>>> datetime.datetime.fromisoformat("2021-12-20T05:05:05.600000 +02:30")
datetime.datetime(2021, 12, 20, 5, 5, 5, 600000, tzinfo=datetime.timezone(datetime.timedelta(seconds=9000)))
>>> _pydatetime.datetime.fromisoformat("2021-12-20T05:05:05.600000 +02:30")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mgorny/miniforge3/lib/python3.12/_pydatetime.py", line 1874, in fromisoformat
    raise ValueError(
ValueError: Invalid isoformat string: '2021-12-20T05:05:05.600000 +02:30'

mgorny added a commit to mgorny/cpython that referenced this issue Mar 7, 2025
…isoformat()`

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).
@ZeroIntensity ZeroIntensity added stdlib Python modules in the Lib dir extension-modules C modules in the Modules dir labels Mar 8, 2025
@pganssle
Copy link
Member
pganssle commented Mar 8, 2025

Hmm, another difference is that when giving 6 digits of microseconds, C implementation accepts them but Python implementation does not:

I think the spec doesn't allow for whitespace, right? If so, we should do whatever the spec says. It's slightly more likely to break things if we make changes to the C implementation than the Python im 8000 plementation (since the vast majority of people are using the C implementation).

I think we can fix the issue with the Python implementation and backport the changes to 3.12+, since right now it just gives the wrong answer, and it's better to start raising exceptions here. If we reject the version with a space from the C implementation, I think we would want that to target only 3.14+, because that will take something that currently parses and gives the "correct" answer and start raising exceptions. We don't have to go as far as deprecating the behavior, since the documented scope of the function says that this should be rejected, but I don't think on net it serves users to make that kind of change in a bugfix version.

@mgorny
Copy link
Contributor Author
mgorny commented Mar 8, 2025

I think the spec doesn't allow for whitespace, right?

Yes, the standard doesn't permit whitespace, and it's not listed in "exceptions" in the documentation.

vstinner pushed a commit that referenced this issue Mar 11, 2025
…mat()` (#130962)

* gh-130959: Reject whitespace in fractions, in pure Python `fromisoformat()`

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).

* Add the news entry

* Use a different example to fix Sphinx lint

* Apply suggestions from code review

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>

* Try fixing `:func:` ref.

---------

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>
vstinner pushed a commit to vstinner/cpython that referenced this issue Mar 11, 2025
…isoformat()` (python#130962)

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>
(cherry picked from commit 33494b4)
vstinner added a commit that referenced this issue Mar 11, 2025
…misoformat()` (#130962) (#131076)

gh-130959: Reject whitespace in fractions, in pure Python `fromisoformat()` (#130962)

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>
(cherry picked from commit 33494b4)

Co-authored-by: Michał Górny <mgorny@gentoo.org>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Mar 11, 2025
…n `fromisoformat()` (pythonGH-130962) (pythonGH-131076)

pythongh-130959: Reject whitespace in fractions, in pure Python `fromisoformat()` (pythonGH-130962)

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>
(cherry picked from commit 33494b4)
(cherry picked from commit 27fd328)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Michał Górny <mgorny@gentoo.org>
vstinner added a commit that referenced this issue Mar 11, 2025
…mat()` (GH-130962) (GH-131076) (#131086)

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).

(cherry picked from commit 33494b4)
(cherry picked from commit 27fd328)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Michał Górny <mgorny@gentoo.org>
@StanFromIreland
Copy link
Contributor

Can this be closed? PR and backports have been merged.

@picnixz picnixz closed this as completed Mar 15, 2025
seehwan pushed a commit to seehwan/cpython that referenced this issue Apr 16, 2025
…isoformat()` (python#130962)

* pythongh-130959: Reject whitespace in fractions, in pure Python `fromisoformat()`

Fix the pure Python implementation of `fromisoformat()` to reject any
non-digit characters, including whitespace, in the fractional part
of time specification.  This makes the behavior consistent with the C
implementation, and prevents incorrect parsing of these fractions
(e.g. `.400 ` would be misinterpreted as `.04`).

* Add the news entry

* Use a different example to fix Sphinx lint

* Apply suggestions from code review

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>

* Try fixing `:func:` ref.

---------

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Co-authored-by: Paul Ganssle <1377457+pganssle@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Archived in project
Development

No branches or pull requests

5 participants
0