SyntaxWarning: invalid decimal literal #114524
Comments
I don't think calling this "nonsense" is helping your argument. There are valid reasons in the new parser for why this is done, and I'm not sure how difficult it would be to change. cc @pablogsal |
If there is no invalid literal in the code, then reporting an invalid literal is nonsense:
[ord(x)>>5for x in h]
This is perfectly valid, and there are no invalid literals. And as I mentioned, even for the cases where I assume it was meant:
2for = 15
it is still nonsense, as it is not an invalid decimal literal but an invalid identifier. Another case I can imagine is:
a = 2else
But that would not be an invalid decimal literal either; it would be an unexpected lexical element. Both of these cases already raise SyntaxError, so I also don't understand this warning in the first place:
>>> 2for = 15
<stdin>:1: SyntaxWarning: invalid decimal literal
File "<stdin>", line 1
2for = 15
^^^
SyntaxError: invalid syntax
>>> a = 2else
<stdin>:1: SyntaxWarning: invalid decimal literal
File "<stdin>", line 1
a = 2else
^^^^
SyntaxError: invalid syntax
I assume this is a badly written check for decimal numbers containing only digits. |
The issue is also present in different bases:
>>> 0x0if True else 1
<stdin>:1: SyntaxWarning: invalid hexadecimal literal
0
>>> 0if True else 1
<stdin>:1: SyntaxWarning: invalid decimal literal
0
>>> [ord("a")>>0o5for x in [1,2,3]]
<stdin>:1: SyntaxWarning: invalid octal literal
[3, 3, 3] |
There are people who think it a bug that CPython allows such nonsense as |
I don't want to come across as offensive, but this makes no sense. I wrote "as way back as C or even longer". It has no relation to C at all; it is just a temporal measure.
What I am describing is industry standard and predictable behavior. |
For some history, see #87999, along with this thread on the python-dev mailing list. |
What's done is done. I am against reverting it.
Lexing comes first. According to the current lexing rules, Keeping track of it is a burden on the tokenizer and parser, as well as on their implementors.
However, I encourage you to fork Python and implement your idea. If that proves to be elegant and works well, people may reconsider your idea. |
No, lexical analysis is greedy from left to right. It is definitely I am not sure if there is anything to implement here; I would just drop the totally misleading warning that serves no purpose. Even this: [0x1for x in (1,2)] is perfectly valid and unambiguous: |
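One way to check how CPython actually reads that last expression is to evaluate it while recording warnings. This is only a minimal sketch: the helper name `eval_with_warnings` is made up for illustration, and the warning category, or whether a warning is emitted at all, depends on the Python version.

```python
import warnings


def eval_with_warnings(source):
    """Evaluate an expression; return (value, list of warning descriptions)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including categories hidden by default.
        warnings.simplefilter("always")
        # The tokenizer runs during compile(), so its warnings are recorded here.
        code = compile(source, "<example>", "eval")
        value = eval(code)
    notes = [f"{w.category.__name__}: {w.message}" for w in caught]
    return value, notes


value, notes = eval_with_warnings("[0x1for x in (1,2)]")
print(value)  # [31]
for note in notes:
    print(note)
```

The value is `[31]` rather than a two-element list: the greedy lexer consumes `0x1f` as the hexadecimal literal and leaves `or`, so the expression is a list display containing `0x1f or x in (1,2)`.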
What confuses me here is what the discussion is even about. This is well-understood behavior across basically all programming languages. I mentioned these things regularly in compiler construction courses. |
I believe that the warnings are primarily intended for, and very helpful to beginners. |
I would be really interested in what data you are basing that on. Does it even affect any beginners? A good IDE will show the correct separation so that it is easily seen, a bad IDE will show this as a syntax error (even though it is not), and most beginners don't write such dense code anyway. This really sounds like fixing something that is not broken. Based on my anecdotal experience, I have never seen anybody hit this issue in around 10 years of teaching various courses, from basic programming in C to compiler construction. |
I'm not quite sure what you're asking here. My point was that this is not a bug or an artifact of the parser implementation - it's a deliberate design decision aimed at human readers rather than parsers.
The purpose of the
To a parser, sure, but I'd argue that readability for human parsers should be the primary driver. For this issue, I think we should close (since it was reported as a bug, but there's no bug here). @exander77 If you want to argue for a change in direction, that's probably a discussion that would be more fruitful on https://discuss.python.org. That could then be turned into a feature request here if there's consensus that a change is needed. |
It is only valid to a parser because it was made to recognize particular ambiguous syntax. For example it has special code to accept |
I am not sure that reasoning that the change is deliberate means it is not a bug, as the The provided sources actually make a tonne of arguments against it (and not really any for it):
I am not sure how Python development is done, but this looks to me like serhiy-storchaka decided to just do it. Even stdlib broke after the change.
Was any analysis even done? I assume that this will be changed from a warning to a syntax error, so we are basically breaking a tonne of legacy code for no reason? To quote an already made point: I am really interested in how this was approved into the code when the prevalent responses were negative. |
Closing as suggested by @mdickinson. |
Bug description:
In the last two years, Python started issuing a superfluous, misleading warning for perfectly valid code.
Examples:
In the first example, 5for is not an invalid decimal literal; it is the decimal literal 5 followed by the keyword for, which is a perfectly valid sequence of lexical elements. The same applies to the second example, 2else.
An invalid decimal literal would be using a decimal literal that starts with a number:
Where 2for is used as the name of a variable. I would personally call it an invalid identifier (so even the official use is confusing).
But the cases above are not cases of an invalid literal, they are correctly false. This worked due to how lexical analyzers are constructed as way back as C or even longer. Why did Python start issuing this misleading warning out of the blue?
Also note that b'\x03'if doesn't produce any warning.
Can this nonsense be suppressed?
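For source that is compiled at run time (for example, the output of a minimizer), a warnings filter that matches on the message text is one possible way to silence it. The sketch below is an assumption-laden illustration, not an official recommendation: `compile_quietly` is a hypothetical helper, and the message pattern is taken from the warning texts quoted in this thread. A filter installed inside a file cannot affect the compilation of that same file, because the warning is raised before any of its code runs.

```python
import warnings


def compile_quietly(source, filename="<minified>"):
    """Compile an expression while ignoring 'invalid ... literal' warnings."""
    with warnings.catch_warnings():
        # The message argument is a regex matched against the start of the
        # warning text; it covers the decimal, hexadecimal and octal variants
        # shown in this thread, whatever category the warning has.
        warnings.filterwarnings("ignore", message=r"invalid \w+ literal")
        return compile(source, filename, "eval")


code = compile_quietly("0if True else 1")
print(eval(code))  # 0
```

The filter is scoped to the `with` block, so warnings raised elsewhere in the program are unaffected.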
I would note that 10 years ago I wrote a code obfuscator/minimizer and a deobfuscator/deminimizer that decided when whitespace needs to be introduced between two lexical elements, based on the fact that you don't need to introduce it if joining two adjacent lexical elements doesn't produce a new lexical element.
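The whitespace rule described above can be sketched with a toy greedy lexer. Everything here is an assumption made for illustration: `TOKEN_RE`, `lex` and `needs_space` are hypothetical names, the token set is deliberately tiny, and this is neither CPython's tokenizer nor the original minimizer.

```python
import re

# A tiny "maximal munch" lexer: at each position, take the longest match
# that the first applicable token class allows.
TOKEN_RE = re.compile(
    r"""
      (?P<number> 0[xX][0-9a-fA-F]+ | 0[oO][0-7]+ | 0[bB][01]+ | \d+ )
    | (?P<name>   [A-Za-z_][A-Za-z_0-9]* )
    | (?P<op>     >> | << | [-+*/%=<>()\[\],:] )
    | (?P<ws>     \s+ )
    """,
    re.VERBOSE,
)


def lex(source):
    """Split source into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def needs_space(left, right):
    """True if gluing the two fragments together changes the token stream."""
    return lex(left + right) != lex(left) + lex(right)


print(needs_space("5", "for"))    # False: still the number 5, then the name for
print(needs_space("x", "for"))    # True: the two names merge into xfor
print(needs_space("0x1", "for"))  # True: re-lexes as 0x1f followed by or
```

Under this rule `5for` can be emitted without a space, while `0x1for` cannot, since the hexadecimal literal greedily absorbs the `f`.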
CPython versions tested on:
3.11
Operating systems tested on:
Linux