8000 ExpressionLanguage unable to parse decimal numbers without leading zeros · Issue #44014 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

ExpressionLanguage unable to parse decimal numbers without leading zeros #44014

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
epixian opened this issue Nov 11, 2021 · 6 comments
Closed

Comments

@epixian
Copy link
epixian commented Nov 11, 2021

Symfony version(s) affected

5.3.7

Description

ExpressionLanguage is unable to parse decimal numbers without leading zeros.

Invalid expressions:
"-.1"
"0-.1"
"0+.1"
".1"

Valid expressions:
"-0.1"
"0-0.1"
"0+0.1"

How to reproduce

>>> $el = new ExpressionLanguage;
=> Symfony\Component\ExpressionLanguage {#4544}
>>> $el->parse('-0.1', [])->getNodes()->dump();
=> "(- 0.1)"
>>> $el->parse('-.1', [])->getNodes()->dump();
Symfony\Component\ExpressionLanguage\SyntaxError with message 'Unexpected token "punctuation" of value "." around position 2 for expression `-.1`.'
>>> $el->parse('0-.1', [])->getNodes()->dump()
Symfony\Component\ExpressionLanguage\SyntaxError with message 'Unexpected token "punctuation" of value "." around position 3 for expression `0-.1`.'
>>> $el->parse('0+.1', [])->getNodes()->dump()
Symfony\Component\ExpressionLanguage\SyntaxError with message 'Unexpected token "punctuation" of value "." around position 3 for expression `0+.1`.'
>>> $el->parse('0+0.1', [])->getNodes()->dump()
=> "(0 + 0.1)"
>>> $el->parse('.1', [])->getNodes()->dump()
Symfony\Component\ExpressionLanguage\SyntaxError with message 'Unexpected token "punctuation" of value "." around position 1 for expression `.1`.'

Possible Solution

No response

Additional Context

No response

@epixian epixian added the Bug label Nov 11, 2021
@derrabus derrabus changed the title ExpressionLanguage unable to parse decimal numbers without leading zeros. ExpressionLanguage unable to parse decimal numbers without leading zeros Nov 11, 2021
@fancyweb
Copy link
Contributor

Status: reviewed

I looked at this. IMO the problem is on the lexer side. The regex used to detect numbers is not in sync with how PHP can currently cast strings to floats. I recomputed the following regex from https://github.com/php/php-src/blob/ddaf64b56c88f0ae223b1aca25293dd7fec77fc0/Zend/zend_language_scanner.l#L1359:
/((?:[0-9]+(?:_[0-9]+)*|(?:(?:[0-9]+(?:_[0-9]+)*)?\.[0-9]+(?:_[0-9]+)*|[0-9]+(?:_[0-9]+)*\.(?:[0-9]+(?:_[0-9]+)*)?))[eE][+-]?[0-9]+(?:_[0-9]+)*|(?:(?:[0-9]+(?:_[0-9]+)*)?\.[0-9]+(?:_[0-9]+)*|[0-9]+(?:_[0-9]+)*\.(?:[0-9]+(?:_[0-9]+)*)?)|[0-9]+(?:_[0-9]+)*)/A

It seems to fix everything and also other cases such as having underscores in the number.

@nicolas-grekas @derrabus Do we consider this a bugfix on 4.4 and craft the regex pattern depending of the PHP version (the underscores were not allowed before 7.4) or do we consider supporting all those new cases as a feature on 6.1?

@nicolas-grekas
Copy link
Member

I consider this a feature, there's no bug here, the syntax is clear: the leading zero is required :)
Why should we support with no leading zero in the first place?

@stof
Copy link
Member
stof commented Nov 15, 2021

The ExpressionLanguage lexer is responsible for supporting the ExpressionLanguage syntax, not the PHP one. So there is indeed no bug here. This is more a feature request.

@stof stof removed the Bug label Nov 15, 2021
@xabbuh xabbuh added the Feature label Nov 15, 2021
@epixian
Copy link
Author
epixian commented Nov 15, 2021

Thank you for looking into this and for the discussion.

the syntax is clear: the leading zero is required :) Why should we support with no leading zero in the first place?

I'm not sure that the syntax is clear. The documentation indeed lists the various supported literals, including "numbers - e.g. 103" but without any caveats that says that these literals are not interpreted/treated the same as PHP. (Perhaps this is mentioned in the code, but that's not the same and I had not gone that deep yet.)

The main argument for supporting no leading zeros is that it seems odd not to. .1 for example is an entirely valid numeric value. On its own or as part of a simple arithmetic expression directly inputted to PHP, it would be (correctly) interpreted as such.

>>> .1
=> 0.1
>>> 1 + .1
=> 1.1

Additionally I think developers would generally expect similar (if not exact) behavior, and as the main reason we started using ExpressionLanguage was to validate/parse user input, not to support an otherwise valid input requires a need to pre-validate/parse the entire input just to insert the missing leading zeros, the very task we wanted to avoid.

In the meantime, should we update the documentation to reflect this? I would be happy to submit a PR over there.

@stof
Copy link
Member
stof commented Nov 15, 2021

Well, the ExpressionLanguage lexer was originally created based on the Twig lexer (the syntax to access properties and methods is inspired by Twig, with the magic of twig_get_attribute() removed to allow compiling the code to a runtime-free expression). And the supported number literals are the same in ExpressionLanguage and Twig. Both require a leading digit.

@fancyweb
Copy link
Contributor

I opened #44073 since I had already made the "work". Let's see where it goes there 😄

nicolas-grekas added a commit that referenced this issue Feb 24, 2022
…scores and decimals with no leading zero (fancyweb)

This PR was merged into the 6.1 branch.

Discussion
----------

[ExpressionLanguage] Support lexing numbers with underscores and decimals with no leading zero

| Q             | A
| ------------- | ---
| Branch?       | 6.1
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | #44014
| License       | MIT
| Doc PR        | -

I understand that the supported Expression Language syntax is, and can be different than the PHP syntax, however, like `@epixian`, I would have expected `.123` to work since it's a common notation.
Supporting the numeric literal separator `_` looks good to me for the readability (the expression language is still code IMO). As most of us are PHP devs, we know this syntax.
Supporting those new syntaxes doesn't really bring more complexity in the code and it allows us to realign with the PHP behavior (thus having a guide) which is not something mandatory but something a random user would expect I think, that's why it improves the DX IMHO.
We could make the change in Twig too for the same reasons.
We could also support the hexadecimal, binary and octal syntaxes but let's wait for someone that actually needs it 😃

Commits
-------

bb56b2e [ExpressionLanguage] Support lexing numbers with underscores and decimals with no leading zero
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants
0