8000 [Form][TwigBridge] Handle MoneyType with non UTF-8 charset by ro0NL · Pull Request #25167 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

[Form][TwigBridge] Handle MoneyType with non UTF-8 charset #25167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

[Form][TwigBridge] Handle MoneyType with non UTF-8 charset #25167

wants to merge 1 commit into from

Conversation

ro0NL
Copy link
Contributor
@ro0NL ro0NL commented Nov 26, 2017
Q A
Branch? 2.7
Bug fix? yes
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets #25135
License MIT
Doc PR symfony/symfony-docs#...

Prevent wrong charset conversion in MoneyType when using a charset other then UTF-8.

@nzurita can you confirm it fixes your issue?

Copy link
Contributor Author
@ro0NL ro0NL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in general there are some xpath assertions like [contains(.., "€")] which continue to match against entities.

@@ -23,11 +23,11 @@
<div class="input-group">
{% set append = money_pattern starts with '{{' %}
{% if not append %}
<span class="input-group-addon">{{ money_pattern|replace({ '{{ widget }}':''}) }}</span>
<span class="input-group-addon">{{ money_pattern|replace({ '{{ widget }}':''})|raw }}</span>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now assuming money_pattern is a safe value. I think it's truly intended looking at money_widget.html.php and form_div_layout.html.twig.

8000
@@ -113,9 +113,9 @@ protected static function getPattern($currency)
preg_match('/^([^\s\xc2\xa0]*)[\s\xc2\xa0]*123(?:[,.]0+)?[\s\xc2\xa0]*([^\s\xc2\xa0]*)$/u', $pattern, $matches);

if (!empty($matches[1])) {
self::$patterns[$locale][$currency] = $matches[1].' {{ widget }}';
self::$patterns[$locale][$currency] = htmlentities($matches[1], ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8').' {{ widget }}';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protected getPattern is called privately (self::getPattern) here, so no real BC issue i guess (assuming ppl might override moneytype returning a non-safe value or so).

We could move htmlentities to buildView perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this part of the code know anything about HTML already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess not. Escaping from buildView() might be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or in the template directly (I don't know well this part of the code base)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue is htmlentities vs. htmlspecialchars. We truly need htmlentites as escaping strategy, which is not possible in twig (AFAIK).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with a formatter in the Twig bridge it becomes possible

Copy link
Contributor Author
@ro0NL ro0NL Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

htmlentities formatter seems more something for core, dont you think. So from twig it would be like

{{ money_pattern | convert_encoding('UTF-8', _charset) | e('entities') }}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not, you should try a PR on twig if you want to go down this path.
But without encoding conversion - entities cannot operate if the source encoding varies (or it needs to know about it, so that conversion should be handled there anyway)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be a long road then; we cant use new twig features in 2.7 bridge i guess.

Another path might be to inject kernel charset in moneytype, making it a service again. And go from there.

In general; it might become to much work. We could still settle with the current approach (keep a UTF-8 string; which happens today already).

Let me try a few things this week :)

@@ -62,7 +62,7 @@ protected function assertMatchesXpath($html, $expression, $count = 1)
try {
// Wrap in <root> node so we can load HTML with multiple tags at
// the top level
$dom->loadXML('<root>'.$html.'</root>');
$dom->loadHTML('<!DOCTYPE html><html><body>'.$html.'</body></html>');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using loadHTML enables HTML entities (Entity 'euro' not defined otherwise). It makes sense; we're in fact loading $html.

$this->fail(sprintf(
"Failed asserting that \n\n%s\n\nmatches exactly %s. Matches %s in \n\n%s",
$expression,
1 == $count ? 'once' : $count.' times',
1 == $nodeList->length ? 'once' : $nodeList->length.' times',
// strip away <root> and </root>
substr($dom->saveHTML(), 6, -8)
$html
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatted output not worth the hassle IMHO. As saveHTML includes a doctype declaration either way.

8000
@ro0NL
Copy link
Contributor Author
ro0NL commented Nov 26, 2017

The thing im not sure about is using UTF-8 hardcoded in htmlentites, without really knowing about the application charset. As in twig it's now a safe value.

I believe a missing step is to inject the kernel charset, and convert/encode as needed.

@nzurita
Copy link
nzurita commented Nov 27, 2017

@ro0NL works fine, tested UTF-8, ISO-8859-1 and ISO-8859-15:

imagen
imagen

Thank you

@nicolas-grekas nicolas-grekas added this to the 2.7 milestone Nov 27, 2017
@ro0NL
Copy link
Contributor Author
ro0NL commented Nov 27, 2017

This should do.

  • Assign a html entiy encoded UTF-8 string to template
  • Convert to app charset (e.g. ISO) in template
  • Replace widget placeholder in string
  • Render string raw (as it's already html entity encoded using ENT_QUOTES)

Fixes an inconsistency between bootstrap_3_layout.html.twig vs. money_widget.html.php and form_div_layout.html.twig, as the latter were already rendering raw. But this is a tricky part yes.

convert_encoding might be optimized to skip conversion if in/out charsets are the same, if PHP doenst already do that. I didnt want to add the boilerplate "if" here.

No conversion happens in PHP template; let me know how to get 8000 the app charset here and if we can use iconv.

In general, for using html entities, if think we're pretty much safe. At least, i cant imagine it be worse than today's issue.

cc @nzurita @derrabus

Copy link
Member
@nicolas-grekas nicolas-grekas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(+rebase needed)

@@ -23,11 +23,11 @@
<div class="input-group">
{% set append = money_pattern starts with '{{' %}
{% if not append %}
<span class="input-group-addon">{{ money_pattern|replace({ '{{ widget }}':''}) }}</span>
<span class="input-group-addon">{{ money_pattern|convert_encoding(_charset, 'UTF-8')|replace({ '{{ widget }}':''})|raw }}</span>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really think this should be |htmlentities
we should add the filter to FormExtension

Copy link
Contributor Author
@ro0NL ro0NL Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Will patch soonish.

Also templates are UTF8 encoded by default right? Technically we shouldnt have to do any encoding conversion at all here. Instead the rendered template as a whole should be converted depending on the kernel charset i guess. Let's not go there for now.

Implicitly meaning the linked issue is (technically) not supported? Anyway.. htmlentities will settle :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

templates are UTF8 encoded by default right [...] template as a whole should be converted

That's not how Twig deals with charsets - so templates are ASCII that's the only thing we can tell from here.

@@ -43,7 +43,7 @@ public function buildForm(FormBuilderInterface $builder, array $options)
*/
public function buildView(FormView $view, FormInterface $form, array $options)
{
$view->vars['money_pattern'] = self::getPattern($options['currency']);
$view->vars['money_pattern'] = htmlentities(self::getPattern($options['currency']), ENT_QUOTES | (defined('ENT_SUBSTITUTE') ? ENT_SUBSTITUTE : 0), 'UTF-8');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be removed, buildView does not output html, that's not the correct layer

@nicolas-grekas
Copy link
Member

ping @ro0NL :)

@ro0NL
Copy link
Contributor Author
ro0NL commented Mar 3, 2018

@nicolas-grekas for 4.1? given the new htmlentities filter.

@nicolas-grekas
Copy link
Member

@ro0NL let's propose on 2.7 first, adding the filter since we need it. Can you do it these days?

@ro0NL
Copy link
Contributor Author
ro0NL commented Mar 27, 2018

Closing in favor of #26663

@ro0NL ro0NL closed this Mar 27, 2018
@ro0NL ro0NL deleted the moneytype branch March 27, 2018 16:47
fabpot added a commit that referenced this pull request Mar 29, 2018
This PR was squashed before being merged into the 2.7 branch (closes #26663).

Discussion
----------

[TwigBridge] Fix rendering of currency by MoneyType

| Q             | A
| ------------- | ---
| Branch?       | 2.7
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | todo
| Fixed tickets | #25135
| License       | MIT
| Doc PR        | -

Split from #25167 as suggested by @nicolas-grekas to see if this can be reasonably fixed on 2.7, the right way using this itsy-bitsy new feature.

#25167 still contains some valuable changes regarding tests. Ill continue either one PR depending on the target branch / proposed fix.

Commits
-------

a3a2ff0 [TwigBridge] Fix rendering of currency by MoneyType
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
0