8000 [Routing] Don't needlessly execute strtr's as they are fairly expensive by arjenm · Pull Request #18230 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

[Routing] Don't needlessly execute strtr's as they are fairly expensive #18230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

arjenm
Copy link
Contributor
@arjenm arjenm commented Mar 19, 2016
Q A
Branch? 2.7 (based on Tobion's patch for #18227)
Bug fix? refactoring
New feature? no
BC breaks? no
Deprecations? no
Tests pass? it should
Fixed tickets based on research for ticket #18106
License MIT
Doc PR

Prior to making ticket #18106, I was actually trying to improve the performance of the routing's UrlGenerator. I initially created a fairly complex alternative, but found out a large part of the improved performance could be achieved by a few trivial changes.
In my tests the runtime of 'doGenerate' was spent largely in two 'strtr'-calls. I.e. preventing those from being executed saved a relatively large part of its runtime.

Based on an assumption that it is fairly uncommon to actually need that combined rawurlencode+strtr-decode and it being even more uncommon to have /.. or /. in a generated url, I made those encoding-steps conditional.

When generating the same url 500 times, the runtime of the full generate-call would drop from around 37-45ms to around 17-27ms depending on how complex the specified url is. Even when the rawurlencode is necessary in a url, it is still unlikely that it also contains /.. or /., so the preg_match and strpos should rarely increase the total time.

Note, this is on a system running php 5.6 with xdebug enabled, so the numbers may be a bit inflated.

@Tobion
Copy link
Contributor
Tobion commented Mar 19, 2016

Pleae provide the performance test script. Also it must include cases for the worst case.
As this adds a regex matching to the process I don't think it improves the situation overall.

$url = substr($url, 0, -2).'%2E%2E';
} elseif ('/.' === substr($url, -2)) {
$url = substr($url, 0, -1).'%2E';
if(false !== strpos($url, '/.')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be a good idea. but the rest I don't think so.

@arjenm
Copy link
Contributor Author
arjenm commented Mar 19, 2016

Since the original testing was done with a full url generator, routing file, etc, it was not very easy to extract. I created three seperate tests to attempt to illustrate the difference, the server below is a VM running a php 5.6 without xdebug.

The test for rawurlencode
http://achelois.tweakers.net/~acm/urlrawencode_or_test_first.php

For my test cases:

  • A short url without special characters gains 90%, and with such chars looses about 5%
  • A medium url gains about 80% or looses 10%
  • A long url gains about 70% or looses 10-30%

Obviously, it dependends on both the length of the string and how far to the end the first special character is. If you know of a quicker way to test this than using a regex, that would be great, I couldn't think of one :(

The test for replacing /../ and /./
http://achelois.tweakers.net/~acm/dotsreplace_or_test_first.php

For my test cases:

  • A short url without special characters gains over 90% or looses around 1-3% when there is a /. in it
  • A medium url also gains over 90% and also looses around 3-5%
  • A long url gains over 90% and looses 1-10%

The losses are very variable in this test (ranging from anywhere between 0,5 and 10% for each of the examples on refreshes), so its hard to predict.

And for the query-check
http://achelois.tweakers.net/~acm/queryreplace_or_test_first.php

For my test cases, I created only one:

  • A medium query string gains over 90% and looses around 5%

Conclusion
So the rawurlencode is debatable, it really depends on how likely it is whether there are or aren't special characters in it. The other two seem acceptable losses to me for the rare situations of having /. in a generated url or %2F in a query.

Btw, I also created a completely different variant on PhpGeneratorDumper. In that I basically unrolled the foreach of UrlGenerator into generated php-code. With that, I was able to guarantee in certain scenario's the rawurlencode is never necessary, so there was no runtime-regex required. It also does no foreach, cheaper scheme-validation and so on. Unfortunately, the gain in best-case routes is fairly limited compared to my proposed small set of changes.

If you're interested in it, let me know. Its probably not yet ready for a PR.

@linaori
Copy link
Contributor
linaori commented Mar 20, 2016

Personally I'm interested to see this. I'm currently hitting a small obstacle where I want to cache controller metadata. The problem is that using a the cache component creates a lot of overhead compared to simply requiring a php file where it's either unserialized or generated.

This seems like one of those cases where caching it away as optimized phpcode gives an overall performance increase because it's cached better.

@arjenm
Copy link
Contributor Author
arjenm commented Mar 20, 2016

Ok, have a look at my commit in this branch: https://github.com/arjenm/symfony/tree/faster_url_generator

To enable it, you need to do something like this in your config.yml

parameters:
    router.options.generator_base_class:   Symfony\Component\Routing\Generator\BaseUrlGenerator
    router.options.generator_dumper_class: Symfony\Component\Routing\Generator\Dumper\PhpUnrolledCodeUrlGeneratorDumper

Please note it will generate a class with methods for each route. In my case that's about 310 methods (and somewhere around 350kB of file size) and that works just fine. But there will be scaling limits somewhere, I don't know where the next one will be.
I encountered one already: I initially generated a switch on the route's name to select the routing-method rather than the current hashmap-lookup, but that turned out to be very slow. So much so, it actually made the routing slower for routes at the end of the switch (but faster for those at the start) compared to Symfony's dumper.

Its basically just generating the code of UrlGenerator::doGenerate, but it tries to do as much up-front (like pre-encoding static url-segments and skipping tests for defaults, if there is none configured). All that turned out to be relatively minor in performance gains, compared to simply trying to prevent the strtr()-calls.
But my code does make preventing those calls a lot easier.

@fabpot
Copy link
Member
fabpot commented Mar 21, 2016

Just FYI, URL generation is very commonly the slowest part of a Symfony application; at least, that's what I see when profiling various Symfony apps I have with Blackfire; URL generation is almost always in the top 5 costs.

@arjenm
Copy link
Contributor Author
arjenm commented Mar 21, 2016

That was my reason for an attempt at a faster generator.

If you want to have the most improvement, my full rewrite may be of interest as well. For my test cases it shaved another 3-5ms of the time to generate 500 url's. And it probably has less bad (but different) worst-case performance.
I just don't know where it will hit a performance wall. For instance, is there a (practical) maximum number of routes (and therefore methods on a class) that will be hit by large applications?

@fabpot
Copy link
Member
fabpot commented Mar 21, 2016

@arjenm I think doing this in 2 steps is indeed the best way. Code generation always has some complexity. Also, the performance improvement do not have any BC implication, which eases things a lot. Can you resend a PR for the master the branch? This one seems to be referencing another PR which is not related. Thanks.

@@ -76,6 +76,11 @@ class UrlGenerator implements UrlGeneratorInterface, ConfigurableRequirementsInt
);

/**
* @var string This regexp matches all characters that are not or should not be encoded by rawurlencode (see list in array above).
*/
protected $urlEncodingSkipRegexp = '#[^-.~a-zA-Z0-9_/@:;,=+!*|]#';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could it be private ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stof It could be, but all other properties are protected as well, so I'd defiate from the 'default' in that class.

@arjenm arjenm closed this Mar 21, 2016
@arjenm arjenm deleted the ticket_18106 branch March 21, 2016 21:31
fabpot added a commit that referenced this pull request Mar 31, 2016
…airly expensive (arjenm)

This PR was merged into the 2.7 branch.

Discussion
----------

[Routing] Don't needlessly execute strtr's as they are fairly expensive

| Q             | A
| ------------- | ---
| Branch?       | 2.7
| Bug fix?      | refactor
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | see discussion in #18230
| License       | MIT
| Doc PR        | see #18230

As requested in #18230 this is a new version of the prevention of using strtr's. I've posted some performance-numbers in that PR as well.

Commits
-------

b3da6a1 [Routing] Don't needlessly execute strtr's as they are fairly expensive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
0