8000 Optimize CGI.escapeHTML for ASCII-compatible encodings by k0kubun · Pull Request #1164 · ruby/ruby · GitHub
  • [go: up one dir, main page]

    Skip to content

    Conversation

    @k0kubun
    Copy link
    Member
    @k0kubun k0kubun commented Dec 20, 2015

    As commented in #156 (comment), I rewrote CGI.escapeHTML in C, which is used by ERB::Util#html_escape.
    Since escaping HTML is expensive in rendering a template, I want it to be faster.
    For now, I optimized it only for strings whose encoding is ASCII-compatible.

    With this benchmark https://gist.github.com/k0kubun/b6af6062bc876190e280, it's about 7 times faster than original implementation in escaping html.

    $ ruby bench_escape_html.rb
    Calculating -------------------------------------
                  before    11.448k i/100ms
                   after    31.189k i/100ms
    -------------------------------------------------
                  before    216.403k (± 7.0%) i/s -      2.152M
                   after      1.637M (±10.0%) i/s -     16.125M
    
    Comparison:
                   after:  1637408.5 i/s
                  before:   216403.5 i/s - 7.57x slower
    

    Copy link
    Member

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    this function returns no values.

    Copy link
    Member Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Thank you. Fixed in k0kubun@39cdd83.

    Copy link
    Member

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    the encoding of str is ignored and dest is always ASCII-8BIT.

    Copy link
    Member Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    I changed it to associate original encoding and added test case for it in k0kubun@2162835.

    @knu
    Copy link
    Member
    knu commented Dec 24, 2015

    How much is the final performance gain after fixing the return value not being duped when no replacement takes place?

    @k0kubun
    Copy link
    Member Author
    k0kubun commented Dec 24, 2015

    You can check performance gain with this benchmark https://gist.github.com/k0kubun/8e1c7efb1e29991e1382.

    This is the result of ruby compiled from latest revision b58b970.

    $ ruby bench_escape.rb "'&\"<>"
    Escape: '&"<>
    Calculating -------------------------------------
                  before    15.457k i/100ms
                   after    63.931k i/100ms
    -------------------------------------------------
                  before    199.975k (± 5.1%) i/s -      1.005M
                   after      1.453M (± 5.8%) i/s -      7.288M
    
    Comparison:
                   after:  1453098.6 i/s
                  before:   199974.8 i/s - 7.27x slower
    
    $ ruby bench_escape.rb "hello world"
    Escape: hello world
    Calculating -------------------------------------
                  before    56.973k i/100ms
                   after   100.344k i/100ms
    -------------------------------------------------
                  before      1.474M (± 6.1%) i/s -      7.350M
                   after      4.419M (± 7.3%) i/s -     21.975M
    
    Comparison:
                   after:  4419281.2 i/s
                  before:  1473860.3 i/s - 3.00x slower
    

    @knu
    Copy link
    Member
    knu commented Dec 24, 2015

    Thanks. So, the new implementation is 3x faster even in the worst cases. Great job!

    mrkn pushed a commit to mrkn/ruby that referenced this pull request Apr 17, 2016
    * cgi/escape/escape.c: Optimize CGI.escapeHTML for
      ASCII-compatible encodings.  [Fix rubyGH-1164]
    
    git-svn-id: svn+ssh://svn.ruby-lang.org/ruby/trunk@53220 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
    k0kubun added a commit to k0kubun/haml that referenced this pull request Feb 8, 2017
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    3 participants

    0