-
-
Notifications
You must be signed in to change notification settings - Fork 9.6k
[HtmlSanitizer] Introduce HtmlSanitizer component #44681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
67e23f0
to
ba2ae30
Compare
Hi. First of all, thank you for you effort, this is really cool idea! Second: I'm not a Symony team member, so feel free to ignore me. My concern is that configuring will create a bunch of temporary objects. Maybe it's better to make a separate builder class, something like: <?php
$config = (new HtmlSanitizerConfigBuilder())
->allowElement('div')
->allowElement('a', ['title'])
->getConfig()
;
$sanitizer = new HtmlSanitizer($config); And call constructor only in the |
src/Symfony/Component/HtmlSanitizer/Parser/Exception/ParsingFailedException.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/TextSanitizer/UrlSanitizer.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Parser/MastermindsParser.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Visitor/AttributeSanitizer/AttributeSanitizerInterface.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/TextSanitizer/UrlSanitizer.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Parser/Exception/ParsingFailedException.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Parser/MastermindsParser.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Tests/HtmlSanitizerStandardTest.php
Outdated
Show resolved
Hide resolved
@xorik using a builder for the config could be an alternative indeed, but I think there is an advantage of using an immutable object and withers: the ability to store a "standard" configuration as a service and use it as a prototype to configure in your application. Using a builder would make this more complex while introducing a state in the service, which is IMO not a good idea. Also, having a few more instantiated objects won't have any measurable performance impact, so I think we should stick to an immutable configuration object (immutability was proposed by @stof and @nicolas-grekas and I think that's a great idea). @OskarStark I'm reviewing all your feedback (thanks for it!), I don't think we put final keywords on tests classes anywhere or did we? I feel like this is unnecessary as tests are excluded from the BC promise anyway. Applying the other changes now :) |
f5e7674
to
9967514
Compare
bab4831
to
00d7e22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some random notes :)
src/Symfony/Component/HtmlSanitizer/Visitor/AttributeSanitizer/AHrefSanitizer.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Visitor/AttributeSanitizer/MediaSrcSanitizer.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Visitor/Node/DocumentNode.php
Outdated
Show resolved
Hide resolved
src/Symfony/Component/HtmlSanitizer/Visitor/Node/BlockedNode.php
Outdated
Show resolved
Hide resolved
I updated the PR for requested changes. |
It'd be nice to be able to force an attribute on an element, eg rel=nofollow on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some missing return types
General question, is it planned to support only specific browser versions ? I'm asking this because So it could be legitimately ignored because it's an outdated browser but library consumers that still needs to support those browsers (and they exist) will probably require the package and won't realise it actually does not cover their needs by design. If that's the plan I would recommend to be more upfront about the supported browsers and versions, whether it's phpdoc or doc itself. |
I think the aim is to be as safe as possible, ie. ensuring we remove all sources of XSS for all browsers. You raise an important question though: I'm the one who originally explored sources of potential XSS evasions, and I may have missed some. Your example is a good one: I'm going to explore https://cheatsheetseries.owasp.org/cheatsheets/XSS_Filter_Evasion_Cheat_Sheet.html in more details (thanks for the link!) but I think if people reviewing this PR could also have a look at potentially missed XSS evasion strategies, that's be great! |
@tigitz I added a test and a sanitization for your case FYI. @nicolas-grekas I just added the ability to call
I guess one could easily call |
Understood. I guess you evaluate the performance penalty of doing unnecessary "outdated browsers" sanitation checks in a "modern browsers" context to be marginal ?
Sure! For completeness, I found the given link through test suites of other open-source html sanitization libs. Could be useful to increase case coverage: |
I pushed more cases coming from https://cheatsheetseries.owasp.org/cheatsheets/XSS_Filter_Evasion_Cheat_Sheet.html#bgsound (thanks @tigitz !) and I refactored the default attribute sanitizer to check all elements for URL attributes ( |
src/Symfony/Component/HtmlSanitizer/Parser/MastermindsParser.php
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's document list arrays a bit better.
I fixed all reviews :) |
src/Symfony/Component/HtmlSanitizer/Parser/MastermindsParser.php
Outdated
Show resolved
Hide resolved
Updated |
src/Symfony/Component/HtmlSanitizer/TextSanitizer/UrlSanitizer.php
Outdated
Show resolved
Hide resolved
f069b06
to
72381ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the component. I guess you want to proceed with FWB integration in another PR?
@nicolas-grekas I'm working on FWB integration but we could indeed open a new PR for that. I don't mind either way |
72381ee
to
199e405
Compare
Let's open a different PR, it'll ease reviewing of the configuration/FW integration without having too much comments here. @derrabus are you OK with the component now? |
Just opened #44798 for the FWB integration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge early to gather feedback.
Well done @tgalopin
Thank you @tgalopin. |
…t (tgalopin, wouterj) This PR was merged into the 6.1 branch. Discussion ---------- [FrameworkBundle] Integrate the HtmlSanitizer component | Q | A | ------------- | --- | Branch? | 6.1 | Bug fix? | no | New feature? | yes | Deprecations? | no | Tickets | - | License | MIT | Doc PR | - This PR adds the integration if the HtmlSanitizer component in the FrameworkBundle. See #44681 for details about the component. The configuration for this integration is the following: ```yaml framework: # This configuration is not required: as soon as you install the component, a default # "html_sanitizer" service is created, with the safe configuration, to be used directly. # # This configuration allows to set custom behaviors, in addition or instead of the default. html_sanitizer: # Default sanitizer (optional) # When not provided, the native "html_sanitizer" service is wired as default. default: my.sanitizer # Custom sanitizers (optional) sanitizers: # Each sanitizer defines its own service (no prefix/suffix) to ease understanding # Each sanitizer also defines a named autowiring alias to ease injection using variable name. # Here, this sanitizer is available as a service "my.sanitizer" or using autowiring # as "HtmlSanitizerInterface $mySanitizer". my.sanitizer: allow_safe_elements: true allow_elements: iframe: ['src'] custom-tag: ['data-attr'] custom-tag-2: '*' block_elements: - section drop_elements: - video allow_attributes: src: ['iframe'] data-attr: '*' drop_attributes: data-attr: '*' force_attributes: a: rel: noopener noreferrer h1: class: bp4-heading force_https_urls: true allowed_link_schemes: ['http', 'https', 'mailto'] allowed_link_hosts: ['symfony.com'] allow_relative_links: true allowed_media_schemes: ['http', 'https', 'data'] allowed_media_hosts: ['symfony.com'] allow_relative_medias: true # "all.sanitizer" / "HtmlSanitizerInterface $allSanitizer" all.sanitizer: allow_all_static_elements: true allow_elements: custom-tag: ['data-attr'] ``` This PR is still WIP (esp tests) but I wanted to gather feedback regarding the configuration and DX as soon as possible. Commits ------- 4dd3fd6 Finished XML config implementation e0a9339 Integrate the HtmlSanitizer component to the FrameworkBundle
So happy to see this one integrated in SF! Thanks for the job, you all rock. <3 |
Following discussion on #44144, this PR proposes to add a new HtmlSanitizer component to Symfony.
I refactored the infrastructure of my original html-sanitizer to better follow the W3C Standard Proposal about sanitizers.
The sanitizer can be used as follow:
The component is working and well tested.
I did some very basic performance testing and it seems to be **~ 2 times faster** than my original library, mostly by relying on much less objects and more on native primitives. My original library was already as fast as the fastest PHP HTML sanitizers, meaning this component may well be the fastest sanitizer in the PHP ecosystem (to better check of course!). That's exciting :) !
I still have on my TODO list:
Feel free to review the component while I work on finalizing it and the FameworkBundle integration!