8000 fix: escape `%` if URI malformed by baseballyama · Pull Request #5535 · rollup/rollup · GitHub
[go: up one dir, main page]

Skip to content

fix: escape % if URI malformed #5535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jul 3, 2024
Merged

Conversation

baseballyama
Copy link
Contributor
@baseballyama baseballyama commented May 30, 2024

This PR contains:

  • bugfix
  • feature
  • refactor
  • documentation
  • other

Are tests included?

  • yes (bugfixes and features will not be merged without tests)
  • no

Breaking Changes?

  • yes (breaking changes will not be merged unless absolutely necessary)
  • no

List any relevant issue numbers:

Description

At first I thought of simply replacing % with _, but that would intentionally break all URL-encoded filenames. This is not a good idea, so I used decodeURIComponent to change % to _ only if an abnormal % is present.

At least one major Node.js / browser has decodeURIComponent, but whenever it does not, % is replaced with _.

Copy link
vercel bot commented May 30, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
rollup ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 3, 2024 4:29am

Copy link
github-actions bot commented May 31, 2024

Thank you for your contribution! ❤️

You can try out this pull request locally by installing Rollup via

npm install baseballyama/rollup#fix/5441

Notice: Ensure you have installed the latest stable Rust toolchain. If you haven't installed it yet, please see https://www.rust-lang.org/tools/install to learn how to download Rustup and install Rust.

or load it into the REPL:
https://rollup-q9ie5zutt-rollup-js.vercel.app/repl/?pr=5535

Copy link
github-actions bot commented May 31, 2024

Performance report!

Rough benchmark

Command Mean [s] Min [s] Max [s] Relative
node _benchmark/previous/bin/rollup -i ./perf/entry.js -o _benchmark/result/previous.js 9.254 ± 0.068 9.178 9.310 1.00 ± 0.01
node _benchmark/current/bin/rollup -i ./perf/entry.js -o _benchmark/result/current.js 9.216 ± 0.053 9.176 9.276 1.00

Internal benchmark

  • BUILD: 8213ms, 750 MB
    • initialize: 0ms, 26.1 MB
    • generate module graph: 3129ms, 575 MB
      • generate ast: 1515ms, 567 MB
    • sort and bind modules: 450ms, 619 MB
    • mark included statements: 4610ms, 750 MB
      • treeshaking pass 1: 1559ms, 716 MB
      • treeshaking pass 2: 756ms (+15ms, +2.1%), 741 MB
      • treeshaking pass 3: 296ms, 745 MB
      • treeshaking pass 4: 275ms, 745 MB
      • treeshaking pass 5: 314ms, 748 MB
      • treeshaking pass 6: 262ms, 753 MB
      • treeshaking pass 7: 245ms, 753 MB
      • treeshaking pass 8: 238ms, 755 MB
      • treeshaking pass 9: 216ms, 755 MB
      • treeshaking pass 10: 222ms, 752 MB
      • treeshaking pass 11: 217ms, 750 MB
  • GENERATE: 880ms, 1.02 GB
    • initialize render: 0ms, 914 MB
    • generate chunks: 86ms, 918 MB
      • optimize chunks: 0ms, 918 MB
    • render chunks: 774ms, 1 GB
    • transform chunks: 21ms, 1.02 GB
    • generate bundle: 0ms, 1.02 GB

@lukastaegert
Copy link
Member

Not sure I see the value of decodeURIComponent in the tests at the moment. I would have expected foo%20bar to become foo bar, but this did not happen. Also, the current test does not show how the dynamic import would be rendered, would it really work? I would feel better if we just add _ to the list of forbidden characters.

Copy link
codecov bot commented May 31, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.04%. Comparing base (39bc7c5) to head (7c78b06).
Report is 1 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5535   +/-   ##
=======================================
  Coverage   99.04%   99.04%           
=======================================
  Files         238      238           
  Lines        9248     9248           
  Branches     2437     2437           
=======================================
  Hits         9160     9160           
  Misses         58       58           
  Partials       30       30           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@baseballyama
Copy link
Contributor Author
baseballyama commented May 31, 2024

I would have expected foo%20bar to become foo bar, but this did not happen.

I kept foo%20bar because if we convert foo%20bar to foo bar, it will be breaking change because all legal % always be converted.

Therefore I replace only illegal % to _.

@baseballyama
Copy link
Contributor Author

Also, the current test does not show how the dynamic import would be rendered

I added tests.

@lukastaegert
Copy link
Member

I kept foo%20bar because if we convert foo%20bar to foo bar, it will be breaking change because all legal % always be converted.

This will not work if you not also replace the file name on disk. Take the original stackblitz reproduction https://stackblitz.com/edit/rollup-repro-jhg2my?file=dist%2Fmain.js

If you instead use %20 in the file name and build it and try to run it via node dist/main.js then it will still fail. The reason is apparently that the importer does first replace %20 with and then looks for foo bar.js on the disk, which does not exist.

Either you replace it all the time with _, which I would find perfectly fine for me, or you actually use the output of decodeURIComponent. In that case, it would use the non-escaped name also in the dynamic import statement, which would also be ok as JavaScript does not require imports to be URL-encoded as far as I tried it out.

In any case, I do not think this is a breaking change either way because using % in file names was broken before anyway. Or to put it differently, it is impossible to import a file containing "%".

@baseballyama
Copy link
Contributor Author
baseballyama commented Jun 1, 2024

Thank you for checking this PR again!🙏

This will not work if you not also replace the file name on disk. Take the original stackblitz reproduction

I think it will be fixed because decodeURIComponent('./foo%bar-q8VKMzZt.js') throws an exception and goes to the catch statement, replacing % with _.

https://github.com/rollup/rollup/pull/5535/files#diff-6d5ded6774a9966c5b0f5b61b0f8f9f49a4cc4864057dcdb5d8693dd9eac222bR22-R29

If you instead use %20 in the file name and build it and try to run it via node dist/main.js then it will still fail. The reason is apparently that the importer does first replace %20 with and then looks for foo bar.js on the disk, which does not exist.

This is the reason why I don’t use the return value of decodeURIComponent. I’m using a name even if % is contained as much as possible, but if the name cannot be used, I replace % with _.

In any case, I do not think this is a breaking change either way because using % in file names was broken before anyway. Or to put it differently, it is impossible to import a file containing "%".

If the rollup team doesn’t recognize this as a breaking change, I will follow their decision. Obviously, just adding % to INVALID_CHAR_REGEX is much easier than my current changes.

I could be wrong (as this is my first PR in the rollup repo), but I don’t feel that there is a correct common understanding of the changes in this PR. So, for now, I will wait for your response instead of updating the PR.
(I'm not opposite about updating PR👍)

@lukastaegert
Copy link
Member
lukastaegert commented Jun 2, 2024

This is the reason why I don’t use the return value of decodeURIComponent. I’m using a name even if % is contained as much as possible

This is my point, to my understanding it is impossible to import a file that contains a percent sign EXCEPT if the import is written differently from the file name and URL-escapes the percent sign. E.g. if the file is named foo%20bar.js, the only way to import it is via foo%2520bar, encoding the percent sign as %25. At the moment, however, Rollup does not support writing the import different from the file name. Therefore, files on disk should be forbidden from using percent signs one way or another, at least that is my conclusion. But please try this out yourself.

I think it will be fixed because decodeURIComponent('./foo%bar-q8VKMzZt.js') throws an exception and goes to the catch statement, replacing % with _.

This fixes situations where the file name is not a valid URL-encoded string, but if it is valid, the import fails at runtime. That is my point. Just create an empty file foo%20bar.js and another file main.js with content import('./foo%20bar.js') and watch node crash when running node main.js.

@baseballyama
Copy link
Contributor Author

Finally I got the point! Thank you for your explanation!
I updated the PR.

@baseballyama baseballyama requested a review from lukastaegert June 3, 2024 09:38
Copy link
Member
@lukastaegert lukastaegert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for letting this drop for a while, of course this is fine now!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generating a chunk with a name containing % will lead to URI malformed error
3 participants
0