fix: DSL_SCHEMA_HASH should not be changed by line endings #25123
Merged
Ref: pola-rs/r-polars#1625
If Polars is referenced from GitHub and built on Windows, Git for Windows may rewrite the line endings from LF to CRLF, resulting in a different value for DSL_SCHEMA_HASH. Converting CRLF to LF before hashing prevents such accidents.
Currently, Python Polars on Windows has a different DSL_SCHEMA_HASH compared to other platforms:

6286bad7b59c6dffbabcb7eda0d3a1c386cc651ad8b2479b2e4e4686199e04e1 (LF line endings)
45389c69048be32a61fcfa8990e6bd529f3d369c229e6004318f7e15baa0b597 (CRLF line endings)

As shown below, it can be confirmed that this difference is the result of the line endings being changed from LF to CRLF.
$ wget \
    https://raw.githubusercontent.com/pola-rs/polars/df69276daf5d195c8feb71eef82cbe9804e0f47f/crates/polars-plan/dsl-schema-hashes.json \
    --quiet
$ sha256sum dsl-schema-hashes.json
6286bad7b59c6dffbabcb7eda0d3a1c386cc651ad8b2479b2e4e4686199e04e1  dsl-schema-hashes.json
$ sed -e 's/$/\r/g' dsl-schema-hashes.json | head -c -1 >crlf.json
$ sha256sum crlf.json
45389c69048be32a61fcfa8990e6bd529f3d369c229e6004318f7e15baa0b597  crlf.json
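
The reverse direction, which is the idea behind this fix, can be sketched the same way: stripping the trailing CR from each line before hashing should give the same digest whichever line endings the checkout used. The snippet below is only an illustration, not the change made in this PR. It uses a small made-up JSON file rather than the real dsl-schema-hashes.json, and it assumes GNU sed and coreutils, as the commands above do.

```shell
# Hypothetical sample content; NOT the real dsl-schema-hashes.json.
tmp=$(mktemp -d)
printf '{\n  "Example": "0123abcd"\n}\n' > "$tmp/lf.json"

# Simulate a Git for Windows checkout by converting LF to CRLF (GNU sed).
sed -e 's/$/\r/' "$tmp/lf.json" > "$tmp/crlf.json"

# Hash the raw bytes of both variants: these digests differ.
raw_lf=$(sha256sum < "$tmp/lf.json")
raw_crlf=$(sha256sum < "$tmp/crlf.json")

# Normalize CRLF back to LF before hashing: this matches the LF digest.
norm_crlf=$(sed -e 's/\r$//' "$tmp/crlf.json" | sha256sum)

if [ "$raw_lf" != "$raw_crlf" ]; then echo "raw digests differ"; fi
if [ "$raw_lf" = "$norm_crlf" ]; then echo "normalized digest matches LF"; fi
```

Only a CR that directly precedes a line break is removed, so a CR elsewhere in the content would still affect the digest.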