-
Notifications
You must be signed in to change notification settings - Fork 1.3k
contenthash: do not follow nested symlinks when computing checksum #6220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
contenthash: do not follow nested symlinks when computing checksum #6220
Conversation
50bd827
to
0c3f539
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add (some) integration tests for this as well. Current unit tests only check the cache checksum computation, not that there isn't an error in the actual copy phase of the feature or that the correct files are being copied.
expectedSym = digest.Digest("sha256:e14a98332f46b81993ff4a3b7898bb00fc3d245d84e4c42e57c9ee6b129c7c41") | ||
|
||
// Same with wildcard used. | ||
dgst, err = cc.Checksum(context.TODO(), ref, "fo?", ChecksumOpts{FollowLinks: true, Wildcard: true}, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this one returning a different checksum than foo
. I guess the last one is returning a different checksum because it is the list of all the subpaths (technically we could detect the case where excludepatterns doesn't match anything but that doesn't need to be part of this PR if it is new optimization).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea it returns a different checksum because the non-wildcard one computes the checksum of foo
by itself and this one computes the checksum of foo
and the contents without following the symlink.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checksum of foo by itself
But "foo itself" also contains the contents (without following the symlink).
This is likely unrelated to this change, but I've forgotten what the difference is because afaics these commands copy identical files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an integration test for this.
0c3f539
to
0a83b16
Compare
b69f184
to
e04696e
Compare
Nested symlinks were not intended to be followed after the first level of wildcard expansion. When include/exclude patterns were used or a wildcard was present higher up in the tree, we would erroneously follow ALL symlinks to compute the hash. When a symlink was broken, this would result in an error. We weren't supposed to be following the symlinks at all. This fixes things by changing `includedPaths` to also return whether a path should follow links. If a parent directory exists for the path, then the symlink is not followed. Signed-off-by: Jonathan A. Sternberg <jonathan.sternberg@docker.com>
e04696e
to
74d1779
Compare
Opened a follow-up issue in #6233 , but doesn't seem to be related to changes in this PR. |
Nested symlinks were not intended to be followed after the first level
of wildcard expansion. When include/exclude patterns were used or a
wildcard was present higher up in the tree, we would erroneously follow
ALL symlinks to compute the hash.
When a symlink was broken, this would result in an error. We weren't
supposed to be following the symlinks at all.
This fixes things by changing
includedPaths
to also return whether apath should follow links. If a parent directory exists for the path,
then the symlink is not followed.
Fixes #4946.