HADOOP-19464. S3A: Restore Compatibility with EMRFS FileSystem #7410
suggested some new tests, to verify semantics across more operations
 * This test verifies that the EMRFS or legacy S3N filesystem compatibility with
 * S3A works as expected.
 */
public class ITestEMRFSCompatibility extends AbstractS3ATestBase {
Are there other tests here? I think I'd like:
- list parent path reports an empty dir
- delete parent dir results in a getFileStatus(parent) => 404, and same for marker.
What does dir rename do? I know that for normal "/" markers we only create a dir marker if there's nothing underneath, but that's just an optimisation. Here we'd want:
touch parent/src/subdir/$folder$
mv parent/src parent/dest
isDir(parent/dest/subdir)
isNotFound(parent/src)
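The rename expectation above can be sketched against a toy in-memory key space. This is only an illustration under stated assumptions: the class `FolderMarkerRenameSketch` and its methods are hypothetical, the marker is assumed to be the key `parent/src/subdir_$folder$`, and nothing here is the S3A implementation or the test added in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy in-memory model of an object store key space (hypothetical, not S3A code). */
public class FolderMarkerRenameSketch {
    // Suffix that S3N/EMRFS append to a key to mark a directory.
    static final String MARKER = "_$folder$";

    private final TreeSet<String> keys = new TreeSet<>();

    /** Create the marker object for a directory, e.g. "a/b" becomes "a/b_$folder$". */
    public void touchMarker(String dir) {
        keys.add(dir + MARKER);
    }

    /** A path counts as a directory if its marker exists or any key lies beneath it. */
    public boolean isDir(String path) {
        if (keys.contains(path + MARKER)) {
            return true;
        }
        String prefix = path + "/";
        String next = keys.ceiling(prefix);
        return next != null && next.startsWith(prefix);
    }

    /** A path is "not found" if it is neither an object nor a directory. */
    public boolean isNotFound(String path) {
        return !keys.contains(path) && !isDir(path);
    }

    /** Rename by moving every key under src, plus src's own marker, to dest. */
    public void rename(String src, String dest) {
        List<String> moved = new ArrayList<>();
        for (String k : keys) {
            if (k.startsWith(src + "/") || k.equals(src + MARKER)) {
                moved.add(k);
            }
        }
        for (String k : moved) {
            keys.remove(k);
            keys.add(dest + k.substring(src.length()));
        }
    }

    public static void main(String[] args) {
        FolderMarkerRenameSketch store = new FolderMarkerRenameSketch();
        store.touchMarker("parent/src/subdir");
        store.rename("parent/src", "parent/dest");
        System.out.println("isDir(parent/dest/subdir) = " + store.isDir("parent/dest/subdir"));
        System.out.println("isNotFound(parent/src) = " + store.isNotFound("parent/src"));
    }
}
```

A real integration test would make the same two checks through the Hadoop `FileSystem` API against a bucket rather than against this model.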
@steveloughran - That is a good callout.
"list parent path reports an empty dir": this is already covered.
"delete parent dir results in a getFileStatus(parent) => 404, and same for marker": this is not happening right now (even with hadoop-3.4.1).
Added additional coverage
🎊 +1 overall
This message was automatically generated.
@steveloughran - Thanks a lot for the review.
@steveloughran Gentle reminder for the review.
+1 pending you saying what your test run parameters were for the full test suite, plus what kind of bucket. Really someone needs to make an S3 Express bucket their default bucket, shouldn't they...
Thanks a lot @steveloughran. I will set up an S3 Express bucket for future testing. Please let me know if you need any testing against that bucket as well.
…e#7410) After HADOOP-19278, the S3N folder marker _$folder$ is not skipped during listing of S3 directories. This can lead to the S3A filesystem failing to read data written by the legacy Hadoop S3N filesystem and AWS EMR's EMRFS ("S3" filesystem). Contributed by Syed Shameerur Rahman
Description of PR
After HADOOP-19278, the S3N folder marker _$folder$ is not skipped during listing of S3 directories. As a result, the S3A filesystem cannot read data written by the legacy Hadoop S3N filesystem or by AWS EMR's EMRFS ("S3" filesystem), which causes compatibility issues and poses a migration risk for users moving to S3A.
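To make the listing behaviour concrete, here is a minimal sketch of skipping `_$folder$` marker objects when turning raw object keys into listing entries. The class `FolderMarkerListingSketch`, its method names, and the sample keys are hypothetical; this shows the general idea only and is not the code changed by this patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of filtering S3N/EMRFS folder markers out of a listing. */
public class FolderMarkerListingSketch {
    // Suffix that S3N/EMRFS append to a key to mark a directory.
    static final String MARKER_SUFFIX = "_$folder$";

    /** True if the key is a folder marker object rather than user data. */
    public static boolean isFolderMarker(String key) {
        return key.endsWith(MARKER_SUFFIX);
    }

    /** Return the keys a listing should surface, dropping marker objects. */
    public static List<String> visibleEntries(List<String> keys) {
        return keys.stream()
                .filter(k -> !isFolderMarker(k))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample keys are invented for illustration.
        List<String> raw = Arrays.asList(
                "table/part-0000",
                "table/year=2024_$folder$",
                "table/year=2024/part-0001");
        System.out.println(visibleEntries(raw));
    }
}
```

If the marker were surfaced instead of skipped, a reader of `table/` would see `table/year=2024_$folder$` as a zero-byte data file, which is the kind of incompatibility the description refers to.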
How was this patch tested?
Added integration test ITestEMRFSCompatibility and ran the other unit and integration tests (UT/IT) in the us-east-1 region.