Fix S3 globbing for duckdbfs with httplib#6709
Conversation
bbba2b6 to
8fea279
Compare
Merging this PR will improve performance by 17.64%
Performance Changes
Comparing Footnotes
|
Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
8fea279 to
ccbe879
Compare
|
@claude review |
|
Claude finished @0ax1's task in 1m 20s —— View job Code Review
The fix is straightforward and correct. The root cause is well-identified: DuckDB's
|
Using vortex_filesystem=duckdb, both of httpfs backends are broken when using globs on S3 files: 1. httplib was broken because duckdb folder list API returns full paths for S3 files, and we assumed relative paths. This PR fixes this behavior 2. curl was broken because of duckdb/duckdb-httpfs#265 in 1.4. Up until upstream fixes this, globbing won't work. This issue is not present in 1.5 anymore Because of this bug, our benchmarks fail to run with curl backend, see vortex-data#6685 Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
Using vortex_filesystem=duckdb, both of httpfs backends are broken
when using globs on S3 files:
S3 files, and we assumed relative paths. This PR fixes this behavior
in 1.4. Up until upstream fixes this, globbing won't work.
This issue is not present in 1.5 anymore
Because of this bug, our benchmarks fail to run with curl backend, see #6685