Description
Describe the bug
datafusion/datafusion/core/src/datasource/listing_table_factory.rs
Lines 67 to 73 in 4bd7c13
let file_format: Arc<dyn FileFormat> = match file_type {
    FileType::CSV => {
        let mut csv_options = table_options.csv;
        csv_options.has_header = cmd.has_header;
        csv_options.delimiter = cmd.delimiter as u8;
        csv_options.compression = cmd.file_compression_type;
        Arc::new(CsvFormat::default().with_options(csv_options))
These lines of ListingTableFactory.create() overwrite the config options with the values from cmd (the CreateExternalTable options). Configs coming from OPTIONS('format...
are silently discarded.
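To make the overwrite concrete, here is a minimal model of the behavior. The types `CsvOptionsModel` and `CreateCmdModel` and the function `apply_cmd` are simplified stand-ins invented for illustration; they are not the real DataFusion types, and only the two assignments mirror the snippet above.

```rust
// Simplified stand-ins for illustration only; these are NOT the real DataFusion types.
#[derive(Debug, Clone, PartialEq)]
struct CsvOptionsModel {
    has_header: bool,
    delimiter: u8,
}

// Models the fields of the CREATE EXTERNAL TABLE command that the factory reads.
struct CreateCmdModel {
    has_header: bool, // false unless WITH HEADER ROW is present
    delimiter: char,
}

// Mirrors the unconditional assignments in the snippet above: whatever was
// parsed from OPTIONS(...) into `csv_options` is replaced by the cmd values.
fn apply_cmd(mut csv_options: CsvOptionsModel, cmd: &CreateCmdModel) -> CsvOptionsModel {
    csv_options.has_header = cmd.has_header;
    csv_options.delimiter = cmd.delimiter as u8;
    csv_options
}
```

In this model, options built with `has_header: true` come back with `has_header: false` whenever the command was written without WITH HEADER ROW, and nothing reports the change.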
To Reproduce
statement ok
CREATE EXTERNAL TABLE aggregate_simple (
c1 FLOAT NOT NULL,
c2 DOUBLE NOT NULL,
c3 BOOLEAN NOT NULL
)
STORED AS CSV
LOCATION '../core/tests/data/aggregate_simple.csv'
WITH HEADER ROW
query RRB
SELECT * FROM aggregate_simple LIMIT 3;
----
works as expected, but
statement ok
CREATE EXTERNAL TABLE aggregate_simple (
c1 FLOAT NOT NULL,
c2 DOUBLE NOT NULL,
c3 BOOLEAN NOT NULL
)
STORED AS CSV
LOCATION '../core/tests/data/aggregate_simple.csv'
OPTIONS('format.has_header' 'true')
query RRB
SELECT * FROM aggregate_simple LIMIT 3;
----
does not. This CREATE EXTERNAL TABLE statement has no WITH HEADER ROW clause, so cmd.has_header is false, and that value overwrites csv.has_header.
Expected behavior
The underlying problem is that the same setting can now be specified in two different ways, and one of them silently wins. Their default values also differ (false for CreateExternalTable, true for CsvOptions). We can default both to false; then, if either one is set to true, we expect a header. If WITH HEADER ROW is present together with 'format.has_header' 'false', we can return an error.
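The proposed rule could be sketched roughly as follows. The helper `resolve_has_header` is hypothetical and does not exist in DataFusion. The sketch also assumes the option value is tracked as an `Option<bool>`, so that an explicit 'format.has_header' 'false' can be told apart from the option being absent.

```rust
/// Hypothetical helper, not part of DataFusion's API.
///
/// `with_header_row`: whether the statement contains WITH HEADER ROW.
/// `format_has_header`: the value of 'format.has_header' if the user set it
/// in OPTIONS(...), or `None` if it was not given (assumed representation).
fn resolve_has_header(
    with_header_row: bool,
    format_has_header: Option<bool>,
) -> Result<bool, String> {
    match (with_header_row, format_has_header) {
        // The clause says "header", the option explicitly says "no header":
        // reject the statement instead of silently picking one.
        (true, Some(false)) => Err(
            "WITH HEADER ROW conflicts with 'format.has_header' 'false'".to_string(),
        ),
        // Otherwise both sources default to false, and either one being
        // true means a header is expected.
        (clause, option) => Ok(clause || option.unwrap_or(false)),
    }
}
```

Under these assumptions, both statements in the reproduction above resolve to a header being expected, and only the contradictory combination is rejected.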