Overwritten Format Configs by CreateExternalTable Options · Issue #9945 · apache/datafusion
@berkaysynnada

Description

Describe the bug

let file_format: Arc<dyn FileFormat> = match file_type {
    FileType::CSV => {
        // Starts from the CSV options already held in table_options ...
        let mut csv_options = table_options.csv;
        // ... then unconditionally replaces them with the values from cmd.
        csv_options.has_header = cmd.has_header;
        csv_options.delimiter = cmd.delimiter as u8;
        csv_options.compression = cmd.file_compression_type;
        Arc::new(CsvFormat::default().with_options(csv_options))

These lines of ListingTableFactory.create() overwrite the config options with the values from cmd (the CreateExternalTable options). Configs coming from OPTIONS('format... are silently discarded.
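The effect can be reproduced in isolation with a small model. This is only a sketch: `CsvOptions`, `CreateExternalTable`, and `apply_cmd` below are simplified, hypothetical stand-ins written for illustration, not the actual DataFusion types or functions.

```rust
/// Hypothetical, simplified stand-in for the CSV format options.
#[derive(Debug, Clone, PartialEq)]
struct CsvOptions {
    has_header: bool,
    delimiter: u8,
}

/// Hypothetical, simplified stand-in for the parsed CREATE EXTERNAL TABLE command.
struct CreateExternalTable {
    has_header: bool,
    delimiter: char,
}

/// Mirrors the unconditional assignments in the snippet above:
/// whatever `csv_options` held before is replaced by the values in `cmd`.
fn apply_cmd(mut csv_options: CsvOptions, cmd: &CreateExternalTable) -> CsvOptions {
    csv_options.has_header = cmd.has_header;
    csv_options.delimiter = cmd.delimiter as u8;
    csv_options
}
```

With `csv_options.has_header == true` (as if set through the format options) and `cmd.has_header == false` (no WITH HEADER ROW), `apply_cmd` returns options with `has_header == false`, and nothing reports that the earlier value was dropped.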

To Reproduce

statement ok
CREATE EXTERNAL TABLE aggregate_simple (
  c1 FLOAT NOT NULL,
  c2 DOUBLE NOT NULL,
  c3 BOOLEAN NOT NULL
)
STORED AS CSV
LOCATION '../core/tests/data/aggregate_simple.csv'
WITH HEADER ROW

query RRB
SELECT * FROM aggregate_simple LIMIT 3;
----

works as expected, but

statement ok
CREATE EXTERNAL TABLE aggregate_simple (
  c1 FLOAT NOT NULL,
  c2 DOUBLE NOT NULL,
  c3 BOOLEAN NOT NULL
)
STORED AS CSV
LOCATION '../core/tests/data/aggregate_simple.csv'
OPTIONS('format.has_header' 'true')

query RRB
SELECT * FROM aggregate_simple LIMIT 3;
----

does not, since this CREATE EXTERNAL TABLE statement has no WITH HEADER ROW clause, so cmd.has_header is false and overwrites csv.has_header with false.

Expected behavior

The underlying problem is that the same setting can now be specified in two different ways, and one of them is silently chosen. Their default values also differ (false for CreateExternalTable, true for CsvOptions). We could default both to false; then, if either one is true, we expect a header. If WITH HEADER ROW is present together with 'format.has_header' 'false', we could return an error.
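One way the proposed rule could look is sketched below. The function name `resolve_has_header` and the use of `Option<bool>` to tell an explicit 'format.has_header' 'false' apart from an unset option are assumptions made for this illustration; they are not existing DataFusion APIs.

```rust
/// Hypothetical helper: combine the two sources of the header setting.
///
/// `with_header_row`   - true if the statement contained WITH HEADER ROW.
/// `format_has_header` - Some(v) if 'format.has_header' was given in OPTIONS,
///                       None if it was not set at all (assumed representation).
fn resolve_has_header(
    with_header_row: bool,
    format_has_header: Option<bool>,
) -> Result<bool, String> {
    match (with_header_row, format_has_header) {
        // The two sources contradict each other: report it instead of picking one.
        (true, Some(false)) => Err(
            "WITH HEADER ROW conflicts with 'format.has_header' 'false'".to_string(),
        ),
        // WITH HEADER ROW is present and OPTIONS agrees or is silent.
        (true, _) => Ok(true),
        // No WITH HEADER ROW: an explicit option decides.
        (false, Some(v)) => Ok(v),
        // Neither source mentions a header: default to false.
        (false, None) => Ok(false),
    }
}
```

Under this sketch, the second statement in the reproduction (only 'format.has_header' 'true') resolves to a header being expected, and the only combination that errors is the contradictory one.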

Additional context

No response

Labels: bug
