"Smart mode" for schema creation

What would you like to be added:

When csv2db creates a schema, every column is created as VARCHAR(1000).
This default type can be changed by using

  -c COLUMN_TYPE, --column-type COLUMN_TYPE
                        The column type to use for the table generation.

A smart mode would be quite useful.
When this mode is enabled, csv2db would scan the content of the CSV file first and tries to determine the right type per column.

It doesn't need to be super smart.
Maybe beginning with three different types:

strings by VARCHAR(1000)
integers via a column that supports integer 64 bit
floats via a column that supports decimals

A CSV like this

id,name,unit_name,nutrient_nbr,rank
1001,Solids,G,201,200
1002,Nitrogen,G,202,500
1003,Protein,G,203,600
1004,Total lipid (fat),G,204,800
1005,"Carbohydrate, by difference",G,205,1110

would lead to something like

CREATE TABLE `nutrient`
(
 `ID` int,
 `NAME` VARCHAR(1000),
 `UNIT_NAME` VARCHAR(1000),
 `NUTRIENT_NBR` int,
 `RANK` int
);

Above is pseudocode, database types for the particular database systems need to be determined).

Why is this needed:

The --column-type can only change the type for all columns, not for specific ones
Queries on the imported data might be more efficient (especially for numbers)
We would leverage the functionality of the database "a bit more"

Anything else we need to know?:

We assume CSVs that follow a valid structure
Database types for the particular database systems need to be determined
A current work-a-round would be to ALTER the schema after the import

Feedback:

Let me know what you think about this.
If this would be a welcoming feature, I might be able to spent some time on it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

"Smart mode" for schema creation #54

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

"Smart mode" for schema creation #54

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions