What would you like to be added:
When csv2db creates a schema, every column is created as `VARCHAR(1000)`. This default type can be changed via `-c COLUMN_TYPE, --column-type COLUMN_TYPE` (the column type to use for the table generation).
A smart mode would be quite useful. When this mode is enabled, csv2db would first scan the content of the CSV file and try to determine the right type per column.
It doesn't need to be super smart; it could start with three different types:
- strings via `VARCHAR(1000)`
- integers via a column type that supports 64-bit integers
- floats via a column type that supports decimals
A CSV like this

```
id,name,unit_name,nutrient_nbr,rank
1001,Solids,G,201,200
1002,Nitrogen,G,202,500
1003,Protein,G,203,600
1004,Total lipid (fat),G,204,800
1005,"Carbohydrate, by difference",G,205,1110
```

would lead to something like
```sql
CREATE TABLE `nutrient`
(
  `ID`           int,
  `NAME`         VARCHAR(1000),
  `UNIT_NAME`    VARCHAR(1000),
  `NUTRIENT_NBR` int,
  `RANK`         int
);
```

(The above is pseudocode; the concrete database types for the particular database systems would still need to be determined.)
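A minimal sketch of how such a per-column scan could work is below. This is only an illustration, not csv2db's actual implementation; the function name, the placeholder type labels, and the default `VARCHAR(1000)` fallback are assumptions, and the labels would still have to be mapped to concrete types per database system.

```python
import csv

def infer_column_types(csv_path, varchar_default="VARCHAR(1000)"):
    """Guess a tentative type label per column: int, float, or string.

    Illustrative only: the labels would still need to be mapped to the
    concrete types of the target database system.
    """
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        # Start optimistic (integer) and downgrade as soon as a value disagrees.
        types = {name: "int" for name in reader.fieldnames}
        for row in reader:
            for name, value in row.items():
                if not value or types[name] == varchar_default:
                    continue  # skip empty cells and already-downgraded columns
                try:
                    int(value)
                except ValueError:
                    try:
                        float(value)
                        types[name] = "float"  # needs a decimal-capable column
                    except ValueError:
                        types[name] = varchar_default
    return types
```

For the nutrient CSV above, this would classify `id`, `nutrient_nbr`, and `rank` as integers and `name` and `unit_name` as strings, which matches the pseudocode `CREATE TABLE` statement.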
Why is this needed:
- The `--column-type` option can only change the type for all columns, not for specific ones
- Queries on the imported data might be more efficient (especially for numbers)
- We would leverage the functionality of the database "a bit more"
Anything else we need to know?:
- We assume CSVs that follow a valid structure
- Database types for the particular database systems need to be determined
- A current workaround would be to ALTER the schema after the import
Feedback:
Let me know what you think about this.
If this would be a welcome feature, I might be able to spend some time on it.