-
Notifications
You must be signed in to change notification settings - Fork 110
Make sure dates and times are within the supported range #698
Description
Related to src-d/gitbase#805
### Background
According to the docs:
The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.
This is not exactly true and the behaviour is a little bit more complicated and weird than that.
If you pass the upper limit, NULL is returned:
SELECT CONVERT('10000-12-31 23:59:59', DATETIME)
However, if you set a date lower than the lower bound, it does not:
SELECT CONVERT('0000-12-31 23:59:59', DATETIME)
This returns:
0000-12-31 23:59:59
The issue with Convert
Convert
method of sql.Timestamp
and sql.Date
types. Convert
can either return a value or an error, not return nil. So we need something else to handle conversion to dates that can make them nil, just in case the upper bound is passed.
To illustrate the previous problem, consider this usage inside the code:
value, err := expr.Eval(ctx, row)
if err != nil { /* handle err */ }
if value == nil {
return nil, nil
}
value, err = sql.Date.Convert(value)
if err != nil { /* handle err */ }
The nil value is checked after Eval, not after Convert, because conversions are done on non-nil values and are expected to return non-nil values. To change this behaviour, it would require to add the following block to each usage of a Convert
method:
if value == nil { return nil, nil }
This is potentially dangerous, because people may forget about it and can lead to a panic.
Non-processed values
Consider the following query:
SELECT date_col FROM table
There was no sql.Date.Convert
here. Just a value from the table that will be projected.
We could also check this during the conversion to a SQL value to sent through the wire protocol. But what if table.date_col
is not nullable? The client won't expect a null if the format is invalid.
Proposed solution
A solution that will always work correctly is just wrapping any usage of a DATE or DATETIME with CONVERT(x, DATE)
or CONVERT(x, DATETIME)
. That way, the result will be nullable and the client will receive what it expects. Plus, we wouldn't care what clients send, because at some point it would be sanitized.
A single analyzer rule would solve the whole problem.
Caveats
Could this impact performance? Even if it does have a performance penalty, I don't expect it to be very significative. Mostly because once the value has passed through the first convert it will be just a couple ifs.
WDYT? @src-d/data-processing