8000 feat: (Preview) Support arithmetics between dates and timedeltas by sycai · Pull Request #1413 · googleapis/python-bigquery-dataframes · GitHub
[go: up one dir, main page]

Skip to content

Conversation

@sycai
Copy link
Contributor
@sycai sycai commented Feb 20, 2025

Supported operations:

  • date - date
  • timedelta + date
  • date + timedelta
  • date - timedelta

Test cases date series + timedelta series and date series - timedelta series are not supported by Pandas until 2.1.0, but our test setup uses version 1.5.3. However, I cannot increase the pandas version accordingly in test env 3.12 because it would introduce more breaking changes in irrelevant tests.

That being said, operations between a series and a literal are still covered, and we will fill the gap once our Pandas dependency is increased to 2.1.0

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Feb 20, 2025
@sycai sycai changed the title [WIP] Implement date_diff, date_add, and date_sub. feat: (Preview) Support arithmetics between dates and timedeltas Feb 21, 2025
@sycai sycai marked this pull request as ready for review February 25, 2025 20:18
@sycai sycai requested review from a team as code owners February 25, 2025 20:18
@sycai sycai requested a review from ZehaoXU February 25, 2025 20:18
@sycai sycai requested review from TrevorBergeron, chelsea-lin and tswast and removed request for ZehaoXU February 25, 2025 20:18

@scalar_op_compiler.register_binary_op(ops.date_diff_op)
def date_diff_op_impl(x: ibis_types.DateValue, y: ibis_types.DateValue):
return (x.delta(y, "day") * UNIT_TO_US_CONVERSION_FACTORS["d"]).floor() # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the floor() needed for? don't we already have an integer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call!

UNIT_TO_US_CONVERSION_FACTORS["d"] has float type even though the value itself is an integer, so the multiplication gives us an Ibis.FloatValue. Calling floor() over the value was to cast it into Ibis.IntValue.

That said, floor() does seem unnecessary here. I used int(..) to cast the factor value instead, as it simplifies the generated SQL a bit.


@scalar_op_compiler.register_binary_op(ops.date_add_op)
def date_add_op_impl(x: ibis_types.DateValue, y: ibis_types.IntegerValue):
return x.cast("timestamp") + y.to_interval("us") # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we convert to interval datatype? I would rather just use https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions#date_add with an integer.

Copy link
Contributor Author
@sycai sycai Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the interval conversion because this is how Ibis generates date_add expressions. In other words, Ibis only recognizes DateValue + IntervalValue to get DATE_ADD(Date, Interval ...)


@scalar_op_compiler.register_binary_op(ops.date_sub_op)
def date_sub_op_impl(x: ibis_types.DateValue, y: ibis_types.IntegerValue):
return x.cast("timestamp") - y.to_interval("us") # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, would rather avoid casting to interval type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as above.

@sycai sycai enabled auto-merge (squash) February 26, 2025 20:58
@sycai sycai merged commit 962b152 into main Feb 26, 2025
23 checks passed
@sycai sycai deleted the sycai_date_arithmetics branch February 26, 2025 21:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

0