perf: Improve isin performance by TrevorBergeron · Pull Request #1203 · googleapis/python-bigquery-dataframes · GitHub

Conversation

@TrevorBergeron
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added the labels "size: l" (Pull request size is large.) and "api: bigquery" (Issues related to the googleapis/python-bigquery-dataframes API.) on Dec 11, 2024
@TrevorBergeron TrevorBergeron requested a review from tswast January 17, 2025 00:16
@TrevorBergeron TrevorBergeron marked this pull request as ready for review January 17, 2025 00:16
@TrevorBergeron TrevorBergeron requested review from a team as code owners January 17, 2025 00:16


class AdditiveNode:
    """Definition of additive - if you drop added_fields, you end up with the descendent."""
Collaborator

A picture might help :-)

Suggested change
-    """Definition of additive - if you drop added_fields, you end up with the descendent."""
+    """Definition of additive - if you drop added_fields, you end up with the descendent.
+
+    .. code-block:: text
+
+        AdditiveNode (fields: a, b, c; added_fields: c)
+            |
+            | additive_base
+            V
+        BigFrameNode (fields: a, b)
+    """

See https://stackoverflow.com/a/50956831/101923 for creating a plain text block
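
To make the "additive" definition above concrete, here is a minimal sketch. The class names `BaseSketch` and `AdditiveSketch` and the helper `drop_added_fields` are hypothetical stand-ins invented for illustration; they are not the classes in this PR. The sketch only assumes what the docstring says: an additive node exposes its base's fields plus some added ones, and dropping the added ones recovers the base's fields.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BaseSketch:
    """Hypothetical stand-in for a BigFrameNode; not the library's class."""

    fields: Tuple[str, ...]


@dataclass(frozen=True)
class AdditiveSketch:
    """Hypothetical additive node: base fields followed by added_fields."""

    additive_base: BaseSketch
    added_fields: Tuple[str, ...]

    @property
    def fields(self) -> Tuple[str, ...]:
        # An additive node never removes or renames what its base provides.
        return self.additive_base.fields + self.added_fields


def drop_added_fields(node: AdditiveSketch) -> Tuple[str, ...]:
    """Remove the added fields; for an additive node this leaves the base's fields."""
    return tuple(f for f in node.fields if f not in node.added_fields)


# Mirrors the diagram: fields a, b, c with added_fields c over a base with a, b.
base = BaseSketch(fields=("a", "b"))
node = AdditiveSketch(additive_base=base, added_fields=("c",))
```

In this sketch, `drop_added_fields(node)` equals `base.fields`, which is the property the docstring describes.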

Comment on lines +480 to +481
    def replace_additive_base(self, node: BigFrameNode):
        return dataclasses.replace(self, left_child=node)
Collaborator

Thinking aloud: I wonder if there's some way we can organize these sorts of tree transformations so that it's easier to reason about which can be applied in which order?
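
For context on the two lines under review, the following is a rough sketch of how a `replace_additive_base` method built on `dataclasses.replace` behaves. `LeafSketch` and `InSketch` (with its `right_child` and `indicator` fields) are hypothetical names made up for this example; only `left_child`, `replace_additive_base`, and the call to `dataclasses.replace` come from the snippet above.

```python
import dataclasses
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LeafSketch:
    """Hypothetical leaf node carrying only a tuple of field names."""

    fields: Tuple[str, ...]


@dataclass(frozen=True)
class InSketch:
    """Hypothetical two-child node whose left child acts as the additive base."""

    left_child: LeafSketch
    right_child: LeafSketch
    indicator: str  # name of the single column this node adds (assumption)

    @property
    def additive_base(self) -> LeafSketch:
        return self.left_child

    def replace_additive_base(self, node: LeafSketch) -> "InSketch":
        # dataclasses.replace builds a new instance; the original is untouched.
        return dataclasses.replace(self, left_child=node)


left = LeafSketch(fields=("a", "b"))
right = LeafSketch(fields=("x",))
original = InSketch(left_child=left, right_child=right, indicator="c")

# Swap in a different base, e.g. after another rewrite changed the subtree.
new_left = LeafSketch(fields=("a", "b", "z"))
rewritten = original.replace_additive_base(new_left)
```

Because the nodes are frozen dataclasses in this sketch, the rewrite returns a new node and leaves `original` as it was, so a transformation can be applied without affecting other references to the same subtree.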

@TrevorBergeron TrevorBergeron enabled auto-merge (squash) January 29, 2025 18:50
@TrevorBergeron TrevorBergeron merged commit db087b0 into main Jan 29, 2025
23 checks passed
@TrevorBergeron TrevorBergeron deleted the isin_join branch January 29, 2025 19:42