8000 feat: Add anywidget mode to display first page results via to_pandas_batches() by shuoweil · Pull Request #1820 · googleapis/python-bigquery-dataframes · GitHub
[go: up one dir, main page]

Skip to content

feat: Add anywidget mode to display first page results via to_pandas_batches() #1820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

shuoweil
Copy link
Contributor

feat: Add anywidget mode to display first page results via to_pandas_batches()
b/391599608

@shuoweil shuoweil requested a review from tswast June 13, 2025 21:02
@shuoweil shuoweil self-assigned this Jun 13, 2025
@shuoweil shuoweil requested review from a team as code owners June 13, 2025 21:02
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Jun 13, 2025

# Instantiate and return the widget. The widget's frontend will
# handle the display of the table and pagination
return AnyWidget(dataframe=first_page)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems odd. Why is this just a generic AnyWidget object? Also, why aren't we giving it the whole iterator?

If we want adjustable page sizes in future, we might even want to pass it self so the widget can call to_pandas_batches().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments, Tim. Here is my response.
(1) why we use a generic AnyWidget object.
It is the simplest way to get data from our Python kernel to the frontend. When we just want to display the first page, what we use here is sufficient enough. However, when we try to introduce pagination button, we need to define a custom widget by providing our own frontend code. It is also the reason why I proposed to check in both first page display (to_panadas_batches()) and pagination buttons at the same time initially.

(2) why we are not give it the whole iterator
It is based on the communication method between Python and browser. Widget state must be serialized to be sent from Python backend to the frontend via JSON. However, iterators are not serializable. Batches (as a python iterator) is a stateful object that lives in Python kernel. We can only send the concrete data that iterator yields. To my understanding, Bigframes is design to handle the datasets are too large to fit in local memory, we can pass the entire dataset at once, but I feel give it the whole iterator would not be a good design choice. However, I am open to your suggestions.

(3) Adjustable page size
Thanks for the suggestion, I can definitely work on this feature.

Thanks a lot.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the simplest way to get data from our Python kernel to the frontend. When we just want to display the first page, what we use here is sufficient enough.

OK for now, but I suspect the next PR you will need to do something different.

It is based on the communication method between Python and browser. Widget state must be serialized to be sent from Python backend to the frontend via JSON. However, iterators are not serializable.

According to https://anywidget.dev/en/getting-started/ only state "defined via traitlets with the sync=True keyword argument" is synced between both. The iterator could be Python-only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback, the iterator can be held as Python-only state. The code we have for now only uses a iterator variable that is local and temporary. I would suggest we check in the paginated widget in a separate PL with a custom frontend representations.

@shuoweil shuoweil force-pushed the shuowei-widget branch 3 times, most recently from 56311bf to 9dccff7 Compare June 16, 2025 23:53
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: l Pull request size is large. labels Jun 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
0