-
Notifications
You must be signed in to change notification settings - Fork 50
feat: Add anywidget mode to display first page results via to_pandas_batches() #1820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
bigframes/dataframe.py
Outdated
|
||
# Instantiate and return the widget. The widget's frontend will | ||
# handle the display of the table and pagination | ||
return AnyWidget(dataframe=first_page) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems odd. Why is this just a generic AnyWidget object? Also, why aren't we giving it the whole iterator?
If we want adjustable page sizes in future, we might even want to pass it self
so the widget can call to_pandas_batches()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments, Tim. Here is my response.
(1) why we use a generic AnyWidget object.
It is the simplest way to get data from our Python kernel to the frontend. When we just want to display the first page, what we use here is sufficient enough. However, when we try to introduce pagination button, we need to define a custom widget by providing our own frontend code. It is also the reason why I proposed to check in both first page display (to_panadas_batches()) and pagination buttons at the same time initially.
(2) why we are not give it the whole iterator
It is based on the communication method between Python and browser. Widget state must be serialized to be sent from Python backend to the frontend via JSON. However, iterators are not serializable. Batches (as a python iterator) is a stateful object that lives in Python kernel. We can only send the concrete data that iterator yields. To my understanding, Bigframes is design to handle the datasets are too large to fit in local memory, we can pass the entire dataset at once, but I feel give it the whole iterator would not be a good design choice. However, I am open to your suggestions.
(3) Adjustable page size
Thanks for the suggestion, I can definitely work on this feature.
Thanks a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is the simplest way to get data from our Python kernel to the frontend. When we just want to display the first page, what we use here is sufficient enough.
OK for now, but I suspect the next PR you will need to do something different.
It is based on the communication method between Python and browser. Widget state must be serialized to be sent from Python backend to the frontend via JSON. However, iterators are not serializable.
According to https://anywidget.dev/en/getting-started/ only state "defined via traitlets with the sync=True keyword argument" is synced between both. The iterator could be Python-only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback, the iterator can be held as Python-only state. The code we have for now only uses a iterator variable that is local and temporary. I would suggest we check in the paginated widget in a separate PL with a custom frontend representations.
56311bf
to
9dccff7
Compare
feat: Add anywidget mode to display first page results via to_pandas_batches()
b/391599608