Closed
Labels
api: bigquery (Issues related to the BigQuery API.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
Description
`load_table_from_dataframe` currently uses `BytesIO` when it serializes a pandas DataFrame to parquet before uploading it via a load job. This violates the contract for `to_parquet`, which requires a file path. `BytesIO` happens to work when pyarrow is used, but not with fastparquet. A more minor reason to serialize to disk is that DataFrames can sometimes be quite large, so spilling to disk would be preferable to filling up memory. Note: the function should clean up after itself by removing the temp file after the load job completes.
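A minimal sketch of the proposed approach, using only the standard library so that the temp-file handling and cleanup are visible on their own. The helper names `temporary_parquet_path` and `load_from_temp_file` are hypothetical and not part of google-cloud-bigquery. The `serialize` and `upload` callables stand in for the DataFrame serialization and the load job, respectively.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_parquet_path(suffix=".parquet"):
    """Yield a path to a fresh temporary file and delete it on exit.

    Hypothetical helper for illustration only.
    """
    # mkstemp returns an open OS-level handle; close it so the serializer
    # can reopen the file by path (needed on Windows).
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        yield path
    finally:
        # Clean up even if serialization or the upload raises.
        if os.path.exists(path):
            os.remove(path)


def load_from_temp_file(serialize, upload):
    """Serialize to a temp file, then pass the open file to `upload`.

    `serialize(path)` writes the data to `path`.
    `upload(file_obj)` is assumed to block until the load job finishes,
    so the file is only removed after the job completes.
    """
    with temporary_parquet_path() as path:
        serialize(path)
        with open(path, "rb") as file_obj:
            return upload(file_obj)
```

With pandas and a BigQuery client, `serialize` could plausibly be `dataframe.to_parquet` and `upload` could be something like `lambda f: client.load_table_from_file(f, destination, job_config=job_config).result()`, with `job_config` set to the parquet source format. This is an assumption about how the pieces would be wired together, not the library's actual implementation.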