BigQuery: Upload pandas DataFrame containing arrays · Issue #19 · googleapis/python-bigquery · GitHub

BigQuery: Upload pandas DataFrame containing arrays #19

@AETDDraper

Description

The documentation for the Python BigQuery API indicates that array columns are supported. However, loading a pandas DataFrame that contains arrays into BigQuery fails with a pyarrow struct error.

The only workaround seems to be to drop the affected columns and then use JSON normalize to build a separate table from them.
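
For illustration, the workaround described above could look roughly like the sketch below. The sample records and the column names (`id`, `name`, `items`, `sku`, `qty`) are hypothetical, not taken from the real data, and the sketch assumes the nested column holds lists of dictionaries.

```python
import pandas as pd

# Hypothetical sample rows; "items" stands in for the nested array column.
records = [
    {"id": 1, "name": "first", "items": [{"sku": "a", "qty": 2}, {"sku": "b", "qty": 1}]},
    {"id": 2, "name": "second", "items": [{"sku": "c", "qty": 5}]},
]

# Parent table: the original rows with the nested column dropped.
parent = pd.DataFrame(records).drop(columns=["items"])

# Child table: one row per array element, keyed back to the parent by "id".
child = pd.json_normalize(records, record_path="items", meta=["id"])
```

Both `parent` and `child` are then flat DataFrames, so each could be loaded into its own table. The cost is that the nesting is lost and has to be rebuilt with a join on `id`.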

from google.cloud import bigquery

# `credentials` (a google.auth credentials object) and `appended_data`
# (the pandas DataFrame containing array columns) are defined elsewhere.
project = 'lake'
client = bigquery.Client(credentials=credentials, project=project)
dataset_ref = client.dataset('XXX')
table_ref = dataset_ref.table('RAW_XXX')
job_config = bigquery.LoadJobConfig()
job_config.autodetect = True
job_config.write_disposition = 'WRITE_TRUNCATE'

# Start the load job and wait for it to finish.
client.load_table_from_dataframe(appended_data, table_ref, job_config=job_config).result()

This is the error received: NotImplementedError: struct

I wanted to use this API because it indicates nested array support, which would be perfect for our data lake in BQ, but I assume this doesn't work?
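
Another possible route, sketched here only as an assumption and not confirmed anywhere in this issue, is to skip the DataFrame serialization step and write the rows as newline-delimited JSON, a format BigQuery load jobs accept for nested and repeated fields. The helper name `rows_to_ndjson` and the sample rows are hypothetical.

```python
import io
import json

def rows_to_ndjson(rows):
    """Serialize an iterable of dict rows to newline-delimited JSON text."""
    return "".join(json.dumps(row) + "\n" for row in rows)

# Hypothetical rows; with pandas these might come from
# appended_data.to_dict(orient="records").
rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
]

payload = rows_to_ndjson(rows)
buffer = io.BytesIO(payload.encode("utf-8"))

# Not executed here: `buffer` could then be handed to a load job, for example
# client.load_table_from_file(buffer, table_ref, job_config=job_config)
# with job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON.
```

This keeps the arrays inside a single table, but it assumes every value in the rows is JSON-serializable. Timestamps or NumPy types from a DataFrame would need converting first.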

Metadata

Labels

api: bigquery (Issues related to the googleapis/python-bigquery API.)
type: feature request (‘Nice-to-have’ improvement, new feature or different behavior or design.)
