BigQuery: Client.query modifies job_config object · Issue #9727 · googleapis/google-cloud-python

BigQuery: Client.query modifies job_config object #9727

@tswast

Description

Steps to reproduce

  1. Create a QueryJobConfig object.
  2. Pass the object to the Client.query method.
  3. Observe that the configuration object has changed.

Code example

from google.cloud import bigquery

client = bigquery.Client()

# First call: a dry run that only estimates the bytes the query would process.
config = bigquery.QueryJobConfig(dry_run=True)
job = client.query(
    """
    SELECT name, SUM(number) AS total_people
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    """,
    job_config=config
)
print(job.total_bytes_processed)

# Second call: reuse the same config object with dry_run turned off.
config.dry_run = False
job = client.query(
    """
    SELECT name, SUM(number) AS total_people
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    GROUP BY name
    """,
    job_config=config
)
job.result()  # raises BadRequest, see the stack trace

Stack trace

---------------------------------------------------------------------------
BadRequest                                Traceback (most recent call last)
<ipython-input-17-97f4f20faa6a> in <module>
----> 1 job.result()

~/miniconda3/envs/scratch/lib/python3.7/site-packages/google/cloud/bigquery/job.py in result(self, timeout, page_size, retry, max_results)
   2937         """
   2938         try:
-> 2939             super(QueryJob, self).result(timeout=timeout)
   2940 
   2941             # Return an iterator instead of returning the job.

~/miniconda3/envs/scratch/lib/python3.7/site-packages/google/cloud/bigquery/job.py in result(self, timeout, retry)
    732             self._begin(retry=retry)
    733         # TODO: modify PollingFuture so it can pass a retry argument to done().
--> 734         return super(_AsyncJob, self).result(timeout=timeout)
    735 
    736     def cancelled(self):

~/miniconda3/envs/scratch/lib/python3.7/site-packages/google/api_core/future/polling.py in result(self, timeout)
    125             # pylint: disable=raising-bad-type
    126             # Pylint doesn't recognize that this is valid in this case.
--> 127             raise self._exception
    128 
    129         return self._result

BadRequest: 400 Cannot explicitly modify anonymous table swast-scratch:_def2aa82a75fc33513bfb65968607e1f49148f83.anon561d4b35377f56c5960a4a49be253d3c90c120db

(job ID: 2692472c-1213-4716-8393-d9c49fd7b678)

                -----Query Job SQL Follows-----                

    |    .    |    .    |    .    |    .    |    .    |
   1:
   2:    SELECT name, SUM(number) AS total_people
   3:    FROM `bigquery-public-data.usa_names.usa_1910_current`
   4:    GROUP BY name
   5:    
    |    .    |    .    |    .    |    .    |    .    |

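Running a job can modify the configuration object that was passed to it. In this example, the dry-run job sets the destination table on config, so the second query, which reuses the same object, fails with the BadRequest error above. It is unexpected for a client method to modify the caller's original Python object.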
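The mechanism can be reproduced without BigQuery. The sketch below uses hypothetical stand-ins: `FakeJobConfig` and `fake_query` are illustrative names, not part of google-cloud-bigquery. It assumes the callee records a destination table on the very object it was handed. The real library's internals may differ.

```python
class FakeJobConfig:
    """Hypothetical stand-in for a job configuration object."""

    def __init__(self, dry_run=False):
        self.dry_run = dry_run
        self.destination = None


def fake_query(sql, job_config):
    """Hypothetical stand-in for a client method that mutates its argument."""
    # Assumption: the destination is written onto the caller's object
    # rather than onto a private copy.
    if job_config.destination is None:
        job_config.destination = "project._anon_dataset.anon_table"
    return {"sql": sql, "destination": job_config.destination}


config = FakeJobConfig(dry_run=True)
fake_query("SELECT 1", job_config=config)

# The caller's object now carries state the caller never set.
print(config.destination)
```

Because Python passes object references, any attribute the callee sets is visible to the caller afterwards, which is why a reused config can behave differently on the second call.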

I propose that the query method (and the other "job" methods on the client) be updated to make a deep copy of the job_config argument before passing it on to the job constructor.
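A minimal sketch of the proposed approach, again using hypothetical stand-ins (`FakeJobConfig`, `FakeJob`, and this `query` function are illustrative and are not the library's code). Only `copy.deepcopy` from the standard library is real here.

```python
import copy


class FakeJobConfig:
    """Hypothetical stand-in for a job configuration object."""

    def __init__(self, dry_run=False):
        self.dry_run = dry_run
        self.destination = None


class FakeJob:
    """Hypothetical job whose constructor mutates the config it receives."""

    def __init__(self, sql, job_config):
        if job_config.destination is None:
            job_config.destination = "anonymous_table"
        self.sql = sql
        self.config = job_config


def query(sql, job_config=None):
    """Hypothetical client method that copies the config before using it."""
    if job_config is None:
        job_config = FakeJobConfig()
    else:
        # Deep copy so that nothing the job does leaks back to the caller.
        job_config = copy.deepcopy(job_config)
    return FakeJob(sql, job_config)


config = FakeJobConfig(dry_run=True)
job = query("SELECT 1", job_config=config)

print(config.destination)      # the caller's object is unchanged
print(job.config.destination)  # only the job's private copy was modified
```

A deep copy rather than a shallow one matters when the configuration holds nested mutable state, since a shallow copy would still share those inner objects with the caller.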

Labels

  api: bigquery (Issues related to the BigQuery API.)
  priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)
  type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)