Hi! I was looking into migrating one of our projects from google-api-services-bigquery to google-cloud-bigquery & noticed a possible regression in the Extract job config in the client library. One of our integration tests extracts data from bigquery-public-data:samples.shakespeare into a GCS bucket in our own GCP project. However, after migrating, the test fails because the extract job can't find the BQ table {OUR_GCP_PROJECT}:samples.shakespeare.
Some Scala code replicating the issue:
import java.util.UUID
import com.google.cloud.bigquery.{ExtractJobConfiguration, JobInfo, TableId}

val gcpProject = "some-gcp-project"
val bqClient: com.google.cloud.bigquery.BigQuery = ... // authenticated to $gcpProject
// The source table lives in the public-data project, not in $gcpProject
val sourceTableId = TableId.of("bigquery-public-data", "samples", "shakespeare")
val destGcsUri = s"gs://$gcpProject/it/${UUID.randomUUID}"
val config = ExtractJobConfiguration
  .newBuilder(sourceTableId, destGcsUri)
  .setFormat("AVRO")
val jobInfo = JobInfo.newBuilder(config.build()).build()
print(jobInfo) // before submission the source project is still bigquery-public-data
val job = bqClient.create(jobInfo).waitFor()
print(job.getStatus.getError)
This prints
JobInfo{job=null, status=null, statistics=null, userEmail=null, etag=null, generatedId=null, selfLink=null, configuration=ExtractJobConfiguration{type=EXTRACT, sourceTable={datasetId=samples, projectId=bigquery-public-data, tableId=shakespeare}, destinationUris=[gs://some-gcp-project/it/9270d23f-de63-41e8-a56a-e7c140297e38], format=AVRO, printHeader=null, fieldDelimiter=null, compression=null}}
and
BigQueryError{reason=notFound, location=null, message=Not found: Table some-gcp-project:samples.shakespeare was not found in location US}
I think the cause is that ExtractJobConfiguration overrides setProjectId so that the client's credentialed projectId is applied to the source table, replacing the project the table was originally created with.
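To make the suspicion concrete, here is a minimal, self-contained sketch of the difference between overwriting the project and only filling it in when it is missing. It does not use or reproduce the library's code: `TableRef`, `ProjectDefaulting`, `overwrite`, and `fillIfMissing` are hypothetical names invented for illustration, and the assumption is simply that the client applies its own project to the source table somewhere between building the job and submitting it.

```scala
// Hypothetical stand-in for a table reference; NOT a google-cloud-bigquery class.
final case class TableRef(project: Option[String], dataset: String, table: String)

object ProjectDefaulting {
  // Suspected behaviour: the client's project always replaces the table's project,
  // even when the caller set one explicitly.
  def overwrite(t: TableRef, clientProject: String): TableRef =
    t.copy(project = Some(clientProject))

  // Expected behaviour: the client's project is only a fallback for tables
  // that were created without a project.
  def fillIfMissing(t: TableRef, clientProject: String): TableRef =
    t.copy(project = t.project.orElse(Some(clientProject)))
}

object Demo extends App {
  val source = TableRef(Some("bigquery-public-data"), "samples", "shakespeare")
  val clientProject = "some-gcp-project"

  // The explicit project is lost, matching the "not found" error above.
  println(ProjectDefaulting.overwrite(source, clientProject))
  // The explicit project survives.
  println(ProjectDefaulting.fillIfMissing(source, clientProject))
}
```

If the real builder behaves like `overwrite`, that would explain why the printed JobInfo still shows bigquery-public-data while the submitted job looks for the table in some-gcp-project.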
What do you think? Am I just misusing the new API?
thanks!