Google Bigquery: Partitioning specification needed for copying date partitioned table

google-bigquery

Note: this is nearly a duplicate of this question with the distinction that in this case, the source table is date partitioned and the destination table does not yet exist. Also, the accepted solution to that question didn't work in this case.

I'm trying to copy a single day's worth of data from one date partitioned table into a new date partitoined table that I have not yet created. My hope is that BigQuery would simply create the date-partitioned destination table for me like it usually does for the non-date-partitioned case.

Using BigQuery CLI, here's my command:

bq cp mydataset.sourcetable\$20161231 mydataset.desttable\$20161231

Here's the output of that command:

BigQuery error in cp operation: Error processing job
'myproject:bqjob_bqjobid': Partitioning specification must be provided
in order to create partitioned table

I've tried doing something similar using the python SDK: running a select command on a date partitioned table (which selects data from only one date partition) and saving the results into a new destination table (which I hope would also be date partitioned). The job fails with the same error:

{u'message': u'Partitioning specification must be provided in order to
create partitioned table', u'reason': u'invalid'}

Clearly I need to add a partitioning specification, but I couldn't find any documentation on how to do so.

Best Answer

You need to create the partitioned destination table first (as per the docs):

If you want to copy a partitioned table into another partitioned table, the partition specifications for the source and destination tables must match.

So, just create the destination partitioned table before you start copying. If you can't be bothered specifying the schema, you can create the destination partitioned table like so:

bq mk --time_partitioning_type=DAY mydataset.temps

Then, use a query instead of a copy to write to the destination table. The schema will be copied with it:

bq query --allow_large_results --replace --destination_table 'mydataset.temps$20160101''SELECT * from `source`'
Related Topic