googleBigqueryTable
Creates a table resource in a dataset for Google BigQuery. For more information see the official documentation and API.
-> Note: On newer versions of the provider, you must explicitly set deletionProtection=false
(and run terraformApply
to write the field to state) in order to destroy an instance. It is recommended to not set this field (or set it to true) until you're ready to destroy.
Example Usage
/*Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details.*/
import * as google from "./.gen/providers/google";
/*The following providers are missing schema information and might need manual adjustments to synthesize correctly: google.
For a more precise conversion please use the --provider flag in convert.*/
const googleBigqueryDatasetDefault = new google.bigqueryDataset.BigqueryDataset(
this,
"default",
{
dataset_id: "foo",
default_table_expiration_ms: 3600000,
description: "This is a test description",
friendly_name: "test",
labels: [
{
env: "default",
},
],
location: "EU",
}
);
const googleBigqueryTableDefault = new google.bigqueryTable.BigqueryTable(
this,
"default_1",
{
dataset_id: googleBigqueryDatasetDefault.datasetId,
labels: [
{
env: "default",
},
],
schema:
'[\n {\n "name": "permalink",\n "type": "STRING",\n "mode": "NULLABLE",\n "description": "The Permalink"\n },\n {\n "name": "state",\n "type": "STRING",\n "mode": "NULLABLE",\n "description": "State where the head office is located"\n }\n]\n',
table_id: "bar",
time_partitioning: [
{
type: "DAY",
},
],
}
);
/*This allows the Terraform resource name to match the original name. You can remove the call if you don't need them to match.*/
googleBigqueryTableDefault.overrideLogicalId("default");
new google.bigqueryTable.BigqueryTable(this, "sheet", {
dataset_id: googleBigqueryDatasetDefault.datasetId,
external_data_configuration: [
{
autodetect: true,
google_sheets_options: [
{
skip_leading_rows: 1,
},
],
source_format: "GOOGLE_SHEETS",
source_uris: ["https://docs.google.com/spreadsheets/d/123456789012345"],
},
],
table_id: "sheet",
});
Argument Reference
The following arguments are supported:
-
datasetId
- (Required) The dataset ID to create the table in. Changing this forces a new resource to be created. -
tableId
- (Required) A unique ID for the resource. Changing this forces a new resource to be created. -
project
- (Optional) The ID of the project in which the resource belongs. If it is not provided, the provider project is used. -
description
- (Optional) The field description. -
expirationTime
- (Optional) The time when this table expires, in milliseconds since the epoch. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed. -
externalDataConfiguration
- (Optional) Describes the data format, location, and other properties of a table stored outside of BigQuery. By defining these properties, the data source can then be queried as if it were a standard BigQuery table. Structure is documented below. -
friendlyName
- (Optional) A descriptive name for the table. -
encryptionConfiguration
- (Optional) Specifies how the table should be encrypted. If left blank, the table will be encrypted with a Google-managed key; that process is transparent to the user. Structure is documented below. -
labels
- (Optional) A mapping of labels to assign to the resource. -
schema
- (Optional) A JSON schema for the table.\~>NOTE: Because this field expects a JSON string, any changes to the string will create a diff, even if the JSON itself hasn't changed. If the API returns a different value for the same schema, e.g. it switched the order of values or replaced
struct
field type withrecord
field type, we currently cannot suppress the recurring diff this causes. As a workaround, we recommend using the schema as returned by the API.\~>NOTE: When setting
schema
forexternalDataConfiguration
, please useexternalDataConfigurationSchema
documented below. -
timePartitioning
- (Optional) If specified, configures time-based partitioning for this table. Structure is documented below. -
rangePartitioning
- (Optional) If specified, configures range-based partitioning for this table. Structure is documented below. -
clustering
- (Optional) Specifies column names to use for data clustering. Up to four top-level columns are allowed, and should be specified in descending priority order. -
view
- (Optional) If specified, configures this table as a view. Structure is documented below. -
materializedView
- (Optional) If specified, configures this table as a materialized view. Structure is documented below. -
deletionProtection
- (Optional) Whether or not to allow Terraform to destroy the instance. Unless this field is set to false in Terraform state, aterraformDestroy
orterraformApply
that would delete the instance will fail.
The externalDataConfiguration
block supports:
-
autodetect
- (Required) - Let BigQuery try to autodetect the schema and format of the table. -
compression
(Optional) - The compression type of the data source. Valid values are "NONE" or "GZIP". -
connectionId
(Optional) - The connection specifying the credentials to be used to read external storage, such as Azure Blob, Cloud Storage, or S3. TheconnectionId
can have the form{{project}}.{{location}}.{{connectionId}}
orprojects/{{project}}/locations/{{location}}/connections/{{connectionId}}
. -
csvOptions
(Optional) - Additional properties to set ifsourceFormat
is set to "CSV". Structure is documented below. -
googleSheetsOptions
(Optional) - Additional options ifsourceFormat
is set to "GOOGLE_SHEETS". Structure is documented below. -
hivePartitioningOptions
(Optional) - When set, configures hive partitioning support. Not all storage formats support hive partitioning -- requesting hive partitioning on an unsupported format will lead to an error, as will providing an invalid specification. Structure is documented below. -
avroOptions
(Optional) - Additional options ifsourceFormat
is set to\ "AVRO". Structure is documented below. -
ignoreUnknownValues
(Optional) - Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. -
maxBadRecords
(Optional) - The maximum number of bad records that BigQuery can ignore when reading data. -
schema
- (Optional) A JSON schema for the external table. Schema is required for CSV and JSON formats if autodetect is not on. Schema is disallowed for Google Cloud Bigtable, Cloud Datastore backups, Avro, ORC and Parquet formats. \~>NOTE: Because this field expects a JSON string, any changes to the string will create a diff, even if the JSON itself hasn't changed. Furthermore drift for this field cannot not be detected because BigQuery only uses this schema to compute the effective schema for the table, therefore any changes on the configured value will force the table to be recreated. This schema is effectively only applied when creating a table from an external datasource, after creation the computed schema will be stored ingoogleBigqueryTableSchema
-
sourceFormat
(Required) - The data format. Supported values are: "CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", "ORC", "DATSTORE_BACKUP", and "BIGTABLE". To use "GOOGLE_SHEETS" thescopes
must include "https://www.googleapis.com/auth/drive.readonly". -
sourceUris
- (Required) A list of the fully-qualified URIs that point to your data in Google Cloud. -
referenceFileSchemaUri
- (Optional) When creating an external table, the user can provide a reference file with the table schema. This is enabled for the following formats: AVRO, PARQUET, ORC.
The csvOptions
block supports:
-
quote
(Required) - The value that is used to quote data sections in a CSV file. If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set theallowQuotedNewlines
property to true. The API-side default is"
, specified in Terraform escaped as\"
. Due to limitations with Terraform default values, this value is required to be explicitly set. -
allowJaggedRows
(Optional) - Indicates if BigQuery should accept rows that are missing trailing optional columns. -
allowQuotedNewlines
(Optional) - Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false. -
encoding
(Optional) - The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. -
fieldDelimiter
(Optional) - The separator for fields in a CSV file. -
skipLeadingRows
(Optional) - The number of rows at the top of a CSV file that BigQuery will skip when reading the data.
The googleSheetsOptions
block supports:
-
range
(Optional) - Range of a sheet to query from. Only used when non-empty. At least one ofrange
orskipLeadingRows
must be set. Typical format: "sheet_name!top_left_cell_id:bottom_right_cell_id" For example: "sheet1!A1:B20" -
skipLeadingRows
(Optional) - The number of rows at the top of the sheet that BigQuery will skip when reading the data. At least one ofrange
orskipLeadingRows
must be set.
The hivePartitioningOptions
block supports:
-
mode
(Optional) - When set, what mode of hive partitioning to use when reading data. The following modes are supported.- AUTO: automatically infer partition key name(s) and type(s).
- STRINGS: automatically infer partition key name(s). All types are Not all storage formats support hive partitioning. Requesting hive partitioning on an unsupported format will lead to an error. Currently supported formats are: JSON, CSV, ORC, Avro and Parquet.
- CUSTOM: when set to
custom
, you must encode the partition key schema within thesourceUriPrefix
by settingsourceUriPrefix
togs://bucket/pathToTable/{key1:type1}/{key2:type2}/{key3:type3}
.
-
requirePartitionFilter
- (Optional) If set to true, queries over this table require a partition filter that can be used for partition elimination to be specified. -
sourceUriPrefix
(Optional) - When hive partition detection is requested, a common for all source uris must be required. The prefix must end immediately before the partition key encoding begins. For example, consider files following this data layout.gs://bucket/pathToTable/dt=20190601/country=usa/id=7/fileAvro
gs://bucket/pathToTable/dt=20190531/country=ca/id=3/fileAvro
When hive partitioning is requested with either AUTO or STRINGS detection, the common prefix can be either ofgs://bucket/pathToTable
orgs://bucket/pathToTable/
. Note that whenmode
is set tocustom
, you must encode the partition key schema within thesourceUriPrefix
by settingsourceUriPrefix
togs://bucket/pathToTable/{key1:type1}/{key2:type2}/{key3:type3}
.
The avroOptions
block supports:
useAvroLogicalTypes
(Optional) - If is set to true, indicates whether\ to interpret logical types as the corresponding BigQuery data type\ (for example, TIMESTAMP), instead of using the raw type (for example, INTEGER).
The timePartitioning
block supports:
-
expirationMs
- (Optional) Number of milliseconds for which to keep the storage for a partition. -
field
- (Optional) The field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, the table is partitioned based on the load time. -
type
- (Required) The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively. -
requirePartitionFilter
- (Optional) If set to true, queries over this table require a partition filter that can be used for partition elimination to be specified.
The rangePartitioning
block supports:
-
field
- (Required) The field used to determine how to create a range-based partition. -
range
- (Required) Information required to partition based on ranges. Structure is documented below.
-
start
- (Required) Start of the range partitioning, inclusive. -
end
- (Required) End of the range partitioning, exclusive. -
interval
- (Required) The width of each range within the partition.
-
query
- (Required) A query that BigQuery executes when the view is referenced. -
useLegacySql
- (Optional) Specifies whether to use BigQuery's legacy SQL for this view. The default value is true. If set to false, the view will use BigQuery's standard SQL.
The materializedView
block supports:
-
query
- (Required) A query whose result is persisted. -
enableRefresh
- (Optional) Specifies whether to use BigQuery's automatic refresh for this materialized view when the base table is updated. The default value is true. -
refreshIntervalMs
- (Optional) The maximum frequency at which this materialized view will be refreshed. The default value is 1800000
The encryptionConfiguration
block supports the following arguments:
kmsKeyName
- (Required) The self link or full name of a key which should be used to encrypt this table. Note that the default bigquery service account will need to have encrypt/decrypt permissions on this key - you may want to see thegoogleBigqueryDefaultServiceAccount
datasource and thegoogleKmsCryptoKeyIamBinding
resource.
Attributes Reference
In addition to the arguments listed above, the following computed attributes are exported:
-
id
- an identifier for the resource with formatprojects/{{project}}/datasets/{{dataset}}/tables/{{name}}
-
creationTime
- The time when this table was created, in milliseconds since the epoch. -
etag
- A hash of the resource. -
kmsKeyVersion
- The self link or full name of the kms key version used to encrypt this table. -
lastModifiedTime
- The time when this table was last modified, in milliseconds since the epoch. -
location
- The geographic location where the table resides. This value is inherited from the dataset. -
numBytes
- The size of this table in bytes, excluding any data in the streaming buffer. -
numLongTermBytes
- The number of bytes in the table that are considered "long-term storage". -
numRows
- The number of rows of data in this table, excluding any data in the streaming buffer. -
selfLink
- The URI of the created resource. -
type
- Describes the table type.
Import
BigQuery tables imported using any of these accepted formats: