googleBigqueryTable

Creates a table resource in a dataset for Google BigQuery. For more information, see the official documentation and API.

-> Note: On newer versions of the provider, you must explicitly set deletionProtection=false (and run terraform apply to write the field to state) in order to destroy a table. It is recommended to not set this field (or to set it to true) until you're ready to destroy.

Example Usage

/*Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details.*/
import * as google from "./.gen/providers/google";
/*The following providers are missing schema information and might need manual adjustments to synthesize correctly: google.
For a more precise conversion please use the --provider flag in convert.*/
const googleBigqueryDatasetDefault = new google.bigqueryDataset.BigqueryDataset(
  this,
  "default",
  {
    dataset_id: "foo",
    default_table_expiration_ms: 3600000,
    description: "This is a test description",
    friendly_name: "test",
    labels: [
      {
        env: "default",
      },
    ],
    location: "EU",
  }
);
const googleBigqueryTableDefault = new google.bigqueryTable.BigqueryTable(
  this,
  "default_1",
  {
    dataset_id: googleBigqueryDatasetDefault.datasetId,
    labels: [
      {
        env: "default",
      },
    ],
    schema:
      '[\n  {\n    "name": "permalink",\n    "type": "STRING",\n    "mode": "NULLABLE",\n    "description": "The Permalink"\n  },\n  {\n    "name": "state",\n    "type": "STRING",\n    "mode": "NULLABLE",\n    "description": "State where the head office is located"\n  }\n]\n',
    table_id: "bar",
    time_partitioning: [
      {
        type: "DAY",
      },
    ],
  }
);
/*This allows the Terraform resource name to match the original name. You can remove the call if you don't need them to match.*/
googleBigqueryTableDefault.overrideLogicalId("default");
new google.bigqueryTable.BigqueryTable(this, "sheet", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  external_data_configuration: [
    {
      autodetect: true,
      google_sheets_options: [
        {
          skip_leading_rows: 1,
        },
      ],
      source_format: "GOOGLE_SHEETS",
      source_uris: ["https://docs.google.com/spreadsheets/d/123456789012345"],
    },
  ],
  table_id: "sheet",
});

Argument Reference

The following arguments are supported:

  • datasetId - (Required) The dataset ID to create the table in. Changing this forces a new resource to be created.

  • tableId - (Required) A unique ID for the resource. Changing this forces a new resource to be created.

  • project - (Optional) The ID of the project in which the resource belongs. If it is not provided, the provider project is used.

  • description - (Optional) The field description.

  • expirationTime - (Optional) The time when this table expires, in milliseconds since the epoch. If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.

  • externalDataConfiguration - (Optional) Describes the data format, location, and other properties of a table stored outside of BigQuery. By defining these properties, the data source can then be queried as if it were a standard BigQuery table. Structure is documented below.

  • friendlyName - (Optional) A descriptive name for the table.

  • encryptionConfiguration - (Optional) Specifies how the table should be encrypted. If left blank, the table will be encrypted with a Google-managed key; that process is transparent to the user. Structure is documented below.

  • labels - (Optional) A mapping of labels to assign to the resource.

  • schema - (Optional) A JSON schema for the table.

~> Note: Because this field expects a JSON string, any changes to the string will create a diff, even if the JSON itself hasn't changed. If the API returns a different value for the same schema, e.g. it switched the order of values or replaced struct field type with record field type, we currently cannot suppress the recurring diff this causes. As a workaround, we recommend using the schema as returned by the API.

~> Note: When setting a schema for externalDataConfiguration, use the schema attribute nested inside externalDataConfiguration, documented below.

  • timePartitioning - (Optional) If specified, configures time-based partitioning for this table. Structure is documented below.

  • rangePartitioning - (Optional) If specified, configures range-based partitioning for this table. Structure is documented below.

  • clustering - (Optional) Specifies column names to use for data clustering. Up to four top-level columns are allowed, and should be specified in descending priority order.

  • view - (Optional) If specified, configures this table as a view. Structure is documented below.

  • materializedView - (Optional) If specified, configures this table as a materialized view. Structure is documented below.

  • deletionProtection - (Optional) Whether or not to allow Terraform to destroy the table. Unless this field is set to false in Terraform state, a terraform destroy or terraform apply that would delete the table will fail. A short sketch combining this field with clustering follows this list.
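
A minimal sketch combining several of the arguments above (clustering, schema, and deletionProtection), written in the same convert-style bindings as the examples above; the table and column names are illustrative:

new google.bigqueryTable.BigqueryTable(this, "clustered", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "clustered",
  // Allow a later terraform destroy to remove this table.
  deletion_protection: false,
  // Up to four top-level columns, in descending priority order.
  clustering: ["customer_id", "event_date"],
  schema:
    '[{"name":"customer_id","type":"STRING","mode":"NULLABLE"},{"name":"event_date","type":"DATE","mode":"NULLABLE"}]',
});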

The externalDataConfiguration block supports:

  • autodetect - (Required) Let BigQuery try to autodetect the schema and format of the table.

  • compression - (Optional) The compression type of the data source. Valid values are "NONE" or "GZIP".

  • connectionId - (Optional) The connection specifying the credentials to be used to read external storage, such as Azure Blob, Cloud Storage, or S3. The connectionId can have the form {{project}}.{{location}}.{{connectionId}} or projects/{{project}}/locations/{{location}}/connections/{{connectionId}}.

  • csvOptions - (Optional) Additional properties to set if sourceFormat is set to "CSV". Structure is documented below.

  • googleSheetsOptions - (Optional) Additional options if sourceFormat is set to "GOOGLE_SHEETS". Structure is documented below.

  • hivePartitioningOptions - (Optional) When set, configures hive partitioning support. Not all storage formats support hive partitioning -- requesting hive partitioning on an unsupported format will lead to an error, as will providing an invalid specification. Structure is documented below.

  • avroOptions - (Optional) Additional options if sourceFormat is set to "AVRO". Structure is documented below.

  • ignoreUnknownValues - (Optional) Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false.

  • maxBadRecords - (Optional) The maximum number of bad records that BigQuery can ignore when reading data.

  • schema - (Optional) A JSON schema for the external table. Schema is required for CSV and JSON formats if autodetect is not on. Schema is disallowed for Google Cloud Bigtable, Cloud Datastore backups, Avro, ORC and Parquet formats. ~> Note: Because this field expects a JSON string, any changes to the string will create a diff, even if the JSON itself hasn't changed. Furthermore, drift for this field cannot be detected because BigQuery only uses this schema to compute the effective schema for the table, so any change to the configured value will force the table to be recreated. This schema is effectively only applied when creating a table from an external data source; after creation, the computed schema is stored in googleBigqueryTable.schema.

  • sourceFormat - (Required) The data format. Supported values are: "CSV", "GOOGLE_SHEETS", "NEWLINE_DELIMITED_JSON", "AVRO", "PARQUET", "ORC", "DATASTORE_BACKUP", and "BIGTABLE". To use "GOOGLE_SHEETS" the scopes must include "https://www.googleapis.com/auth/drive.readonly".

  • sourceUris - (Required) A list of the fully-qualified URIs that point to your data in Google Cloud.

  • referenceFileSchemaUri - (Optional) When creating an external table, the user can provide a reference file with the table schema. This is enabled for the following formats: AVRO, PARQUET, ORC.

The csvOptions block supports:

  • quote - (Required) The value that is used to quote data sections in a CSV file. If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true. The API-side default is ", specified in Terraform escaped as \". Due to limitations with Terraform default values, this value is required to be explicitly set.

  • allowJaggedRows - (Optional) Indicates if BigQuery should accept rows that are missing trailing optional columns.

  • allowQuotedNewlines - (Optional) Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.

  • encoding - (Optional) The character encoding of the data. The supported values are UTF-8 or ISO-8859-1.

  • fieldDelimiter - (Optional) The separator for fields in a CSV file.

  • skipLeadingRows - (Optional) The number of rows at the top of a CSV file that BigQuery will skip when reading the data.
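
A sketch of an external CSV table tying these options together, assuming a placeholder Cloud Storage path; quote must be set explicitly, and schema is required here because autodetect is off:

new google.bigqueryTable.BigqueryTable(this, "csv", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "csv",
  external_data_configuration: [
    {
      autodetect: false,
      // Required for CSV when autodetect is off.
      schema:
        '[{"name":"permalink","type":"STRING","mode":"NULLABLE"},{"name":"state","type":"STRING","mode":"NULLABLE"}]',
      source_format: "CSV",
      source_uris: ["gs://my-bucket/data/*.csv"],
      csv_options: [
        {
          // '"' restores the API-side default quote character.
          quote: '"',
          allow_quoted_newlines: false,
          encoding: "UTF-8",
          field_delimiter: ",",
          skip_leading_rows: 1,
        },
      ],
    },
  ],
});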

The googleSheetsOptions block supports:

  • range - (Optional) Range of a sheet to query from. Only used when non-empty. At least one of range or skipLeadingRows must be set. Typical format: "sheet_name!top_left_cell_id:bottom_right_cell_id", for example "sheet1!A1:B20".

  • skipLeadingRows - (Optional) The number of rows at the top of the sheet that BigQuery will skip when reading the data. At least one of range or skipLeadingRows must be set.
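
A variant of the sheet example from Example Usage that queries an explicit range instead of skipping leading rows (the spreadsheet URI is the same placeholder):

new google.bigqueryTable.BigqueryTable(this, "sheet_range", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "sheet_range",
  external_data_configuration: [
    {
      autodetect: true,
      google_sheets_options: [
        {
          // At least one of range or skip_leading_rows must be set.
          range: "sheet1!A1:B20",
        },
      ],
      source_format: "GOOGLE_SHEETS",
      source_uris: ["https://docs.google.com/spreadsheets/d/123456789012345"],
    },
  ],
});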

The hivePartitioningOptions block supports:

  • mode - (Optional) When set, what mode of hive partitioning to use when reading data. The following modes are supported:

    • AUTO: automatically infer partition key name(s) and type(s).
    • STRINGS: automatically infer partition key name(s). All types are strings.
    • CUSTOM: when set to CUSTOM, you must encode the partition key schema within the sourceUriPrefix by setting sourceUriPrefix to gs://bucket/pathToTable/{key1:type1}/{key2:type2}/{key3:type3}.

    Not all storage formats support hive partitioning; requesting hive partitioning on an unsupported format will lead to an error. Currently supported formats are: JSON, CSV, ORC, Avro and Parquet.

  • requirePartitionFilter - (Optional) If set to true, queries over this table require a partition filter that can be used for partition elimination to be specified.

  • sourceUriPrefix - (Optional) When hive partition detection is requested, a common prefix for all source URIs is required. The prefix must end immediately before the partition key encoding begins. For example, consider files following this data layout: gs://bucket/pathToTable/dt=20190601/country=usa/id=7/file.avro and gs://bucket/pathToTable/dt=20190531/country=ca/id=3/file.avro. When hive partitioning is requested with either AUTO or STRINGS detection, the common prefix can be either gs://bucket/pathToTable or gs://bucket/pathToTable/. Note that when mode is set to CUSTOM, you must encode the partition key schema within the sourceUriPrefix by setting sourceUriPrefix to gs://bucket/pathToTable/{key1:type1}/{key2:type2}/{key3:type3}.
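
A sketch of AUTO hive partitioning over the example layout above (the bucket and prefix are placeholders):

new google.bigqueryTable.BigqueryTable(this, "hive", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "hive",
  external_data_configuration: [
    {
      autodetect: true,
      hive_partitioning_options: [
        {
          // Infer the dt, country, and id partition keys from the URI layout.
          mode: "AUTO",
          require_partition_filter: true,
          source_uri_prefix: "gs://bucket/pathToTable/",
        },
      ],
      source_format: "AVRO",
      source_uris: ["gs://bucket/pathToTable/*"],
    },
  ],
});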

The avroOptions block supports:

  • useAvroLogicalTypes - (Optional) If set to true, indicates whether to interpret logical types as the corresponding BigQuery data type (for example, TIMESTAMP), instead of using the raw type (for example, INTEGER).
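
For example, a sketch of an external Avro table (the bucket path is a placeholder):

new google.bigqueryTable.BigqueryTable(this, "avro", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "avro",
  external_data_configuration: [
    {
      autodetect: true,
      avro_options: [
        {
          // Interpret Avro logical types as BigQuery types such as TIMESTAMP.
          use_avro_logical_types: true,
        },
      ],
      source_format: "AVRO",
      source_uris: ["gs://my-bucket/data/*.avro"],
    },
  ],
});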

The timePartitioning block supports:

  • expirationMs - (Optional) Number of milliseconds for which to keep the storage for a partition.

  • field - (Optional) The field used to determine how to create a time-based partition. If time-based partitioning is enabled without this value, the table is partitioned based on the load time.

  • type - (Required) The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.

  • requirePartitionFilter - (Optional) If set to true, queries over this table require a partition filter that can be used for partition elimination to be specified.
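
A sketch of column-based daily partitioning (the table and column names are illustrative; 7776000000 ms is 90 days):

new google.bigqueryTable.BigqueryTable(this, "partitioned", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "partitioned",
  schema:
    '[{"name":"ts","type":"TIMESTAMP","mode":"NULLABLE"},{"name":"payload","type":"STRING","mode":"NULLABLE"}]',
  time_partitioning: [
    {
      type: "DAY",
      // Partition on the ts column rather than on load time.
      field: "ts",
      // Reclaim partition storage after 90 days.
      expiration_ms: 7776000000,
      require_partition_filter: true,
    },
  ],
});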

The rangePartitioning block supports:

  • field - (Required) The field used to determine how to create a range-based partition.

  • range - (Required) Information required to partition based on ranges. Structure is documented below.

The range block supports:

  • start - (Required) Start of the range partitioning, inclusive.

  • end - (Required) End of the range partitioning, exclusive.

  • interval - (Required) The width of each range within the partition.
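
A sketch of integer-range partitioning (names are illustrative); this produces ten partitions covering [0, 100) in steps of 10:

new google.bigqueryTable.BigqueryTable(this, "range_partitioned", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "range_partitioned",
  schema:
    '[{"name":"customer_id","type":"INTEGER","mode":"REQUIRED"},{"name":"name","type":"STRING","mode":"NULLABLE"}]',
  range_partitioning: [
    {
      field: "customer_id",
      range: [
        {
          // [0, 100) split into partitions of width 10.
          start: 0,
          end: 100,
          interval: 10,
        },
      ],
    },
  ],
});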

The view block supports:

  • query - (Required) A query that BigQuery executes when the view is referenced.

  • useLegacySql - (Optional) Specifies whether to use BigQuery's legacy SQL for this view. The default value is true. If set to false, the view will use BigQuery's standard SQL.
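
A minimal sketch of a standard-SQL view over the example table above (my-project is a placeholder project ID):

new google.bigqueryTable.BigqueryTable(this, "view", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "view",
  view: [
    {
      query: "SELECT state FROM `my-project.foo.bar`",
      // Standard SQL rather than the legacy-SQL default.
      use_legacy_sql: false,
    },
  ],
});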

The materializedView block supports:

  • query - (Required) A query whose result is persisted.

  • enableRefresh - (Optional) Specifies whether to use BigQuery's automatic refresh for this materialized view when the base table is updated. The default value is true.

  • refreshIntervalMs - (Optional) The maximum frequency at which this materialized view will be refreshed. The default value is 1800000 (30 minutes).
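
A sketch of a materialized view aggregating the example table above (my-project is a placeholder project ID):

new google.bigqueryTable.BigqueryTable(this, "mv", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "mv",
  materialized_view: [
    {
      query:
        "SELECT state, COUNT(*) AS n FROM `my-project.foo.bar` GROUP BY state",
      enable_refresh: true,
      // Refresh at most every 30 minutes (the default).
      refresh_interval_ms: 1800000,
    },
  ],
});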

The encryptionConfiguration block supports the following arguments:

  • kmsKeyName - (Required) The self link or full name of a key which should be used to encrypt this table. Note that the default BigQuery service account will need encrypt/decrypt permissions on this key - you may want to see the googleBigqueryDefaultServiceAccount data source and the googleKmsCryptoKeyIamBinding resource.
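
A sketch of a CMEK-encrypted table; the key name is a placeholder and must reference an existing key that the BigQuery service account can use:

new google.bigqueryTable.BigqueryTable(this, "encrypted", {
  dataset_id: googleBigqueryDatasetDefault.datasetId,
  table_id: "encrypted",
  encryption_configuration: [
    {
      // Placeholder key; grant the BigQuery service account
      // roles/cloudkms.cryptoKeyEncrypterDecrypter on it first.
      kms_key_name:
        "projects/my-project/locations/us/keyRings/my-ring/cryptoKeys/my-key",
    },
  ],
});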

Attributes Reference

In addition to the arguments listed above, the following computed attributes are exported:

  • id - An identifier for the resource, with the format projects/{{project}}/datasets/{{dataset}}/tables/{{name}}.

  • creationTime - The time when this table was created, in milliseconds since the epoch.

  • etag - A hash of the resource.

  • kmsKeyVersion - The self link or full name of the kms key version used to encrypt this table.

  • lastModifiedTime - The time when this table was last modified, in milliseconds since the epoch.

  • location - The geographic location where the table resides. This value is inherited from the dataset.

  • numBytes - The size of this table in bytes, excluding any data in the streaming buffer.

  • numLongTermBytes - The number of bytes in the table that are considered "long-term storage".

  • numRows - The number of rows of data in this table, excluding any data in the streaming buffer.

  • selfLink - The URI of the created resource.

  • type - Describes the table type.

Import

BigQuery tables can be imported using any of these accepted formats:

$ terraform import google_bigquery_table.default projects/{{project}}/datasets/{{dataset_id}}/tables/{{table_id}}
$ terraform import google_bigquery_table.default {{project}}/{{dataset_id}}/{{table_id}}
$ terraform import google_bigquery_table.default {{dataset_id}}/{{table_id}}