
azurermDataFactoryLinkedServiceAzureDatabricks

Manages a Linked Service (connection) between Azure Databricks and Azure Data Factory.

Example Usage with managed identity & new cluster

/*Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details.*/
import * as azurerm from "./.gen/providers/azurerm";
/*The following providers are missing schema information and might need manual adjustments to synthesize correctly: azurerm.
For a more precise conversion please use the --provider flag in convert.*/
const azurermResourceGroupExample = new azurerm.resourceGroup.ResourceGroup(
  this,
  "example",
  {
    location: "East US",
    name: "example",
  }
);
const azurermDataFactoryExample = new azurerm.dataFactory.DataFactory(
  this,
  "example_1",
  {
    identity: [
      {
        type: "SystemAssigned",
      },
    ],
    location: azurermResourceGroupExample.location,
    name: "TestDtaFactory92783401247",
    resource_group_name: azurermResourceGroupExample.name,
  }
);
/*This allows the Terraform resource name to match the original name. You can remove the call if you don't need them to match.*/
azurermDataFactoryExample.overrideLogicalId("example");
const azurermDatabricksWorkspaceExample =
  new azurerm.databricksWorkspace.DatabricksWorkspace(this, "example_2", {
    location: azurermResourceGroupExample.location,
    name: "databricks-test",
    resource_group_name: azurermResourceGroupExample.name,
    sku: "standard",
  });
/*This allows the Terraform resource name to match the original name. You can remove the call if you don't need them to match.*/
azurermDatabricksWorkspaceExample.overrideLogicalId("example");
new azurerm.dataFactoryLinkedServiceAzureDatabricks.DataFactoryLinkedServiceAzureDatabricks(
  this,
  "msi_linked",
  {
    adb_domain: `https://\${${azurermDatabricksWorkspaceExample.workspaceUrl}}`,
    data_factory_id: azurermDataFactoryExample.id,
    description: "ADB Linked Service via MSI",
    msi_work_space_resource_id: azurermDatabricksWorkspaceExample.id,
    name: "ADBLinkedServiceViaMSI",
    new_cluster_config: [
      {
        cluster_version: "5.5.x-gpu-scala2.11",
        custom_tags: [
          {
            custom_tag1: "sct_value_1",
            custom_tag2: "sct_value_2",
          },
        ],
        driver_node_type: "Standard_NC12",
        init_scripts: ["init.sh", "init2.sh"],
        log_destination: "dbfs:/logs",
        max_number_of_workers: 5,
        min_number_of_workers: 1,
        node_type: "Standard_NC12",
        spark_config: [
          {
            config1: "value1",
            config2: "value2",
          },
        ],
        spark_environment_variables: [
          {
            envVar1: "value1",
            envVar2: "value2",
          },
        ],
      },
    ],
  }
);

Example Usage with access token & existing cluster

/*Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details.*/
import * as azurerm from "./.gen/providers/azurerm";
/*The following providers are missing schema information and might need manual adjustments to synthesize correctly: azurerm.
For a more precise conversion please use the --provider flag in convert.*/
const azurermResourceGroupExample = new azurerm.resourceGroup.ResourceGroup(
  this,
  "example",
  {
    location: "East US",
    name: "example",
  }
);
const azurermDataFactoryExample = new azurerm.dataFactory.DataFactory(
  this,
  "example_1",
  {
    location: azurermResourceGroupExample.location,
    name: "TestDtaFactory92783401247",
    resource_group_name: azurermResourceGroupExample.name,
  }
);
/*This allows the Terraform resource name to match the original name. You can remove the call if you don't need them to match.*/
azurermDataFactoryExample.overrideLogicalId("example");
const azurermDatabricksWorkspaceExample =
  new azurerm.databricksWorkspace.DatabricksWorkspace(this, "example_2", {
    location: azurermResourceGroupExample.location,
    name: "databricks-test",
    resource_group_name: azurermResourceGroupExample.name,
    sku: "standard",
  });
/*This allows the Terraform resource name to match the original name. You can remove the call if you don't need them to match.*/
azurermDatabricksWorkspaceExample.overrideLogicalId("example");
new azurerm.dataFactoryLinkedServiceAzureDatabricks.DataFactoryLinkedServiceAzureDatabricks(
  this,
  "at_linked",
  {
    access_token: "SomeDatabricksAccessToken",
    adb_domain: `https://\${${azurermDatabricksWorkspaceExample.workspaceUrl}}`,
    data_factory_id: azurermDataFactoryExample.id,
    description: "ADB Linked Service via Access Token",
    existing_cluster_id: "0308-201146-sly615",
    name: "ADBLinkedServiceViaAccessToken",
  }
);

Arguments Reference

The following arguments are supported:

  • adbDomain - (Required) The domain URL of the Databricks instance.

  • dataFactoryId - (Required) The ID of the Data Factory with which to associate the Linked Service. Changing this forces a new resource to be created.

  • name - (Required) Specifies the name of the Data Factory Linked Service. Changing this forces a new resource to be created. Must be unique within a data factory. See the Microsoft documentation for all restrictions.


You must specify exactly one of the following authentication blocks:

  • accessToken - (Optional) Authenticate to ADB via an access token.

  • keyVaultPassword - (Optional) Authenticate to ADB via an Azure Key Vault Linked Service, as defined in the keyVaultPassword block below.

  • msiWorkSpaceResourceId - (Optional) Authenticate to ADB via managed service identity.


You must specify exactly one of the following modes for cluster integration:

  • existingClusterId - (Optional) The cluster_id of an existing cluster within the linked ADB instance.

  • instancePool - (Optional) Leverages an instance pool within the linked ADB instance, as defined in the instancePool block below.

  • newClusterConfig - (Optional) Creates new clusters within the linked ADB instance as defined in the newClusterConfig block below.


The following optional arguments are also supported:

  • additionalProperties - (Optional) A map of additional properties to associate with the Data Factory Linked Service.

  • annotations - (Optional) List of tags that can be used for describing the Data Factory Linked Service.

  • description - (Optional) The description for the Data Factory Linked Service.

  • integrationRuntimeName - (Optional) The integration runtime reference to associate with the Data Factory Linked Service.

  • parameters - (Optional) A map of parameters to associate with the Data Factory Linked Service.


A keyVaultPassword block supports the following:

  • linkedServiceName - (Required) Specifies the name of an existing Key Vault Data Factory Linked Service.

  • secretName - (Required) Specifies the name of the secret in Azure Key Vault that stores the ADB access token.
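As a sketch of how the keyVaultPassword block fits together, in the same cdktf-convert style as the examples above. The example assumes the resource group, Data Factory, and workspace resources from the earlier examples; the Key Vault linked service name, secret name, and cluster ID are hypothetical placeholders:

```typescript
// Sketch only: assumes an existing Key Vault Data Factory Linked Service
// named "kv-linked-service" that holds the ADB access token.
new azurerm.dataFactoryLinkedServiceAzureDatabricks.DataFactoryLinkedServiceAzureDatabricks(
  this,
  "kv_linked",
  {
    adb_domain: `https://\${${azurermDatabricksWorkspaceExample.workspaceUrl}}`,
    data_factory_id: azurermDataFactoryExample.id,
    description: "ADB Linked Service via Key Vault",
    existing_cluster_id: "0308-201146-sly615", // hypothetical cluster ID
    key_vault_password: [
      {
        linked_service_name: "kv-linked-service", // hypothetical linked service
        secret_name: "databricks-access-token", // hypothetical secret name
      },
    ],
    name: "ADBLinkedServiceViaKeyVault",
  }
);
```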


A newClusterConfig block supports the following:

  • clusterVersion - (Required) Spark version of the cluster.

  • nodeType - (Required) Node type for the new cluster.

  • customTags - (Optional) Tags for the cluster resource.

  • driverNodeType - (Optional) Driver node type for the cluster.

  • initScripts - (Optional) User defined initialization scripts for the cluster.

  • logDestination - (Optional) Location to deliver Spark driver, worker, and event logs.

  • maxNumberOfWorkers - (Optional) Specifies the maximum number of worker nodes. It should be between 1 and 25000.

  • minNumberOfWorkers - (Optional) Specifies the minimum number of worker nodes. It should be between 1 and 25000. It defaults to 1.

  • sparkConfig - (Optional) User-specified Spark configuration variables key-value pairs.

  • sparkEnvironmentVariables - (Optional) User-specified Spark environment variables key-value pairs.


An instancePool block supports the following:

  • instancePoolId - (Required) Identifier of the instance pool within the linked ADB instance.

  • clusterVersion - (Required) Spark version of the cluster.

  • minNumberOfWorkers - (Optional) The minimum number of worker nodes. Defaults to 1.

  • maxNumberOfWorkers - (Optional) The maximum number of worker nodes. Set this value to enable autoscaling between minNumberOfWorkers and this value. Omit it to use a fixed number of workers defined by the minNumberOfWorkers property.
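A sketch of the instancePool cluster-integration mode, reusing the resources from the examples above. The instance pool ID and cluster version are hypothetical placeholders:

```typescript
// Sketch only: the pool ID below is a hypothetical placeholder for a
// pool that already exists in the linked ADB workspace.
new azurerm.dataFactoryLinkedServiceAzureDatabricks.DataFactoryLinkedServiceAzureDatabricks(
  this,
  "pool_linked",
  {
    access_token: "SomeDatabricksAccessToken",
    adb_domain: `https://\${${azurermDatabricksWorkspaceExample.workspaceUrl}}`,
    data_factory_id: azurermDataFactoryExample.id,
    description: "ADB Linked Service via Instance Pool",
    instance_pool: [
      {
        cluster_version: "10.4.x-scala2.12", // hypothetical Spark version
        instance_pool_id: "0000-000000-pool00000", // hypothetical pool ID
        max_number_of_workers: 8, // autoscale between 2 and 8 workers
        min_number_of_workers: 2,
      },
    ],
    name: "ADBLinkedServiceViaInstancePool",
  }
);
```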


Attributes Reference

In addition to the Arguments listed above - the following Attributes are exported:

  • id - The ID of the Data Factory Linked Service.

Timeouts

The timeouts block allows you to specify timeouts for certain actions:

  • create - (Defaults to 30 minutes) Used when creating the Data Factory Linked Service.
  • read - (Defaults to 5 minutes) Used when retrieving the Data Factory Linked Service.
  • update - (Defaults to 30 minutes) Used when updating the Data Factory Linked Service.
  • delete - (Defaults to 30 minutes) Used when deleting the Data Factory Linked Service.
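These defaults can be overridden per resource. A minimal sketch of the timeouts argument in the cdktf-convert style used above, assuming the resources from the earlier examples (the timeout values shown are arbitrary):

```typescript
// Sketch only: extends the managed-identity example with custom timeouts.
new azurerm.dataFactoryLinkedServiceAzureDatabricks.DataFactoryLinkedServiceAzureDatabricks(
  this,
  "msi_linked_with_timeouts",
  {
    adb_domain: `https://\${${azurermDatabricksWorkspaceExample.workspaceUrl}}`,
    data_factory_id: azurermDataFactoryExample.id,
    msi_work_space_resource_id: azurermDatabricksWorkspaceExample.id,
    name: "ADBLinkedServiceViaMSI",
    timeouts: {
      create: "45m", // allow extra time for creation
      delete: "30m",
      read: "5m",
      update: "45m",
    },
  }
);
```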

Import

Data Factory Linked Services can be imported using the resourceId, e.g.

terraform import azurerm_data_factory_linked_service_azure_databricks.example /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example/providers/Microsoft.DataFactory/factories/example/linkedservices/example