Resource: awsSagemakerEndpointConfiguration

Provides a SageMaker endpoint configuration resource.

Example Usage

Basic usage:

/*Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details.*/
import * as aws from "./.gen/providers/aws";
new aws.sagemakerEndpointConfiguration.SagemakerEndpointConfiguration(
  this,
  "ec",
  {
    name: "my-endpoint-config",
    productionVariants: [
      {
        initialInstanceCount: 1,
        instanceType: "ml.t2.medium",
        modelName: "${aws_sagemaker_model.m.name}",
        variantName: "variant-1",
      },
    ],
    tags: {
      Name: "foo",
    },
  }
);

Argument Reference

The following arguments are supported:

  • productionVariants - (Required) A list of ProductionVariant objects, one for each model that you want to host at this endpoint. Fields are documented below.
  • kmsKeyArn - (Optional) Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint.
  • name - (Optional) The name of the endpoint configuration. If omitted, Terraform will assign a random, unique name.
  • tags - (Optional) A mapping of tags to assign to the resource. If configured with a provider defaultTags configuration block present, tags with matching keys will overwrite those defined at the provider-level.
  • dataCaptureConfig - (Optional) Specifies the parameters to capture the input and output of SageMaker model endpoints. Fields are documented below.
  • asyncInferenceConfig - (Optional) Specifies configuration for how an endpoint performs asynchronous inference. Fields are documented below.
  • shadowProductionVariants - (Optional) A list of ProductionVariant objects, one for each model that you want to host at this endpoint in shadow mode, with production traffic replicated from the model specified in productionVariants. If you use this field, you can specify only one variant for productionVariants and one variant for shadowProductionVariants. Fields are documented below.
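The shadow-mode restriction described above can be checked before the configuration is synthesized. The following is a minimal sketch, not part of the provider: `VariantSketch` and `checkShadowConstraint` are hypothetical names introduced here for illustration, and the type carries only the fields needed for the check.

```typescript
// Hypothetical local type: only the fields needed for this check.
interface VariantSketch {
  modelName: string;
  variantName?: string;
}

// Returns a list of problems; an empty list means the documented
// shadow-mode constraint (one production variant, one shadow variant)
// is satisfied.
function checkShadowConstraint(
  productionVariants: VariantSketch[],
  shadowProductionVariants: VariantSketch[] = []
): string[] {
  const problems: string[] = [];
  if (shadowProductionVariants.length > 0) {
    if (productionVariants.length !== 1) {
      problems.push("shadow mode allows exactly one production variant");
    }
    if (shadowProductionVariants.length !== 1) {
      problems.push("shadow mode allows exactly one shadow variant");
    }
  }
  return problems;
}

// Two production variants plus a shadow variant violates the constraint.
const shadowProblems = checkShadowConstraint(
  [{ modelName: "model-a" }, { modelName: "model-b" }],
  [{ modelName: "model-c" }]
);
console.log(shadowProblems);
```

Running a check like this in the stack constructor surfaces the problem at synth time rather than as an API error during apply.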

productionVariants

  • acceleratorType - (Optional) The size of the Elastic Inference (EI) instance to use for the production variant.
  • containerStartupHealthCheckTimeoutInSeconds - (Optional) The timeout value, in seconds, for your inference container to pass the health check performed by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests. Valid values are between 60 and 3600.
  • coreDumpConfig - (Optional) Specifies configuration for a core dump from the model container when the process crashes. Fields are documented below.
  • initialInstanceCount - (Optional) Initial number of instances used for auto-scaling.
  • instanceType - (Optional) The type of instance to start.
  • initialVariantWeight - (Optional) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. If unspecified, it defaults to 1.0.
  • modelDataDownloadTimeoutInSeconds - (Optional) The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this production variant. Valid values between 60 and 3600.
  • modelName - (Required) The name of the model to use.
  • serverlessConfig - (Optional) Specifies configuration for how an endpoint performs serverless inference. Fields are documented below.
  • variantName - (Optional) The name of the variant. If omitted, Terraform will assign a random, unique name.
  • volumeSizeInGb - (Optional) The size, in GB, of the ML storage volume attached to each individual inference instance associated with the production variant. Valid values are between 1 and 512.
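To make the effect of initialVariantWeight concrete: in SageMaker, each variant receives a share of traffic equal to its weight divided by the sum of all variant weights. The sketch below computes those shares for explicitly weighted variants. `WeightedVariant` and `trafficShares` are hypothetical helpers written for this illustration, not provider APIs.

```typescript
// Hypothetical local type: just the two fields relevant to traffic splitting.
interface WeightedVariant {
  variantName: string;
  initialVariantWeight: number;
}

// Share of traffic per variant = its weight / sum of all weights.
function trafficShares(variants: WeightedVariant[]): Record<string, number> {
  const total = variants.reduce((sum, v) => sum + v.initialVariantWeight, 0);
  const result: Record<string, number> = {};
  for (const v of variants) {
    result[v.variantName] = total > 0 ? v.initialVariantWeight / total : 0;
  }
  return result;
}

const variantShares = trafficShares([
  { variantName: "variant-1", initialVariantWeight: 3 },
  { variantName: "variant-2", initialVariantWeight: 1 },
]);
console.log(variantShares["variant-1"]); // 0.75
console.log(variantShares["variant-2"]); // 0.25
```

Only the ratio between weights matters, so weights of 3 and 1 split traffic the same way as 0.75 and 0.25.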

coreDumpConfig

  • destinationS3Uri - (Required) The Amazon S3 bucket to send the core dump to.
  • kmsKeyId - (Required) The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that SageMaker uses to encrypt the core dump data at rest using Amazon S3 server-side encryption.

serverlessConfig

  • maxConcurrency - (Required) The maximum number of concurrent invocations your serverless endpoint can process. Valid values are between 1 and 200.
  • memorySizeInMb - (Required) The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
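The documented limits for serverlessConfig are easy to check ahead of deployment. This is a minimal sketch under the ranges listed above; `ServerlessConfigSketch`, `ALLOWED_MEMORY_MB`, and `validateServerlessConfig` are hypothetical names, and the real types come from the generated provider bindings.

```typescript
// Hypothetical local type mirroring the two documented fields.
interface ServerlessConfigSketch {
  maxConcurrency: number;
  memorySizeInMb: number;
}

// Memory sizes listed in the field description (1 GB increments).
const ALLOWED_MEMORY_MB = [1024, 2048, 3072, 4096, 5120, 6144];

// Returns a list of problems; an empty list means both values are in range.
function validateServerlessConfig(cfg: ServerlessConfigSketch): string[] {
  const problems: string[] = [];
  const isWhole = cfg.maxConcurrency % 1 === 0;
  if (!isWhole || cfg.maxConcurrency < 1 || cfg.maxConcurrency > 200) {
    problems.push("maxConcurrency must be a whole number between 1 and 200");
  }
  if (ALLOWED_MEMORY_MB.indexOf(cfg.memorySizeInMb) === -1) {
    problems.push("memorySizeInMb must be one of " + ALLOWED_MEMORY_MB.join(", "));
  }
  return problems;
}

console.log(validateServerlessConfig({ maxConcurrency: 5, memorySizeInMb: 2048 }));
console.log(validateServerlessConfig({ maxConcurrency: 0, memorySizeInMb: 1500 }));
```

The first call reports no problems; the second reports one problem per field.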

dataCaptureConfig

  • initialSamplingPercentage - (Required) Portion of data to capture. Should be between 0 and 100.
  • destinationS3Uri - (Required) The URL for S3 location where the captured data is stored.
  • captureOptions - (Required) Specifies what data to capture. Fields are documented below.
  • kmsKeyId - (Optional) Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt the captured data on Amazon S3.
  • enableCapture - (Optional) Flag to enable data capture. Defaults to false.
  • captureContentTypeHeader - (Optional) The content type headers to capture. Fields are documented below.

captureOptions

  • captureMode - (Required) Specifies the data to be captured. Should be one of Input or Output.

captureContentTypeHeader

  • csvContentTypes - (Optional) The CSV content type headers to capture.
  • jsonContentTypes - (Optional) The JSON content type headers to capture.
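The nested dataCaptureConfig fields fit together as in the sketch below, which builds a plain object capturing both requests and responses. The local types are hypothetical stand-ins for the generated bindings, the S3 URI is a placeholder, and `buildDataCaptureConfig` is a helper invented for this example. The capture modes are spelled `Input` and `Output`, as in the SageMaker API.

```typescript
// Hypothetical local types mirroring the documented fields; in a real
// project the generated provider bindings supply the actual types.
type CaptureMode = "Input" | "Output";

interface DataCaptureConfigSketch {
  initialSamplingPercentage: number;
  destinationS3Uri: string;
  captureOptions: { captureMode: CaptureMode }[];
  enableCapture?: boolean;
  kmsKeyId?: string;
  captureContentTypeHeader?: {
    csvContentTypes?: string[];
    jsonContentTypes?: string[];
  };
}

// Builds a config that captures both requests and responses.
// Throws if the sampling percentage is outside the documented 0-100 range.
function buildDataCaptureConfig(
  destinationS3Uri: string,
  samplingPercentage: number
): DataCaptureConfigSketch {
  if (!(samplingPercentage >= 0 && samplingPercentage <= 100)) {
    throw new RangeError("initialSamplingPercentage must be between 0 and 100");
  }
  return {
    enableCapture: true, // documented default is false, so opt in explicitly
    initialSamplingPercentage: samplingPercentage,
    destinationS3Uri,
    captureOptions: [{ captureMode: "Input" }, { captureMode: "Output" }],
    captureContentTypeHeader: {
      csvContentTypes: ["text/csv"],
      jsonContentTypes: ["application/json"],
    },
  };
}

// "s3://example-bucket/capture" is a placeholder URI.
const captureConfig = buildDataCaptureConfig("s3://example-bucket/capture", 25);
console.log(JSON.stringify(captureConfig, null, 2));
```

An object of this shape could then be passed as the dataCaptureConfig argument of the resource, assuming the generated bindings accept the same field names.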

asyncInferenceConfig

  • outputConfig - (Required) Specifies the configuration for asynchronous inference invocation outputs.
  • clientConfig - (Optional) Configures the behavior of the client used by Amazon SageMaker to interact with the model container during asynchronous inference.

clientConfig

  • maxConcurrentInvocationsPerInstance - (Optional) The maximum number of concurrent requests sent by the SageMaker client to the model container. If no value is provided, Amazon SageMaker will choose an optimal value for you.

outputConfig

  • s3OutputPath - (Required) The Amazon S3 location to upload inference responses to.
  • kmsKeyId - (Optional) The Amazon Web Services Key Management Service (Amazon Web Services KMS) key that Amazon SageMaker uses to encrypt the asynchronous inference output in Amazon S3.
  • notificationConfig - (Optional) Specifies the configuration for notifications of inference results for asynchronous inference.

notificationConfig

  • errorTopic - (Optional) Amazon SNS topic to post a notification to when inference fails. If no topic is provided, no notification is sent on failure.
  • successTopic - (Optional) Amazon SNS topic to post a notification to when inference completes successfully. If no topic is provided, no notification is sent on success.
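The asyncInferenceConfig blocks nest three levels deep, which the following sketch lays out as a plain object. The interface is a hypothetical stand-in for the generated bindings, the S3 path and SNS topic ARN are placeholders, and `notifiedOutcomes` is a helper invented here to show which outcomes produce a notification.

```typescript
// Hypothetical local type mirroring the documented nesting.
interface AsyncInferenceConfigSketch {
  outputConfig: {
    s3OutputPath: string;
    kmsKeyId?: string;
    notificationConfig?: {
      errorTopic?: string;
      successTopic?: string;
    };
  };
  clientConfig?: {
    maxConcurrentInvocationsPerInstance?: number;
  };
}

// Lists which outcomes trigger an SNS notification; per the field
// descriptions, an outcome without a topic sends nothing.
function notifiedOutcomes(cfg: AsyncInferenceConfigSketch): string[] {
  const topics = cfg.outputConfig.notificationConfig;
  const outcomes: string[] = [];
  if (topics && topics.successTopic) outcomes.push("success");
  if (topics && topics.errorTopic) outcomes.push("error");
  return outcomes;
}

// Placeholder bucket path and topic ARN; only failures are announced here.
const asyncConfig: AsyncInferenceConfigSketch = {
  outputConfig: {
    s3OutputPath: "s3://example-bucket/async-results",
    notificationConfig: {
      errorTopic: "arn:aws:sns:us-east-1:123456789012:inference-errors",
    },
  },
  clientConfig: { maxConcurrentInvocationsPerInstance: 4 },
};

console.log(notifiedOutcomes(asyncConfig));
```

Leaving out clientConfig entirely is also valid, in which case SageMaker chooses the concurrency value itself.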

Attributes Reference

In addition to all arguments above, the following attributes are exported:

  • arn - The Amazon Resource Name (ARN) assigned by AWS to this endpoint configuration.
  • name - The name of the endpoint configuration.
  • tagsAll - A map of tags assigned to the resource, including those inherited from the provider defaultTags configuration block.

Import

Endpoint configurations can be imported using the name, e.g.,

$ terraform import aws_sagemaker_endpoint_configuration.test_endpoint_config endpoint-config-foo