Resource: awsSagemakerEndpointConfiguration
Provides a SageMaker endpoint configuration resource.
Example Usage
Basic usage:
/*Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details.*/
import * as aws from "./.gen/providers/aws";
new aws.sagemakerEndpointConfiguration.SagemakerEndpointConfiguration(
  this,
  "ec",
  {
    name: "my-endpoint-config",
    productionVariants: [
      {
        initialInstanceCount: 1,
        instanceType: "ml.t2.medium",
        modelName: "${aws_sagemaker_model.m.name}",
        variantName: "variant-1",
      },
    ],
    tags: {
      Name: "foo",
    },
  }
);
Argument Reference
The following arguments are supported:
productionVariants - (Required) A list of ProductionVariant objects, one for each model that you want to host at this endpoint. Fields are documented below.
kmsKeyArn - (Optional) Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint.
name - (Optional) The name of the endpoint configuration. If omitted, Terraform will assign a random, unique name.
tags - (Optional) A map of tags to assign to the resource. If configured with a provider defaultTags configuration block present, tags with matching keys will overwrite those defined at the provider level.
dataCaptureConfig - (Optional) Specifies the parameters to capture input/output of SageMaker model endpoints. Fields are documented below.
asyncInferenceConfig - (Optional) Specifies configuration for how an endpoint performs asynchronous inference. Fields are documented below.
shadowProductionVariants - (Optional) A list of ProductionVariant objects, one for each model that you want to host at this endpoint in shadow mode, with production traffic replicated from the model specified in productionVariants. If you use this field, you can specify only one variant for productionVariants and one variant for shadowProductionVariants. Fields are documented below.
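The one-production-variant, one-shadow-variant rule can be checked before synthesis. The following is a minimal sketch under stated assumptions: the `VariantRef` and `EndpointVariants` types and the `shadowVariantErrors` helper are hypothetical names introduced for illustration and are not part of the generated provider bindings.

```typescript
// Hypothetical local types; only the fields needed for the check are modeled.
interface VariantRef {
  modelName: string;
  variantName?: string;
}

interface EndpointVariants {
  productionVariants: VariantRef[];
  shadowProductionVariants?: VariantRef[];
}

// Returns a list of problems; an empty list means the shadow-mode rule holds.
function shadowVariantErrors(cfg: EndpointVariants): string[] {
  const errors: string[] = [];
  const shadows = cfg.shadowProductionVariants ?? [];
  if (shadows.length > 0) {
    // Shadow mode: exactly one production variant and one shadow variant.
    if (cfg.productionVariants.length !== 1) {
      errors.push("shadow mode allows exactly one production variant");
    }
    if (shadows.length !== 1) {
      errors.push("shadow mode allows exactly one shadow production variant");
    }
  }
  return errors;
}
```

A configuration without shadow variants passes unchanged, so the helper can be run on every endpoint configuration.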
productionVariants
acceleratorType - (Optional) The size of the Elastic Inference (EI) instance to use for the production variant.
containerStartupHealthCheckTimeoutInSeconds - (Optional) The timeout value, in seconds, for your inference container to pass the health check by SageMaker Hosting. For more information about health checks, see How Your Container Should Respond to Health Check (Ping) Requests. Valid values are between 60 and 3600.
coreDumpConfig - (Optional) Specifies configuration for a core dump from the model container when the process crashes. Fields are documented below.
initialInstanceCount - (Optional) Initial number of instances used for auto-scaling.
instanceType - (Optional) The type of instance to start.
initialVariantWeight - (Optional) Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. If unspecified, it defaults to 1.0.
modelDataDownloadTimeoutInSeconds - (Optional) The timeout value, in seconds, to download and extract the model that you want to host from Amazon S3 to the individual inference instance associated with this production variant. Valid values are between 60 and 3600.
modelName - (Required) The name of the model to use.
serverlessConfig - (Optional) Specifies configuration for how an endpoint performs serverless inference. Fields are documented below.
variantName - (Optional) The name of the variant. If omitted, Terraform will assign a random, unique name.
volumeSizeInGb - (Optional) The size, in GB, of the ML storage volume attached to each inference instance associated with the production variant. Valid values are between 1 and 512.
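The numeric limits listed for a production variant can be checked with a small helper. This is a sketch only: `VariantLimits`, `RANGES`, and `outOfRangeFields` are hypothetical names, and treating both bounds as inclusive is an assumption, since the text only says "between".

```typescript
// Hypothetical subset of the productionVariants fields that carry a range.
interface VariantLimits {
  containerStartupHealthCheckTimeoutInSeconds?: number;
  modelDataDownloadTimeoutInSeconds?: number;
  volumeSizeInGb?: number;
}

// Documented ranges, keyed by field name (bounds assumed inclusive).
const RANGES: Record<keyof VariantLimits, [number, number]> = {
  containerStartupHealthCheckTimeoutInSeconds: [60, 3600],
  modelDataDownloadTimeoutInSeconds: [60, 3600],
  volumeSizeInGb: [1, 512],
};

// Returns the names of the fields whose values fall outside their range.
function outOfRangeFields(v: VariantLimits): string[] {
  return (Object.keys(RANGES) as (keyof VariantLimits)[]).filter((key) => {
    const value = v[key];
    if (value === undefined) return false; // optional fields may be omitted
    const [min, max] = RANGES[key];
    return value < min || value > max;
  });
}
```

For example, `outOfRangeFields({ volumeSizeInGb: 1024 })` reports `volumeSizeInGb`, while an object with no fields set reports nothing.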
coreDumpConfig
destinationS3Uri - (Required) The Amazon S3 bucket to send the core dump to.
kmsKeyId - (Required) The AWS Key Management Service (AWS KMS) key that SageMaker uses to encrypt the core dump data at rest using Amazon S3 server-side encryption.
serverlessConfig
maxConcurrency - (Required) The maximum number of concurrent invocations your serverless endpoint can process. Valid values are between 1 and 200.
memorySizeInMb - (Required) The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
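As an illustration of the two serverlessConfig constraints, the sketch below validates a config and shows how a serverless variant might look inside productionVariants. The `ServerlessConfig` type, the `isValidServerlessConfig` helper, and the model and variant names are hypothetical placeholders, not part of the generated bindings.

```typescript
// Hypothetical local mirror of the two serverlessConfig fields.
interface ServerlessConfig {
  maxConcurrency: number;
  memorySizeInMb: number;
}

function isValidServerlessConfig(c: ServerlessConfig): boolean {
  // maxConcurrency: whole number between 1 and 200.
  const concurrencyOk =
    Math.floor(c.maxConcurrency) === c.maxConcurrency &&
    c.maxConcurrency >= 1 &&
    c.maxConcurrency <= 200;
  // memorySizeInMb: 1 GB steps from 1024 MB up to 6144 MB.
  const memoryOk =
    c.memorySizeInMb >= 1024 &&
    c.memorySizeInMb <= 6144 &&
    c.memorySizeInMb % 1024 === 0;
  return concurrencyOk && memoryOk;
}

// A serverless variant as it might appear inside productionVariants.
const serverlessVariant = {
  modelName: "my-model", // placeholder model name
  variantName: "serverless-1", // placeholder variant name
  serverlessConfig: { maxConcurrency: 5, memorySizeInMb: 2048 },
};
```

Here `isValidServerlessConfig(serverlessVariant.serverlessConfig)` evaluates to `true`, whereas a memory size such as 1536 would be rejected because it is not a multiple of 1024.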
dataCaptureConfig
initialSamplingPercentage - (Required) Portion of data to capture. Should be between 0 and 100.
destinationS3Uri - (Required) The URL of the S3 location where the captured data is stored.
captureOptions - (Required) Specifies what data to capture. Fields are documented below.
kmsKeyId - (Optional) Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt the captured data on Amazon S3.
enableCapture - (Optional) Flag to enable data capture. Defaults to false.
captureContentTypeHeader - (Optional) The content type headers to capture. Fields are documented below.
captureOptions
captureMode - (Required) Specifies the data to be captured. Should be one of Input or Output.
captureContentTypeHeader
csvContentTypes - (Optional) The CSV content type headers to capture.
jsonContentTypes - (Optional) The JSON content type headers to capture.
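To show how dataCaptureConfig, captureOptions, and captureContentTypeHeader fit together, here is a minimal sketch that builds a config capturing both requests and responses. The `DataCaptureConfig` type and the `captureBoth` helper are hypothetical, the bucket URI and content types are placeholders, and the capitalized `Input`/`Output` strings follow the SageMaker API spelling, which is an assumption here.

```typescript
type CaptureMode = "Input" | "Output";

// Hypothetical local mirror of the documented dataCaptureConfig fields.
interface DataCaptureConfig {
  initialSamplingPercentage: number;
  destinationS3Uri: string;
  captureOptions: { captureMode: CaptureMode }[];
  kmsKeyId?: string;
  enableCapture?: boolean;
  captureContentTypeHeader?: {
    csvContentTypes?: string[];
    jsonContentTypes?: string[];
  };
}

// Builds a config that captures both inputs and outputs.
function captureBoth(
  destinationS3Uri: string,
  samplingPercentage: number
): DataCaptureConfig {
  if (samplingPercentage < 0 || samplingPercentage > 100) {
    throw new RangeError("initialSamplingPercentage should be between 0 and 100");
  }
  return {
    enableCapture: true, // defaults to false when omitted
    initialSamplingPercentage: samplingPercentage,
    destinationS3Uri,
    captureOptions: [{ captureMode: "Input" }, { captureMode: "Output" }],
    captureContentTypeHeader: {
      csvContentTypes: ["text/csv"],
      jsonContentTypes: ["application/json"],
    },
  };
}
```

An object built this way, for example `captureBoth("s3://example-bucket/capture", 50)`, has the shape described for dataCaptureConfig; whether the generated bindings accept it unchanged should be confirmed against the generated types.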
asyncInferenceConfig
outputConfig - (Required) Specifies the configuration for asynchronous inference invocation outputs.
clientConfig - (Optional) Configures the behavior of the client used by Amazon SageMaker to interact with the model container during asynchronous inference.
clientConfig
maxConcurrentInvocationsPerInstance - (Optional) The maximum number of concurrent requests sent by the SageMaker client to the model container. If no value is provided, Amazon SageMaker will choose an optimal value for you.
outputConfig
s3OutputPath - (Required) The Amazon S3 location to upload inference responses to.
kmsKeyId - (Optional) The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the asynchronous inference output in Amazon S3.
notificationConfig - (Optional) Specifies the configuration for notifications of inference results for asynchronous inference.
notificationConfig
errorTopic - (Optional) Amazon SNS topic to post a notification to when inference fails. If no topic is provided, no notification is sent on failure.
successTopic - (Optional) Amazon SNS topic to post a notification to when inference completes successfully. If no topic is provided, no notification is sent on success.
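The nesting of asyncInferenceConfig, outputConfig, clientConfig, and notificationConfig can be sketched as a plain object. The `AsyncInferenceConfig` type and the `notifiedOutcomes` helper are hypothetical names for illustration, and the S3 path and SNS topic ARN are placeholders.

```typescript
// Hypothetical local mirror of the documented asyncInferenceConfig fields.
interface AsyncInferenceConfig {
  outputConfig: {
    s3OutputPath: string;
    kmsKeyId?: string;
    notificationConfig?: { errorTopic?: string; successTopic?: string };
  };
  clientConfig?: { maxConcurrentInvocationsPerInstance?: number };
}

const asyncInferenceConfig: AsyncInferenceConfig = {
  outputConfig: {
    s3OutputPath: "s3://example-bucket/async-output", // placeholder
    notificationConfig: {
      // Only failures are announced; no successTopic is set.
      errorTopic: "arn:aws:sns:us-east-1:123456789012:inference-errors", // placeholder
    },
  },
  clientConfig: { maxConcurrentInvocationsPerInstance: 4 },
};

// Lists which outcomes trigger a notification: a missing topic means none.
function notifiedOutcomes(cfg: AsyncInferenceConfig): ("success" | "failure")[] {
  const topics = cfg.outputConfig.notificationConfig ?? {};
  const outcomes: ("success" | "failure")[] = [];
  if (topics.successTopic) outcomes.push("success");
  if (topics.errorTopic) outcomes.push("failure");
  return outcomes;
}
```

With the sample object, `notifiedOutcomes(asyncInferenceConfig)` yields only `"failure"`, which matches the rule that an omitted topic sends no notification.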
Attributes Reference
In addition to all arguments above, the following attributes are exported:
arn - The Amazon Resource Name (ARN) assigned by AWS to this endpoint configuration.
name - The name of the endpoint configuration.
tagsAll - A map of tags assigned to the resource, including those inherited from the provider defaultTags configuration block.
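The relationship between tags, the provider defaultTags block, and tagsAll can be pictured as a simple merge in which resource-level tags win on matching keys. This is a sketch of the described semantics, not the provider's implementation; `effectiveTags` is a hypothetical helper and the provider-level tags are invented for the example.

```typescript
type Tags = Record<string, string>;

// Merge provider-level default tags with resource tags; on a key collision
// the resource-level value overwrites the provider-level one.
function effectiveTags(providerDefaultTags: Tags, resourceTags: Tags): Tags {
  return { ...providerDefaultTags, ...resourceTags };
}

const mergedTags = effectiveTags(
  { Environment: "dev", Name: "default" }, // hypothetical provider defaultTags
  { Name: "foo" } // tags from the basic usage example
);
// mergedTags is { Environment: "dev", Name: "foo" }
```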
Import
Endpoint configurations can be imported using the name, e.g.,