Resource: awsComprehendDocumentClassifier
Terraform resource for managing an AWS Comprehend Document Classifier.
Example Usage
Basic Usage
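/* Provider bindings are generated by running cdktf get.
   See https://cdk.tf/provider-generation for more details. */
import * as aws from "./.gen/providers/aws";

// Object configuration (bucket, key, and so on) is omitted in this example.
const awsS3ObjectDocuments = new aws.s3Object.S3Object(this, "documents", {});
new aws.s3Object.S3Object(this, "entities", {});

const classifier =
  new aws.comprehendDocumentClassifier.ComprehendDocumentClassifier(
    this,
    "example",
    {
      // aws_iam_role.example and aws_s3_bucket.test are defined elsewhere in the stack.
      dataAccessRoleArn: "${aws_iam_role.example.arn}",
      inputDataConfig: {
        s3Uri: `s3://\${aws_s3_bucket.test.bucket}/${awsS3ObjectDocuments.id}`,
      },
      languageCode: "en",
      name: "example",
    }
  );

// The policy is not a construct in this snippet, so the dependency is declared
// through an override that uses its Terraform address.
classifier.addOverride("depends_on", ["aws_iam_role_policy.example"]);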
Argument Reference
The following arguments are required:
dataAccessRoleArn
- (Required) The ARN for an IAM Role which allows Comprehend to read the training and testing data.
inputDataConfig
- (Required) Configuration for the training and testing data. See the inputDataConfig Configuration Block section below.
languageCode
- (Required) Two-letter language code for the language. One of en, es, fr, it, de, or pt.
name
- (Required) Name for the Document Classifier. Has a maximum length of 63 characters. Can contain upper- and lower-case letters, numbers, and hyphens (-).
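The constraints on name and languageCode can be checked before synthesis. The following is a minimal sketch in plain TypeScript; the helper names are hypothetical and not part of the provider bindings, and it only encodes the rules stated above (the service itself may apply stricter rules).

```typescript
// Language codes listed in the argument reference above.
const SUPPORTED_LANGUAGE_CODES = ["en", "es", "fr", "it", "de", "pt"] as const;
type LanguageCode = (typeof SUPPORTED_LANGUAGE_CODES)[number];

// 1 to 63 characters drawn from letters, digits, and hyphens.
// Assumption: only the documented rules are checked here.
const NAME_PATTERN = /^[A-Za-z0-9-]{1,63}$/;

// Hypothetical helper: true when the name satisfies the documented rules.
function isValidClassifierName(name: string): boolean {
  return NAME_PATTERN.test(name);
}

// Hypothetical helper: narrows a string to one of the listed language codes.
function isSupportedLanguageCode(code: string): code is LanguageCode {
  return (SUPPORTED_LANGUAGE_CODES as readonly string[]).indexOf(code) !== -1;
}
```

A check like this gives an early error message instead of a failed apply, at the cost of duplicating rules that the provider already enforces.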
The following arguments are optional:
mode
- (Optional, Default: MULTI_CLASS) The document classification mode. One of MULTI_CLASS or MULTI_LABEL. MULTI_CLASS is also known as "Single Label" in the AWS Console.
modelKmsKeyId
- (Optional) KMS Key used to encrypt trained Document Classifiers. Can be a KMS Key ID or a KMS Key ARN.
outputDataConfig
- (Optional) Configuration for the output results of training. See the outputDataConfig Configuration Block section below.
tags
- (Optional) A map of tags to assign to the resource. If configured with a provider defaultTags Configuration Block present, tags with matching keys will overwrite those defined at the provider-level.
versionName
- (Optional) Name for the version of the Document Classifier. Each version must have a unique name within the Document Classifier. If omitted, Terraform will assign a random, unique version name. If explicitly set to "", no version name will be set. Has a maximum length of 63 characters. Can contain upper- and lower-case letters, numbers, and hyphens (-). Conflicts with versionNamePrefix.
versionNamePrefix
- (Optional) Creates a unique version name beginning with the specified prefix. Has a maximum length of 37 characters. Can contain upper- and lower-case letters, numbers, and hyphens (-). Conflicts with versionName.
volumeKmsKeyId
- (Optional) KMS Key used to encrypt storage volumes during job processing. Can be a KMS Key ID or a KMS Key ARN.
vpcConfig
- (Optional) Configuration parameters for VPC to contain Document Classifier resources. See the vpcConfig Configuration Block section below.
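The interaction between versionName and versionNamePrefix is easy to get wrong, so a small illustration may help. This sketch uses a local, hypothetical VersionNaming type and checkVersionNaming helper (neither exists in the generated bindings) and encodes only the limits stated above: the two arguments conflict, versionName allows up to 63 characters (including the empty string), and versionNamePrefix allows up to 37.

```typescript
// Local, illustrative shape; not the generated cdktf binding type.
interface VersionNaming {
  versionName?: string;
  versionNamePrefix?: string;
}

// Letters, digits, and hyphens; the empty string is accepted because
// versionName may be explicitly set to "".
const ALLOWED_CHARACTERS = /^[A-Za-z0-9-]*$/;

// Hypothetical helper: returns a list of problems, empty when the input is fine.
function checkVersionNaming(cfg: VersionNaming): string[] {
  const problems: string[] = [];
  if (cfg.versionName !== undefined && cfg.versionNamePrefix !== undefined) {
    problems.push("versionName conflicts with versionNamePrefix");
  }
  if (cfg.versionName !== undefined) {
    if (cfg.versionName.length > 63) problems.push("versionName exceeds 63 characters");
    if (!ALLOWED_CHARACTERS.test(cfg.versionName)) problems.push("versionName has invalid characters");
  }
  if (cfg.versionNamePrefix !== undefined) {
    if (cfg.versionNamePrefix.length > 37) problems.push("versionNamePrefix exceeds 37 characters");
    if (!ALLOWED_CHARACTERS.test(cfg.versionNamePrefix)) problems.push("versionNamePrefix has invalid characters");
  }
  return problems;
}
```

Omitting both arguments yields no problems, which matches the documented behavior of Terraform assigning a random, unique version name.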
inputDataConfig Configuration Block
augmentedManifests
- (Optional) List of training datasets produced by Amazon SageMaker Ground Truth. Used if dataFormat is AUGMENTED_MANIFEST. See the augmentedManifests Configuration Block section below.
dataFormat
- (Optional, Default: COMPREHEND_CSV) The format for the training data. One of COMPREHEND_CSV or AUGMENTED_MANIFEST.
labelDelimiter
- (Optional) Delimiter between labels when training a multi-label classifier. Valid values are |, ~, !, @, #, $, %, ^, *, -, _, +, =, \, :, ;, >, ?, /, <space>, and <tab>. Default is |.
s3Uri
- (Optional) Location of training documents. Used if dataFormat is COMPREHEND_CSV.
testS3Uri
- (Optional) Location of test documents.
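To make the defaults in this block concrete, here is a minimal sketch that fills in dataFormat and labelDelimiter when they are omitted and rejects a delimiter outside the listed set. The InputDataSketch type and resolveInputDefaults helper are hypothetical names introduced for illustration; they mirror the property names above but are not the generated binding types.

```typescript
type DataFormat = "COMPREHEND_CSV" | "AUGMENTED_MANIFEST";

// Local, illustrative shape; not the generated cdktf binding type.
interface InputDataSketch {
  dataFormat?: DataFormat;
  labelDelimiter?: string;
  s3Uri?: string;
  testS3Uri?: string;
}

// Delimiters listed in the reference above (space and tab written literally).
const VALID_LABEL_DELIMITERS: string[] = [
  "|", "~", "!", "@", "#", "$", "%", "^", "*", "-", "_",
  "+", "=", "\\", ":", ";", ">", "?", "/", " ", "\t",
];

// Hypothetical helper: applies the documented defaults and throws on an
// unlisted delimiter.
function resolveInputDefaults(cfg: InputDataSketch): InputDataSketch {
  const labelDelimiter = cfg.labelDelimiter === undefined ? "|" : cfg.labelDelimiter;
  if (VALID_LABEL_DELIMITERS.indexOf(labelDelimiter) === -1) {
    throw new Error("unsupported labelDelimiter: " + JSON.stringify(labelDelimiter));
  }
  const dataFormat: DataFormat = cfg.dataFormat === undefined ? "COMPREHEND_CSV" : cfg.dataFormat;
  return { ...cfg, dataFormat, labelDelimiter };
}
```

For example, resolveInputDefaults({ s3Uri: "s3://example-bucket/documents.csv" }) keeps the URI (a placeholder, not a real location) and adds dataFormat "COMPREHEND_CSV" and labelDelimiter "|".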
augmentedManifests Configuration Block
annotationDataS3Uri
- (Optional) Location of annotation files.
attributeNames
- (Required) The JSON attribute that contains the annotations for the training documents.
documentType
- (Optional, Default: PLAIN_TEXT_DOCUMENT) Type of augmented manifest. One of PLAIN_TEXT_DOCUMENT or SEMI_STRUCTURED_DOCUMENT.
s3Uri
- (Required) Location of augmented manifest file.
sourceDocumentsS3Uri
- (Optional) Location of source PDF files.
split
- (Optional, Default: train) Purpose of data in augmented manifest. One of train or test.
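Because split defaults to train, a list of manifests with some entries left unmarked can be ambiguous to read. The sketch below groups manifest locations by their effective split. AugmentedManifestSketch and groupBySplit are hypothetical names, the type is a local illustration rather than the generated binding, and modeling attributeNames as a list of strings is an assumption.

```typescript
// Local, illustrative shape; not the generated cdktf binding type.
interface AugmentedManifestSketch {
  s3Uri: string;
  attributeNames: string[]; // assumption: modeled as a list of attribute names
  documentType?: "PLAIN_TEXT_DOCUMENT" | "SEMI_STRUCTURED_DOCUMENT";
  split?: "train" | "test";
  annotationDataS3Uri?: string;
  sourceDocumentsS3Uri?: string;
}

// Hypothetical helper: collects manifest URIs under their effective split,
// treating an omitted split as "train" (the documented default).
function groupBySplit(
  manifests: AugmentedManifestSketch[]
): Record<"train" | "test", string[]> {
  const groups: Record<"train" | "test", string[]> = { train: [], test: [] };
  for (const manifest of manifests) {
    const split = manifest.split === undefined ? "train" : manifest.split;
    groups[split].push(manifest.s3Uri);
  }
  return groups;
}
```

A manifest with no split lands in the train group, so a test set only exists when at least one entry sets split to test explicitly.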
outputDataConfig Configuration Block
kmsKeyId
- (Optional) KMS Key used to encrypt the output documents. Can be a KMS Key ID, a KMS Key ARN, a KMS Alias name, or a KMS Alias ARN.
outputS3Uri
- (Computed) Full path for the output documents.
s3Uri
- (Required) Destination path for the output documents. The full path to the output file will be returned in outputS3Uri.
vpcConfig Configuration Block
securityGroupIds
- (Required) List of security group IDs.
subnets
- (Required) List of VPC subnets.
Attributes Reference
In addition to all arguments above, the following attributes are exported:
arn
- ARN of the Document Classifier version.
tagsAll
- A map of tags assigned to the resource, including those inherited from the provider defaultTags configuration block.
Timeouts
awsComprehendDocumentClassifier provides the following Timeouts configuration options:
create
- (Optional, Default: 60m)
update
- (Optional, Default: 60m)
delete
- (Optional, Default: 30m)
Import
Comprehend Document Classifier can be imported using the ARN, e.g.,