Resource: awsComprehendEntityRecognizer
Terraform resource for managing an AWS Comprehend Entity Recognizer.
Example Usage
Basic Usage
/* Provider bindings are generated by running cdktf get.
See https://cdk.tf/provider-generation for more details. */
import * as aws from "./.gen/providers/aws";

// The S3 object arguments are left empty in this example.
const awsS3ObjectDocuments = new aws.s3Object.S3Object(this, "documents", {});
const awsS3ObjectEntities = new aws.s3Object.S3Object(this, "entities", {});

const recognizer = new aws.comprehendEntityRecognizer.ComprehendEntityRecognizer(this, "example", {
  dataAccessRoleArn: "${aws_iam_role.example.arn}",
  inputDataConfig: {
    documents: {
      // The object id is already a Terraform token, so it is interpolated directly.
      s3Uri: `s3://\${aws_s3_bucket.documents.bucket}/${awsS3ObjectDocuments.id}`,
    },
    entityList: {
      s3Uri: `s3://\${aws_s3_bucket.entities.bucket}/${awsS3ObjectEntities.id}`,
    },
    entityTypes: [
      {
        type: "ENTITY_1",
      },
      {
        type: "ENTITY_2",
      },
    ],
  },
  languageCode: "en",
  name: "example",
});

// aws_iam_role_policy.example is defined outside this snippet, so the
// dependency is expressed through the addOverride escape hatch.
recognizer.addOverride("depends_on", ["aws_iam_role_policy.example"]);
Argument Reference
The following arguments are required:
dataAccessRoleArn - (Required) The ARN for an IAM Role which allows Comprehend to read the training and testing data.
inputDataConfig - (Required) Configuration for the training and testing data. See the inputDataConfig Configuration Block section below.
languageCode - (Required) Two-letter language code for the language. One of en, es, fr, it, de, or pt.
name - (Required) Name for the Entity Recognizer. Has a maximum length of 63 characters. Can contain upper- and lower-case letters, numbers, and hyphens (-).
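The constraints on name and languageCode can be checked before synthesis. The following is a minimal sketch that encodes only the rules stated above; the helper names are hypothetical and are not part of cdktf or the AWS provider, and the service itself may enforce stricter rules. Rejecting an empty name is an assumption.

```typescript
// Hypothetical pre-flight helpers; not part of cdktf or the AWS provider.
const NAME_PATTERN = /^[A-Za-z0-9-]+$/; // letters, numbers, and hyphens only
const NAME_MAX_LENGTH = 63;
const LANGUAGE_CODES = ["en", "es", "fr", "it", "de", "pt"] as const;
type LanguageCode = (typeof LANGUAGE_CODES)[number];

// True when the name satisfies the documented length and character rules.
// Assumption: an empty name is treated as invalid.
function isValidRecognizerName(name: string): boolean {
  return name.length <= NAME_MAX_LENGTH && NAME_PATTERN.test(name);
}

// True when the code is one of the six documented language codes.
function isSupportedLanguageCode(code: string): code is LanguageCode {
  return LANGUAGE_CODES.some((c) => c === code);
}
```

For instance, isValidRecognizerName("example") passes, while a name containing an underscore or longer than 63 characters does not.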
The following arguments are optional:
modelKmsKeyId - (Optional) The ID or ARN of a KMS Key used to encrypt trained Entity Recognizers.
tags - (Optional) A map of tags to assign to the resource. If the provider has a defaultTags configuration block, tags with matching keys overwrite those defined at the provider level.
versionName - (Optional) Name for the version of the Entity Recognizer. Each version must have a unique name within the Entity Recognizer. If omitted, Terraform will assign a random, unique version name. If explicitly set to "", no version name will be set. Has a maximum length of 63 characters. Can contain upper- and lower-case letters, numbers, and hyphens (-). Conflicts with versionNamePrefix.
versionNamePrefix - (Optional) Creates a unique version name beginning with the specified prefix. Has a maximum length of 37 characters. Can contain upper- and lower-case letters, numbers, and hyphens (-). Conflicts with versionName.
volumeKmsKeyId - (Optional) ID or ARN of a KMS Key used to encrypt storage volumes during job processing.
vpcConfig - (Optional) Configuration parameters for VPC to contain Entity Recognizer resources. See the vpcConfig Configuration Block section below.
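The interaction between versionName and versionNamePrefix can be sketched as a small check. This is an illustrative helper under the rules stated above, not provider code; the interface and function names are hypothetical.

```typescript
// Hypothetical shape holding only the two version-related arguments.
interface VersionNameArgs {
  versionName?: string;
  versionNamePrefix?: string;
}

// Returns a list of problems; an empty list means the arguments look consistent.
function checkVersionNameArgs(args: VersionNameArgs): string[] {
  const problems: string[] = [];
  // "*" rather than "+" so that an explicit "" (no version name) is accepted.
  const allowed = /^[A-Za-z0-9-]*$/;

  if (args.versionName !== undefined && args.versionNamePrefix !== undefined) {
    problems.push("versionName conflicts with versionNamePrefix");
  }
  if (args.versionName !== undefined) {
    if (args.versionName.length > 63) problems.push("versionName exceeds 63 characters");
    if (!allowed.test(args.versionName)) problems.push("versionName contains invalid characters");
  }
  if (args.versionNamePrefix !== undefined) {
    if (args.versionNamePrefix.length > 37) problems.push("versionNamePrefix exceeds 37 characters");
    if (!allowed.test(args.versionNamePrefix)) problems.push("versionNamePrefix contains invalid characters");
  }
  return problems;
}
```

Omitting both arguments yields no problems, which matches the case where Terraform assigns a random version name.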
inputDataConfig Configuration Block
annotations - (Optional) Specifies location of the document annotation data. See the annotations Configuration Block section below. One of annotations or entityList is required.
augmentedManifests - (Optional) List of training datasets produced by Amazon SageMaker Ground Truth. Used if dataFormat is AUGMENTED_MANIFEST. See the augmentedManifests Configuration Block section below.
dataFormat - (Optional, Default: COMPREHEND_CSV) The format for the training data. One of COMPREHEND_CSV or AUGMENTED_MANIFEST.
documents - (Optional) Specifies a collection of training documents. Used if dataFormat is COMPREHEND_CSV. See the documents Configuration Block section below.
entityList - (Optional) Specifies location of the entity list data. See the entityList Configuration Block section below. One of entityList or annotations is required.
entityTypes - (Required) Set of entity types to be recognized. Has a maximum of 25 items. See the entityTypes Configuration Block section below.
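The relationships between these arguments can be sketched as a check over a plain object. This reads the text above literally and is not the provider's own validation: "one of annotations or entityList" is interpreted as "at least one" (an assumption), and the nested blocks are reduced to their s3Uri field. All type and function names are hypothetical.

```typescript
type DataFormat = "COMPREHEND_CSV" | "AUGMENTED_MANIFEST";

// Nested blocks reduced to the one field this sketch needs.
interface S3Location {
  s3Uri: string;
}

// Hypothetical, simplified mirror of the inputDataConfig block.
interface InputDataConfigSketch {
  annotations?: S3Location;
  augmentedManifests?: S3Location[];
  dataFormat?: DataFormat;
  documents?: S3Location;
  entityList?: S3Location;
  entityTypes: { type: string }[];
}

// Returns a list of problems; an empty list means no documented rule was violated.
function checkInputDataConfig(cfg: InputDataConfigSketch): string[] {
  const problems: string[] = [];
  const format: DataFormat = cfg.dataFormat ?? "COMPREHEND_CSV"; // documented default

  // Assumption: "one of" is read as "at least one".
  if (cfg.annotations === undefined && cfg.entityList === undefined) {
    problems.push("one of annotations or entityList is required");
  }
  if (cfg.entityTypes.length > 25) {
    problems.push("entityTypes has more than 25 items");
  }
  if (format === "COMPREHEND_CSV" && cfg.augmentedManifests !== undefined) {
    problems.push("augmentedManifests is only used when dataFormat is AUGMENTED_MANIFEST");
  }
  if (format === "AUGMENTED_MANIFEST" && cfg.documents !== undefined) {
    problems.push("documents is only used when dataFormat is COMPREHEND_CSV");
  }
  return problems;
}

// Same shape as the Basic Usage example, with placeholder bucket names.
const basicUsage: InputDataConfigSketch = {
  documents: { s3Uri: "s3://documents-bucket/documents.txt" },
  entityList: { s3Uri: "s3://entities-bucket/entities.csv" },
  entityTypes: [{ type: "ENTITY_1" }, { type: "ENTITY_2" }],
};
const basicUsageProblems = checkInputDataConfig(basicUsage);
```

With the Basic Usage shape, basicUsageProblems is an empty list.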
annotations Configuration Block
s3Uri - (Required) Location of training annotations.
testS3Uri - (Optional) Location of test annotations.
augmentedManifests Configuration Block
annotationDataS3Uri - (Optional) Location of annotation files.
attributeNames - (Required) The JSON attribute that contains the annotations for the training documents.
documentType - (Optional, Default: PLAIN_TEXT_DOCUMENT) Type of augmented manifest. One of PLAIN_TEXT_DOCUMENT or SEMI_STRUCTURED_DOCUMENT.
s3Uri - (Required) Location of augmented manifest file.
sourceDocumentsS3Uri - (Optional) Location of source PDF files.
split - (Optional, Default: TRAIN) Purpose of data in augmented manifest. One of TRAIN or TEST.
documents Configuration Block
inputFormat - (Optional, Default: ONE_DOC_PER_LINE) Specifies how the input files should be processed. One of ONE_DOC_PER_LINE or ONE_DOC_PER_FILE.
s3Uri - (Required) Location of training documents.
testS3Uri - (Optional) Location of test documents.
entityList Configuration Block
s3Uri - (Required) Location of entity list.
entityTypes Configuration Block
type - (Required) An entity type to be matched by the Entity Recognizer. Cannot contain a newline (\n), carriage return (\r), or tab (\t).
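The character restriction on type is easy to check ahead of time. The sketch below covers only the three characters named above; the function names are hypothetical, and the service may reject other input as well.

```typescript
// True when the entity type contains no newline, carriage return, or tab.
function isValidEntityType(type: string): boolean {
  return !/[\n\r\t]/.test(type);
}

// Returns the type strings that violate the rule, in their original order.
function findInvalidEntityTypes(entityTypes: { type: string }[]): string[] {
  return entityTypes.map((e) => e.type).filter((t) => !isValidEntityType(t));
}
```

Running findInvalidEntityTypes over the entityTypes list before constructing the resource surfaces the offending entries instead of waiting for a plan or apply error.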
vpcConfig Configuration Block
securityGroupIds - (Required) List of security group IDs.
subnets - (Required) List of VPC subnets.
Attributes Reference
In addition to all arguments above, the following attributes are exported:
arn - ARN of the Entity Recognizer version.
tagsAll - A map of tags assigned to the resource, including those inherited from the provider defaultTags configuration block.
Timeouts
awsComprehendEntityRecognizer provides the following Timeouts configuration options:
create - (Optional, Default: 60m)
update - (Optional, Default: 60m)
delete - (Optional, Default: 30m)
Import
Comprehend Entity Recognizer can be imported using the ARN, e.g.,