feat(logs): support data protection custom data identifiers (#28553)
### Feature

Support the newly launched custom data identifiers feature for CloudWatch Logs sensitive data protection.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL-custom-data-identifiers.html

### Use Case

Custom data identifiers (CDIs) let you define your own custom regular expressions that can be used in your data protection policy. Using custom data identifiers, you can target business-specific personally identifiable information (PII) use cases that managed data identifiers can't provide. For example, you can use a custom data identifier to look for company-specific employee IDs. Custom data identifiers can be used in conjunction with managed data identifiers.
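
As a rough illustration of the employee ID case, the sketch below runs the pattern used later in this commit (`EmployeeId-\d{9}`) against a few log lines. It assumes JavaScript's `RegExp` as a stand-in for the CloudWatch Logs matcher, which may accept a narrower regex syntax; the sample log lines are invented for the example.

```typescript
// Illustration only: JavaScript's RegExp stands in for the CloudWatch Logs
// matcher, which may support a narrower regex syntax.
const employeeIdPattern = new RegExp('EmployeeId-\\d{9}');

// Hypothetical log lines, invented for this example.
const sampleLogLines: string[] = [
  'login ok for EmployeeId-123456789',
  'login ok for EmployeeId-1234',
  'no identifier in this line',
];

// Keep only the lines that contain "EmployeeId-" followed by nine digits.
const matchingLines = sampleLogLines.filter((line) => employeeIdPattern.test(line));
console.log(matchingLines);
```

Only the first line contains nine digits after the prefix, so it is the only one the pattern would flag.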

### Solution
Users can now construct a `CustomDataIdentifier` with a `name` and a `regex` and include it in the policy's `identifiers` list. The policy then treats the named identifier as a custom data identifier.

Closes #28430.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
kchg authored Jan 19, 2024
1 parent 414c570 commit 1222aaa
Showing 12 changed files with 269 additions and 43 deletions.
3 changes: 3 additions & 0 deletions allowed-breaking-changes.txt
@@ -230,6 +230,9 @@ removed:aws-cdk-lib.aws_backup.BackupPlanRuleProps.schedule
# This data identifier was added by mistake; it had never worked.
removed:aws-cdk-lib.aws_logs.DataIdentifier.PHONENUMBER

# This interface should not have been exported, it is not used in any public way.
removed:aws-cdk-lib.aws_logs.DataProtectionPolicyConfig

# These newly exported classes have been reverted and are no longer publicly consumable
removed:aws-cdk-lib.custom_resources.WaiterStateMachine
removed:aws-cdk-lib.custom_resources.LogOptions

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

@@ -20,6 +20,14 @@
"name": "policy-name",
"description": "policy description",
"version": "2021-06-01",
"configuration": {
"customDataIdentifier": [
{
"name": "EmployeeId",
"regex": "EmployeeId-\\d{9}"
}
]
},
"statement": [
{
"sid": "audit-statement-cdk",
@@ -47,7 +55,8 @@
":dataprotection::aws:data-identifier/EmailAddress"
]
]
}
},
"EmployeeId"
],
"operation": {
"audit": {
@@ -92,7 +101,8 @@
":dataprotection::aws:data-identifier/EmailAddress"
]
]
}
},
"EmployeeId"
],
"operation": {
"deidentify": {

@@ -1,7 +1,7 @@
import { Bucket } from 'aws-cdk-lib/aws-s3';
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { IntegTest } from '@aws-cdk/integ-tests-alpha';
import { LogGroup, DataProtectionPolicy, DataIdentifier } from 'aws-cdk-lib/aws-logs';
import { LogGroup, DataProtectionPolicy, DataIdentifier, CustomDataIdentifier } from 'aws-cdk-lib/aws-logs';

class LogGroupIntegStack extends Stack {
constructor(scope: App, id: string, props?: StackProps) {
@@ -14,7 +14,7 @@ class LogGroupIntegStack extends Stack {
const dataProtectionPolicy = new DataProtectionPolicy({
name: 'policy-name',
description: 'policy description',
identifiers: [DataIdentifier.DRIVERSLICENSE_US, new DataIdentifier('EmailAddress')],
identifiers: [DataIdentifier.DRIVERSLICENSE_US, new DataIdentifier('EmailAddress'), new CustomDataIdentifier('EmployeeId', 'EmployeeId-\\d{9}')],
logGroupAuditDestination: audit,
s3BucketAuditDestination: bucket,
});
12 changes: 9 additions & 3 deletions packages/aws-cdk-lib/aws-logs/README.md
@@ -342,9 +342,12 @@ Creates a data protection policy and assigns it to the log group. A data protect

For more information, see [Protect sensitive log data with masking](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/mask-sensitive-log-data.html).

For a list of types of identifiers that can be audited and masked, see [Types of data that you can protect](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/protect-sensitive-log-data-types.html)
For a list of types of managed identifiers that can be audited and masked, see [Types of data that you can protect](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/protect-sensitive-log-data-types.html).

If a new identifier is supported but not yet in the `DataIdentifiers` enum, the full ARN of the identifier can be supplied in `identifierArnStrings` instead.
If a new managed identifier is supported but not yet among the static members of `DataIdentifier`, the name of the identifier can be supplied as `name` in the `DataIdentifier` constructor instead.

To add a custom data identifier, supply a custom `name` and `regex` to the `CustomDataIdentifier` constructor.
For more information on custom data identifiers, see [Custom data identifiers](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL-custom-data-identifiers.html).

Each policy may consist of a log group, S3 bucket, and/or Firehose delivery stream audit destination.

@@ -368,7 +371,10 @@ const deliveryStream = new kinesisfirehose.DeliveryStream(this, 'Delivery Stream
const dataProtectionPolicy = new logs.DataProtectionPolicy({
name: 'data protection policy',
description: 'policy description',
identifiers: [logs.DataIdentifier.DRIVERSLICENSE_US, new logs.DataIdentifier('EmailAddress')],
identifiers: [
logs.DataIdentifier.DRIVERSLICENSE_US, // managed data identifier
new logs.DataIdentifier('EmailAddress'), // forward compatibility for new managed data identifiers
new logs.CustomDataIdentifier('EmployeeId', 'EmployeeId-\\d{9}')], // custom data identifier
logGroupAuditDestination: logGroupDestination,
s3BucketAuditDestination: bucket,
deliveryStreamNameAuditDestination: deliveryStream.deliveryStreamName,
101 changes: 78 additions & 23 deletions packages/aws-cdk-lib/aws-logs/lib/data-protection-policy.ts
@@ -24,7 +24,7 @@ export class DataProtectionPolicy {
const description = this.dataProtectionPolicyProps.description || 'cdk generated data protection policy';
const version = '2021-06-01';

const findingsDestination: FindingsDestination = {};
const findingsDestination: PolicyFindingsDestination = {};
if (this.dataProtectionPolicyProps.logGroupAuditDestination) {
findingsDestination.cloudWatchLogs = {
logGroup: this.dataProtectionPolicyProps.logGroupAuditDestination.logGroupName,
@@ -43,21 +43,30 @@ export class DataProtectionPolicy {
};
}

const identifierArns: string[] = [];
const identifiers: string[] = [];
const customDataIdentifiers: PolicyCustomDataIdentifier[] = [];
for (let identifier of this.dataProtectionPolicyProps.identifiers) {
identifierArns.push(Stack.of(_scope).formatArn({
resource: 'data-identifier',
region: '',
account: 'aws',
service: 'dataprotection',
resourceName: identifier.toString(),
}));
if (identifier instanceof CustomDataIdentifier) {
identifiers.push(identifier.name);
customDataIdentifiers.push({
name: identifier.name,
regex: identifier.regex,
});
} else {
identifiers.push(Stack.of(_scope).formatArn({
resource: 'data-identifier',
region: '',
account: 'aws',
service: 'dataprotection',
resourceName: identifier.name,
}));
}
};

const statement = [
{
sid: 'audit-statement-cdk',
dataIdentifier: identifierArns,
dataIdentifier: identifiers,
operation: {
audit: {
findingsDestination: findingsDestination,
@@ -66,40 +75,53 @@ export class DataProtectionPolicy {
},
{
sid: 'redact-statement-cdk',
dataIdentifier: identifierArns,
dataIdentifier: identifiers,
operation: {
deidentify: {
maskConfig: {},
},
},
},
];
return { name, description, version, statement };

const configuration: PolicyConfiguration = {
customDataIdentifier: customDataIdentifiers,
};
return { name, description, version, configuration, statement };
}
}

interface FindingsDestination {
cloudWatchLogs?: CloudWatchLogsDestination;
firehose?: FirehoseDestination;
s3?: S3Destination;
interface PolicyConfiguration {
customDataIdentifier?: PolicyCustomDataIdentifier[];
}

interface PolicyCustomDataIdentifier {
name: string;
regex: string;
}

interface PolicyFindingsDestination {
cloudWatchLogs?: PolicyCloudWatchLogsDestination;
firehose?: PolicyFirehoseDestination;
s3?: PolicyS3Destination;
}

interface CloudWatchLogsDestination {
interface PolicyCloudWatchLogsDestination {
logGroup: string;
}

interface FirehoseDestination {
interface PolicyFirehoseDestination {
deliveryStream: string;
}

interface S3Destination {
interface PolicyS3Destination {
bucket: string;
}

/**
* Interface representing a data protection policy
*/
export interface DataProtectionPolicyConfig {
interface DataProtectionPolicyConfig {
/**
* Name of the data protection policy
*
@@ -119,6 +141,11 @@ export interface DataProtectionPolicyConfig {
*/
readonly version: string;

/**
* Configuration of the data protection policy. Currently supports custom data identifiers
*/
readonly configuration: PolicyConfiguration;

/**
* Statements within the data protection policy. Must contain one Audit and one Redact statement
*/
@@ -144,8 +171,10 @@ export interface DataProtectionPolicyProps {
readonly description?: string;

/**
* List of data protection identifiers. Must be in the following list: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/protect-sensitive-log-data-types.html
* List of data protection identifiers.
*
* Managed data identifiers must be in the following list: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL-managed-data-identifiers.html
* Custom data identifiers must have a valid regex defined: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL-custom-data-identifiers.html#custom-data-identifiers-constraints
*/
readonly identifiers: DataIdentifier[];

@@ -274,9 +303,35 @@ export class DataIdentifier {
public static readonly VEHICLEIDENTIFICATIONNUMBER = new DataIdentifier('VehicleIdentificationNumber');
public static readonly ZIPCODE_US = new DataIdentifier('ZipCode-US');

constructor(private readonly identifier: string) { }
/**
* Create a managed data identifier not in the list of static members. This is used to maintain forward compatibility, in case a new managed identifier is supported but not updated in CDK yet.
* @param name - name of the identifier.
*/
constructor(public readonly name: string) { }

public toString(): string {
return this.identifier;
return this.name;
}
}

/**
* A custom data identifier. Include a custom data identifier name and regular expression in the JSON policy used to define the data protection policy.
*/
export class CustomDataIdentifier extends DataIdentifier {
/**
* Create a custom data identifier.
* @param name - the name of the custom data identifier. This cannot share the same name as a managed data identifier.
* @param regex - the regular expression used to detect and mask matching data in log events.
*/
constructor(public readonly name: string, public readonly regex: string) {
super(name);
}

/**
* String representation of a CustomDataIdentifier
* @returns the name and RegEx of the custom data identifier
*/
public toString(): string {
return `${this.name}: ${this.regex}`;
}
}
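
The identifier handling in `data-protection-policy.ts` above can be approximated outside the CDK as follows. This is a minimal sketch, not the library code: the two classes are simplified stand-ins for the real exports, `splitIdentifiers` and `managedIdentifierArn` are hypothetical helper names, and the ARN is built with a hard-coded `aws` partition, whereas the real code delegates to `Stack.of(_scope).formatArn(...)`.

```typescript
// Simplified stand-ins for the classes in the diff (not the aws-cdk-lib exports).
class DataIdentifier {
  constructor(public readonly name: string) {}
}

class CustomDataIdentifier extends DataIdentifier {
  constructor(name: string, public readonly regex: string) {
    super(name);
  }
}

interface PolicyCustomDataIdentifier {
  name: string;
  regex: string;
}

// Assumption: a fixed 'aws' partition. The real code calls formatArn on the
// stack, which resolves the partition for the deployment target.
function managedIdentifierArn(name: string): string {
  return `arn:aws:dataprotection::aws:data-identifier/${name}`;
}

// Hypothetical helper mirroring the loop in the diff: custom identifiers are
// referenced by name and also collected with their regex, while managed
// identifiers are referenced by ARN.
function splitIdentifiers(all: DataIdentifier[]) {
  const identifiers: string[] = [];
  const customDataIdentifiers: PolicyCustomDataIdentifier[] = [];
  for (const identifier of all) {
    if (identifier instanceof CustomDataIdentifier) {
      identifiers.push(identifier.name);
      customDataIdentifiers.push({ name: identifier.name, regex: identifier.regex });
    } else {
      identifiers.push(managedIdentifierArn(identifier.name));
    }
  }
  return { identifiers, customDataIdentifiers };
}

const result = splitIdentifiers([
  new DataIdentifier('EmailAddress'),
  new CustomDataIdentifier('EmployeeId', 'EmployeeId-\\d{9}'),
]);
console.log(JSON.stringify(result, null, 2));
```

The `identifiers` array feeds the `dataIdentifier` field of both the audit and redact statements, and `customDataIdentifiers` ends up under `configuration.customDataIdentifier`, which matches the shape of the snapshot JSON earlier in the diff.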