-
Notifications
You must be signed in to change notification settings - Fork 4.3k
feat(stepfunctions-tasks): EmrCreateClusterOptions support ebsRootVolumeIops, ebsRootVolumeThroughput and managedScalingPolicy
#34677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
kumvprat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising the PR.
I have added inline comments where possible.
I have a high level question on the PR : It seems like the ebs root volume was already present as a property in the create emr cluster sfn task.
Is this change not impacting any other integration tests that might be testing emr cluster create task with any other ebs properties ?
packages/aws-cdk-lib/aws-stepfunctions-tasks/lib/emr/emr-create-cluster.ts
Show resolved
Hide resolved
packages/aws-cdk-lib/aws-stepfunctions-tasks/lib/emr/emr-create-cluster.ts
Show resolved
Hide resolved
packages/aws-cdk-lib/aws-stepfunctions-tasks/lib/emr/emr-create-cluster.ts
Show resolved
Hide resolved
| }); | ||
| start.next(describe); | ||
|
|
||
| describe.expect(ExpectedResult.objectLike({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a check here that the EMR cluster is created ? And also have a check that the cluster configuration match the cluster configuration provided in the step function step
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure let me add, might take some time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verification added.
In case you have a question why we need 2 describeExecution calls, just found out that we cannot have below 2 things at the same time:
- Validate execution with
.expect(ExpectedResult.objectLike({ status: 'SUCCEEDED', })) - Retrieve clusterId with
.getAttString('output.ClusterId')
Because when we do .getAttString('output.ClusterId'), the response return nothing besides from ClusterId, making status: 'SUCCEEDED' fail although the response actually have it. Thats why we need 2 describeExecution calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that the timeout is. :
totalTimeout: Duration.minutes(5)
Is this enough for the whole emr cluster to be created ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is enough. It works everytime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this statement from the doc When you launch a cluster, either for the first time or when the AWSServiceRoleForEMRCleanup service-linked role is not present, Amazon EMR creates the AWSServiceRoleForEMRCleanup service-linked role for you. is not true, at least in this case when StepFunction trigger it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we should add the permissions for creating service linked role, in an idempotent manner
Because it seems like the cluster is not created properly without the role, and requires the ability to create the service linked role for cluster creation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are the required permissions for that, from the same link :
You must have permissions to create a service-linked role. For an example statement that adds this capability to the permissions policy of an IAM entity (such as a user, group, or role):
Add the following statement to the permissions policy for the IAM entity that needs to create the service-linked role.
{
"Sid": "ElasticMapReduceServiceLinkedRole",
"Effect": "Allow",
"Action": "iam:CreateServiceLinkedRole",
"Resource": "arn:aws:iam:::role/aws-service-role/elasticmapreduce.amazonaws.com/AWSServiceRoleForEMRCleanup*",
"Condition": {
"StringEquals": {
"iam:AWSServiceName": [
"elasticmapreduce.amazonaws.com",
"elasticmapreduce.amazonaws.com.cn"
]
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes may not be exactly this but would need some changes in the policy creation during the task creation : maybe we should change the policies added here
Since when using this stepfunction task the user might utilize the emr cluster created from this task for running jobs => The cluster creation should be successful once the task completes
(I have not used this feature personally, and it seems you have. So I would like to ask if adding these set of permissions would make this functionality easier to use? )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding these set of permissions would make this functionality easier to use?
You're great mate. Can confirm that if StepFunction/StateMachine has iam:CreateServiceLinkedRole then user doesn't need to precreate AWSServiceRoleForEMRCleanup. Better user experience of course.
Idempotent: YES. I executed the job 2 times and no issue when the role exists.
2c27e56 to
5d92a2f
Compare
packages/aws-cdk-lib/aws-stepfunctions-tasks/lib/emr/emr-create-cluster.ts
Outdated
Show resolved
Hide resolved
5d92a2f to
f3f171c
Compare
|
Sorry I forgot these questions:
Only EBSsize prop exist, not IOPS+throughput
Yes, new props will be undefined for current test/user. We know this since no change in other snapshots. |
55f08c9 to
2813ef0
Compare
|
Still need security approval process but really appreciate @kumvprat for your prompt support so far 🚀 You're very helpful. |
packages/aws-cdk-lib/aws-stepfunctions-tasks/lib/emr/emr-create-cluster.ts
Outdated
Show resolved
Hide resolved
…umeIops, ebsRootVolumeThroughput and managedScalingPolicy
2813ef0 to
3964c6f
Compare
kumvprat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!!
|
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
|
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
|
Comments on closed issues and PRs are hard for our team to see. |
Issue # (if applicable)
Closes #33431
Reason for this change
Missing param in EmrCreateClusterOptions
Description of changes
validatePropsfunctionManagedScalingPolicy.ScalingStrategy is supported by awscli/API, but not yet supported by Step Function. Therefore not included in this PR

Describe any new or updated permissions being added
Add
iam:CreateServiceLinkedRolepermission for only EMR to createAWSServiceRoleForEMRCleanupas instructed by https://docs.aws.amazon.com/emr/latest/ManagementGuide/using-service-linked-roles-cleanup.html#create-service-linked-roleDescription of how you validated changes
Unit + Integ
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license