Skip to content

Conversation

@vandonr-amz
Copy link
Contributor

@vandonr-amz vandonr-amz commented Apr 19, 2023

follow up on #30720

It is not possible to stop an application that has running jobs.
Applications can be stopped as part of the normal process, or in a degraded/timeout scenario, in which case we really want to stop the app no matter what. It's nice to have an option to do that from the operator directly.

NOTE: If extra jobs are launched after the stop/delete operator is called, then it's not going to work properly, and will likely end on a timeout in the waiter, but to support that we'd need to go back and forth between get_jobs and cancel_jobs, possibly never converging.
I'm assuming that this is good enough for most use cases.

Comment on lines 257 to 260
@cached_property
def conn(self):
"""Get the underlying boto3 EmrServerlessAPIService client (cached)"""
return super().conn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing to do with the sauce, but I took the opportunity to clean this as I was around

Comment on lines +8 to +15
"acceptors": [
{
"matcher": "path",
"argument": "length(jobRuns) == `0`",
"expected": true,
"state": "success"
}
]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like too much the fact that I don't have a failure case for this waiter, but I think there is nothing we can do about it...
The failure case would be count > prev_count, or detecting a new job_id that we didn't see in the previous calls, which is way beyond the capabilities of waiters.

Comment on lines -103 to +104
start_job.waiter_check_interval_seconds = 10
start_job.wait_for_completion = False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of waiting for the job to finish, we just trigger it and continue...

aws_conn_id: str = "aws_default",
waiter_countdown: int = 5 * 60,
waiter_check_interval_seconds: int = 30,
force_stop: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about updating the docs/apache-airflow-providers-amazon/operators/emr/emr_serverless.rst to contain this additional param

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it to the system test (https://github.com/apache/airflow/pull/30757/files/8370c8172d8c28ffeec3ca460621b9d0447bfba6#diff-69be45953c5be696ca3159bb385a89c90930fc06e68357e8bdb33a1b31694f88R120) inside the howto_operator thing, so it'll be embeded in the doc automatically (as sample usage).
It doesn't explain what it's doing, but it being present there highlights its presence, and the code doc contains the explanation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@o-nikolas
Copy link
Contributor

Other than the conflicts that need resolving this looks ready to merge, unless you have any other feedback @phanikumv?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants