Allow importing of egress_lambda_log_group if getting an "already exists" error #191

Merged 8 commits into master on May 31, 2024

lindsleycj
Collaborator

When I run make cumulus for an environment that sends logs to Cloud Metrics (UAT and Prod, in my case), I often get this error:

Error: creating CloudWatch Logs Log Group (/aws/lambda/my-environment-thin-egress-app-EgressLambda): 
operation error CloudWatch Logs: CreateLogGroup, https response error StatusCode: 400, 
RequestID: dbde8490-6ccb-495f-98c8-5bcdcbb6b2f7, ResourceAlreadyExistsException: The specified log 
group already exists

It comes from these lines:

https://github.com/asfadmin/CIRRUS-core/blob/master/cumulus/thin-egress.tf#L67-L72

This PR attempts to create the log group only if it doesn't already exist. It also creates the Cloud Metrics subscription based on whether the log group was just created or already existed.

@lindsleycj
Collaborator Author

Have you all seen this problem?

@reweeden
Contributor

reweeden commented Mar 6, 2024

I've never seen this, but it looks like we don't use the log_destination_arn variable at ASF so that lines up.

Comment on lines 68 to 70
data "aws_cloudwatch_log_group" "egress_lambda_log_group" {
  name = "/aws/lambda/${module.thin_egress_app.egress_lambda_name}"
}
Contributor

Maybe it's possible to use an import block from terraform 1.5 here? https://developer.hashicorp.com/terraform/language/import

I've never used these before so I have no idea if they are meant to be used for this sort of situation or would work well, but perhaps it's a way to avoid having both a data and a resource object for the same cloudwatch group.

Contributor

Might be able to avoid having to define 2 filter resources that way.

Collaborator Author

@lindsleycj lindsleycj Mar 6, 2024

I just created my own kinesis stream and log-put-destination in my sandbox account per these instructions so I could emulate the cloud metrics subscription filter.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CreateDestination.html

I then deleted the my-deployment-thin-egress-app-EgressLambda CloudWatch log group and ran the code.

I got this error:

Error: reading CloudWatch Logs Log Group (/aws/lambda/cxl-cumulus-cxl-thin-egress-app-EgressLambda): empty result

  with data.aws_cloudwatch_log_group.egress_lambda_log_group,
  on thin-egress.tf line 68, in data "aws_cloudwatch_log_group" "egress_lambda_log_group":
  68: data "aws_cloudwatch_log_group" "egress_lambda_log_group" {

So the code as written doesn't work if the CloudWatch log group doesn't exist.

@lindsleycj
Collaborator Author

@mikedorfman do you all use the log_destination_arn variable at NSIDC for reporting TEA info to CloudMetrics? I wonder if this is not really needed anymore. Doesn't Cloud Metrics get all its distribution information via S3 replication now?

@mikedorfman
Collaborator

@mikedorfman do you all use the log_destination_arn variable at NSIDC for reporting TEA info to CloudMetrics? I wonder if this is not really needed anymore. Doesn't Cloud Metrics get all its distribution information via S3 replication now?

@lindsleycj - we do not use log_destination_arn to report TEA info to CloudMetrics.

Collaborator

@mikedorfman mikedorfman left a comment

One comment on the retention. I'm not sure of a way around this - perhaps using the import block that @reweeden mentioned. Or perhaps you could make the TEA stack dependent on the log group resource (although you'd have to hard-code the log group name since that would end up being a circular reference otherwise).
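A rough, untested sketch of that second idea, for illustration only: the log group name is written as a literal so that the resource no longer references module.thin_egress_app, which is what would allow the existing module block to depend on it without a cycle. The literal name below is simply the one from the error message in the PR description and is an assumption; each deployment would need its own value.

```hcl
locals {
  # Assumption: placeholder name copied from the error message. It is
  # hard-coded so this resource does not reference module.thin_egress_app.
  egress_lambda_log_group_name = "/aws/lambda/my-environment-thin-egress-app-EgressLambda"
}

resource "aws_cloudwatch_log_group" "egress_lambda_log_group" {
  count             = var.log_destination_arn != null ? 1 : 0
  name              = local.egress_lambda_log_group_name
  retention_in_days = var.egress_lambda_log_retention_days
}

# The existing module "thin_egress_app" block would then gain:
#   depends_on = [aws_cloudwatch_log_group.egress_lambda_log_group]
```

This would only help on a fresh deployment, where Terraform can create the log group before the Lambda exists. If the Lambda has already written logs and created the group itself, the "already exists" error would presumably still occur until the group is imported or deleted.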

  resource "aws_cloudwatch_log_group" "egress_lambda_log_group" {
-   count = (var.log_destination_arn != null) ? 1 : 0
+   count = (var.log_destination_arn != null && data.aws_cloudwatch_log_group.egress_lambda_log_group == null) ? 1 : 0
    name = "/aws/lambda/${module.thin_egress_app.egress_lambda_name}"
    retention_in_days = var.egress_lambda_log_retention_days
Collaborator

This is something that NSIDC has run into recently and we don't have a solid solution for it yet. If the lambda writes to this log group before it's defined in the TF, then (with these changes), I don't think the retention_in_days would get set, right?

@lindsleycj
Collaborator Author

lindsleycj commented Mar 15, 2024

@reweeden @mikedorfman I have reverted my previous changes and have instead added an example import file. I can't just include it as a .tf file for two reasons: (1) the very first time you deploy a Cumulus stack, the log group doesn't exist and the import will fail; (2) Terraform is really particular about how you define the id to import. I couldn't even use a fully defined string variable. What do you think of this?
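For readers unfamiliar with import blocks, here is a minimal sketch of what such an import file could look like. It is an illustration under stated assumptions, not necessarily the file added in this PR: the log group name is a placeholder taken from the error message in the PR description, and the block assumes the aws_cloudwatch_log_group.egress_lambda_log_group resource in thin-egress.tf is still declared with count.

```hcl
# Assumption: this file is only added after the first deployment, once the
# log group already exists; otherwise the import fails.
import {
  # [0] because the target resource is declared with count.
  to = aws_cloudwatch_log_group.egress_lambda_log_group[0]

  # The import id for a log group is its name. It is written as a literal
  # string here, matching the restriction described above.
  id = "/aws/lambda/my-environment-thin-egress-app-EgressLambda"
}
```

With a block like this in place, the next plan should show the log group being imported into state instead of created.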

@lindsleycj
Collaborator Author

Probably the better way to handle all of this would be to do it in the TEA Cloudformation stack, but I assume that's not trivial either.

@lindsleycj
Collaborator Author

I have tried another approach: adding an import-thin-egress-log target to the Makefile. That way we can make use of the DEPLOY-NAME and MATURITY. So, if you get the "log group already exists" error during make cumulus, you can run make import-thin-egress-log and then make cumulus again. What do you think?

@lindsleycj lindsleycj self-assigned this Apr 18, 2024
@lindsleycj lindsleycj changed the title only create egress_lambda_log_group if it doesn't already exist Allow importing of egress_lambda_log_group if getting an "already exists" error Apr 18, 2024
@lindsleycj
Collaborator Author

@reweeden @mikedorfman I renamed this PR to better represent the solution I settled on. It has been working for me for a few deployments. Is it ok to merge it?

@mikedorfman
Collaborator

mikedorfman commented May 29, 2024

I'm sorry for the very long delay! I didn't realize there were outstanding questions. @lindsleycj one other thought I had as I work through issues related to this - could you rename the log group to something other than the default lambda log group (eg here, instead of /aws/lambda..., specify /cumulus/lambda)? I'll test out this avenue to see how well it works but wanted to see if you had tried that.

Update: you can disregard my comment above. We wouldn't be able to specify a different log group for the lambda to use unless we updated the TEA CloudFormation. Even if we could, we'd be in the same chicken-and-egg problem as before with the new name. Hmmm.

@mikedorfman
Collaborator

This looks reasonable to me. To make this more automated, we could potentially put this in the cumulus target similar to how we have it set up in the tf target. But this looks good to me.

@lindsleycj lindsleycj merged commit 52ed7be into master May 31, 2024
2 checks passed