-
Notifications
You must be signed in to change notification settings - Fork 413
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(custom-resource): implement lambda function that validate certificate and add aliases for NLB #3057
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall! Just a reminder even for create we still need to handle the case when alias is in app/root hosted zone.
const { CertificateArn } =await acm.requestCertificate({ | ||
DomainName: certificateDomain, | ||
IdempotencyToken: physicalResourceID, | ||
SubjectAlternativeNames: aliases.size === 0? null: [...aliases], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it throw an error if we pass in an empty array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SubjectAlternativeNames
needs to either be specified with an array with length >= 1, or not specified at all (by not including the key, or having null
as its value). So it'll throw an error if we pass in an empty array, but it won't if we pass in a null
.
async function waitForValidationOptionsToBeReady(certificateARN, aliases) { | ||
let expectedCount = aliases.size + 1; // Expect one validation option for each alias and the cert domain. | ||
|
||
let attempt; // TODO: This wait loops could be further abstracted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess when we add update/delete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes in a following PR - especially because DELETE also needs to have a wait loop like this. I think it may improve code's readability if this can be abstracted.
if (!recordSet || recordSet.length === 0) { | ||
continue; // The alias record does not exist in the hosted zone, hence valid. | ||
} | ||
if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) { | |
if (recordSet[0].AliasTarget?.DNSName === `${loadBalancerDNS}.`) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an ECMA 2020 syntax (by default we are using ECMA 5 I think). I tried enabling the ECMA 2020 parser but it still fails to acknowledge the optional chaining syntax. I will use this syntax after I figure out how to really enable ECMA 2020 :)
if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) { | ||
continue; // The record is an alias record and is in use by myself, hence valid. | ||
} | ||
throw new Error(`alias ${alias} is in use`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe alias is in used by xxx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure if we should include the record value or alias target in the error message 🤔 If the custom domain is in use, it's possible that it's in use by another service (hence the alias target is a copilot LB's DNS); however, it's also possible that the record is a non-alias record, pointing to whatever address.
1f03dfd
to
b4225e4
Compare
…of call arguments for each error cases
certificateDomain = `${serviceName}-nlb.${envName}.${appName}.${domainName}`; | ||
|
||
let aliasesSorted = [...aliases].sort().join(","); | ||
const physicalResourceID = `/${serviceName}/${aliasesSorted}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Do you know if there is a limit on the length of the physical resource ID? Would you mind investigating that, I tried a quick search but couldn't find anything might be worthwhile experimenting.
-
Using the aliases in the resource ID makes total sense because we want to delete the old certificate right if the alias changes? if so can we add that as a short comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer 2.:
Yes. For each unique combination of [service name + alias names]
, the [certificate ('s subject alternative name) + validation records + A records] needed should be the same. If there is no change in [service name + alias names]
, then the certificate shouldn't change, neither do the validation records and the A records; on the contrary, if a change is detected in [service name + alias names]
, then a replacement
surely needs to happen to at least one of the three resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked offline but I'm posting here so that this info is shared:
Answering 1.: There is virtually no limit on Physical Resource ID length - tried a 800-character long ID, and it was fine. However, if the ID is too long, the response object sent from the lambda will be too big, causing the custom resource to fail.
if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) { | ||
return; // The record is an alias record and is in use by myself, hence valid. | ||
} | ||
throw new Error(`alias ${alias} is in use`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also add additional info for which LB uses it instead of the service's? something like:
throw new Error(`alias ${alias} is in use`); | |
throw new Error(`Alias '${alias}' is already in use by another load balancer '${recordSet[0].AliasTarget.DNSName}'`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update the error message to indicate that the alias "could" be another load balancer if the alias target isn't null.
It is still possible that the domain is in use because people manually added a, e.g. CNAME, record using the domain routing to an IP address or another domain (not necessarily an alias to another service).
}; | ||
|
||
try { | ||
await handler(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we Promise.race
the handler with a 13-14 min timeout so that we avoid the scenario where the custom resource is stuck for 3h:
Set reasonable timeout periods, and report when they're about to be exceeded
If an operation doesn't run within its defined timeout period, the function raises an exception and no response is sent to AWS CloudFormation.
https://aws.amazon.com/premiumsupport/knowledge-center/best-practices-custom-cf-lambda/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I was planning to add the lambda time out later since it introduces more testing logic into this PR, but I can do it now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added!
} | ||
}]; | ||
|
||
if (option.DomainName !== certificateDomain) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me to understand the domain name is indeed updated when any alias
is set right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the option.DomainName
would be the alias domain name.
The option
is given by acm
when it provides us with the validation options for each aliases. Each option should contain at least the following information:
- The domain it's validated (i.e. the alias)
- The validation record name
- The validation record value
554c7a6
to
e089133
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after addressing feedback below
…r NLB (#3070) The previous PR #3075 takes into consideration only the case where the aliases are environment-level (e.g. `a.env.app.domain.com`); however, they could also be app-level (e.g. `a.app.domain.com`) or root-level (e.g. `a.domain.com`). This PR extends the cases considered as such. In addition, the PR adds lazy-loading into the script. Previous PR: #3057 By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the Apache 2.0 License.
This PR contains only logic for CREATE action for env-level aliases. UPDATE and DELETE will be implemented in a following pull request.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the Apache 2.0 License.