Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(custom-resource): implement lambda function that validate certificate and add aliases for NLB #3057

Merged
merged 19 commits into from
Nov 23, 2021

Conversation

Lou1415926
Copy link
Contributor

@Lou1415926 Lou1415926 commented Nov 18, 2021

This PR contains only logic for CREATE action for env-level aliases. UPDATE and DELETE will be implemented in a following pull request.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the Apache 2.0 License.

@Lou1415926 Lou1415926 requested a review from a team as a code owner November 18, 2021 21:11
@Lou1415926 Lou1415926 requested review from efekarakus and removed request for a team November 18, 2021 21:11
@Lou1415926 Lou1415926 changed the title chore(custom-resource) implement lambda function that validate certificate and add aliases for NLB chore(custom-resource): implement lambda function that validate certificate and add aliases for NLB Nov 18, 2021
Copy link
Contributor

@iamhopaul123 iamhopaul123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall! Just a reminder even for create we still need to handle the case when alias is in app/root hosted zone.

const { CertificateArn } =await acm.requestCertificate({
DomainName: certificateDomain,
IdempotencyToken: physicalResourceID,
SubjectAlternativeNames: aliases.size === 0? null: [...aliases],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it throw an error if we pass in an empty array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SubjectAlternativeNames needs to either be specified with an array with length >= 1, or not specified at all (by not including the key, or having null as its value). So it'll throw an error if we pass in an empty array, but it won't if we pass in a null.

async function waitForValidationOptionsToBeReady(certificateARN, aliases) {
let expectedCount = aliases.size + 1; // Expect one validation option for each alias and the cert domain.

let attempt; // TODO: This wait loops could be further abstracted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess when we add update/delete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes in a following PR - especially because DELETE also needs to have a wait loop like this. I think it may improve code's readability if this can be abstracted.

cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
if (!recordSet || recordSet.length === 0) {
continue; // The alias record does not exist in the hosted zone, hence valid.
}
if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) {
if (recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an ECMA 2020 syntax (by default we are using ECMA 5 I think). I tried enabling the ECMA 2020 parser but it still fails to acknowledge the optional chaining syntax. I will use this syntax after I figure out how to really enable ECMA 2020 :)

if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) {
continue; // The record is an alias record and is in use by myself, hence valid.
}
throw new Error(`alias ${alias} is in use`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe alias is in used by xxx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure if we should include the record value or alias target in the error message 🤔 If the custom domain is in use, it's possible that it's in use by another service (hence the alias target is a copilot LB's DNS); however, it's also possible that the record is a non-alias record, pointing to whatever address.

@Lou1415926 Lou1415926 force-pushed the nlb/lambda branch 2 times, most recently from 1f03dfd to b4225e4 Compare November 19, 2021 18:27
cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
certificateDomain = `${serviceName}-nlb.${envName}.${appName}.${domainName}`;

let aliasesSorted = [...aliases].sort().join(",");
const physicalResourceID = `/${serviceName}/${aliasesSorted}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Do you know if there is a limit on the length of the physical resource ID? Would you mind investigating that, I tried a quick search but couldn't find anything might be worthwhile experimenting.

  2. Using the aliases in the resource ID makes total sense because we want to delete the old certificate right if the alias changes? if so can we add that as a short comment here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To answer 2.:
Yes. For each unique combination of [service name + alias names], the [certificate ('s subject alternative name) + validation records + A records] needed should be the same. If there is no change in [service name + alias names], then the certificate shouldn't change, neither do the validation records and the A records; on the contrary, if a change is detected in [service name + alias names], then a replacement surely needs to happen to at least one of the three resources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline but I'm posting here so that this info is shared:
Answering 1.: There is virtually no limit on Physical Resource ID length - tried a 800-character long ID, and it was fine. However, if the ID is too long, the response object sent from the lambda will be too big, causing the custom resource to fail.

if (recordSet[0].AliasTarget && recordSet[0].AliasTarget.DNSName === `${loadBalancerDNS}.`) {
return; // The record is an alias record and is in use by myself, hence valid.
}
throw new Error(`alias ${alias} is in use`);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also add additional info for which LB uses it instead of the service's? something like:

Suggested change
throw new Error(`alias ${alias} is in use`);
throw new Error(`Alias '${alias}' is already in use by another load balancer '${recordSet[0].AliasTarget.DNSName}'`);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update the error message to indicate that the alias "could" be another load balancer if the alias target isn't null.

It is still possible that the domain is in use because people manually added a, e.g. CNAME, record using the domain routing to an IP address or another domain (not necessarily an alias to another service).

cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
};

try {
await handler();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we Promise.race the handler with a 13-14 min timeout so that we avoid the scenario where the custom resource is stuck for 3h:

Set reasonable timeout periods, and report when they're about to be exceeded
If an operation doesn't run within its defined timeout period, the function raises an exception and no response is sent to AWS CloudFormation.
https://aws.amazon.com/premiumsupport/knowledge-center/best-practices-custom-cf-lambda/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I was planning to add the lambda time out later since it introduces more testing logic into this PR, but I can do it now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added!

}
}];

if (option.DomainName !== certificateDomain) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me to understand the domain name is indeed updated when any alias is set right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the option.DomainName would be the alias domain name.

The option is given by acm when it provides us with the validation options for each aliases. Each option should contain at least the following information:

  1. The domain it's validated (i.e. the alias)
  2. The validation record name
  3. The validation record value

@efekarakus efekarakus added the do-not-merge Pull requests that mergify shouldn't merge until the requester allows it. label Nov 22, 2021
Copy link
Contributor

@efekarakus efekarakus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :shipit: after addressing feedback below

cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
cf-custom-resources/lib/nlb-cert-validator-updater.js Outdated Show resolved Hide resolved
@efekarakus efekarakus added do-not-merge Pull requests that mergify shouldn't merge until the requester allows it. and removed do-not-merge Pull requests that mergify shouldn't merge until the requester allows it. labels Nov 22, 2021
@mergify mergify bot merged commit df3c7a6 into aws:mainline Nov 23, 2021
mergify bot pushed a commit that referenced this pull request Nov 24, 2021
…r NLB (#3070)

The previous PR #3075 takes into consideration only the case where the aliases are environment-level (e.g. `a.env.app.domain.com`); however, they could also be app-level (e.g. `a.app.domain.com`) or root-level (e.g. `a.domain.com`). This PR extends the cases considered as such.

In addition, the PR adds lazy-loading into the script.

Previous PR: #3057 

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the Apache 2.0 License.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants