Skip to content

Conversation

shikha372
Copy link
Contributor

@shikha372 shikha372 commented Mar 15, 2025

Issue # (if applicable)

Related to one of the related issue for EKS and VPC in #22025 .

Reason for this change

Currently EKS cluster with private endpoint access fails during cdk synth operation for a lookedup VPC.
investigating further into lookup implementation, the VPC id is first populated through some dummy values including the one for private subnets :

    {
      name: 'Private',
      type: cxapi.VpcSubnetGroupType.PRIVATE,
      subnets: [
        {
          availabilityZone: 'dummy1a',
          subnetId: 'p-12345',
          routeTableId: 'rtb-12345p',
          cidr: '1.2.3.4/5',
        },

But there are no dummy values defined for the case of privatesubnetIds :

privateSubnetIds: undefined,

which results in return of filtering by IDs option as a null object until the values are fully resolved.
Reference code: https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-ec2/lib/vpc.ts#L2705

Description of changes

using existing field isPendingLookup in the SubnetSelection which is being set on the basis of this.incompleteSubnetDefinition = isIncomplete; where isIncomplete is set to false during first pass of cdk synth.
So during first synth operation, validation will be skipped.

Describe any new or updated permissions being added

NA

Description of how you validated changes

Added unit test and integration test

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added the p2 label Mar 15, 2025
@aws-cdk-automation aws-cdk-automation requested a review from a team March 15, 2025 01:03
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Mar 15, 2025
Copy link

codecov bot commented Mar 15, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.38%. Comparing base (8b23b5d) to head (89ecdf6).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #33786   +/-   ##
=======================================
  Coverage   82.38%   82.38%           
=======================================
  Files         120      120           
  Lines        6955     6955           
  Branches     1173     1173           
=======================================
  Hits         5730     5730           
  Misses       1120     1120           
  Partials      105      105           
Flag Coverage Δ
suite.unit 82.38% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
packages/aws-cdk ∅ <ø> (∅)
packages/aws-cdk-lib/core 82.38% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@shikha372 shikha372 force-pushed the shilkagg/fix-eks branch 2 times, most recently from 51d0172 to 83c35a0 Compare March 17, 2025 21:44
if (privateSubnets.length === 0 && publicAccessDisabled) {
// no private subnets and no public access at all, no good.
throw new Error('Vpc must contain private subnets when public endpoint access is disabled');
if (!isLookedUpVPC && !isSubnetIdFilter) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that LookedUpVpc is always the type when using Vpc.fromLookup. Its just that on the first synth, its values are dummy ones. The condition you are proposing here will make us skip the validation entirely, but actually we just want to skip it on the first synth, right?

So the condition needs to be based on the state of the VPC, not on its type.

There a couple other options to address this issue I think:

Subnet Selection

We already have a flag on SubnetSelection that tells us whether a lookup is pending:

readonly isPendingLookup?: boolean;

SubnetSelection is used extensively in the EKS module. I don't have it all laid out, but can we leverage this to determine we don't need this validation?

incompleteSubnetDefinition

VpcBase has a field indicating whether we should validate on subnets or not:

/**
* If this is set to true, don't error out on trying to select subnets
*/
protected incompleteSubnetDefinition: boolean = false;

Its not currently public though, but worst case, we can discuss making it so.

Copy link
Contributor Author

@shikha372 shikha372 Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't know about it, its pretty useful and solves this problem, refactored code to use isPendingLookup, which returns false during dummyValue which i believe is due to dummy value being undefined under getValue (please correct me if the understanding is not correct), which results in LookedUpVpc with incompleteSubnetDefinition to true

const attributes: cxapi.VpcContextResponse = ContextProvider.getValue(scope, {
      provider: cxschema.ContextProvider.VPC_PROVIDER,
      props: {
        ...overrides,
        filter,
        returnAsymmetricSubnets: true,
        returnVpnGateways: options.returnVpnGateways,
        subnetGroupNameTag: options.subnetGroupNameTag,
      } as cxschema.VpcContextQuery,
      dummyValue: undefined,
    }).value;

new LookedUpVpc(scope, id, attributes ?? DUMMY_VPC_PROPS, attributes === undefined);

constructor(scope: Construct, id: string, props: cxapi.VpcContextResponse, isIncomplete: boolean)
this.incompleteSubnetDefinition = isIncomplete;

We don't need to make incompleteSubnetDefinition public as we can fetch the corresponding isPendingLookup from selectSubnets method

@iliapolo
Copy link
Contributor

iliapolo commented Mar 19, 2025

@shikha372 Please also amend the PR title to briefly describe what the issue is. Avoid generic terms like "fixing".

For example: fix(eks): looked up vpc causing premature validation errors.

@shikha372 shikha372 force-pushed the shilkagg/fix-eks branch 2 times, most recently from c207f2e to e112c7f Compare March 19, 2025 22:52
@shikha372 shikha372 changed the title fix(eks): fixing the filtering by subnetIDs option fix(eks): looked up vpc causing premature validation errors for private subnets Mar 19, 2025
@shikha372 shikha372 marked this pull request as ready for review March 20, 2025 00:04
diffAssets: false,
});

app.synth();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this was intentional, but a note would be good to make your intention clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this was not intentional, but i don't think its an issue as we don't need it anymore for integration tests.

const stack = new cdk.Stack(app, 'StackWithVpc', {
env: {
region: 'eu-west-1',
account: '123456',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain how this successfully deploys? sorry I might just not be familiar with the magic that replaces/overrides this at deploy time, and why is this needed?

});

// Look up an existing VPC
const vpc = ec2.Vpc.fromLookup(stack, 'Vpc', {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand how this test works...the VPC you are looking up gets deployed in this same stack, so during lookup (synthesis), the VPC does not exist.

My guess is that the CLI just gives up and proceeds without fetching lookup values:

https://github.com/aws/aws-cdk-cli/blob/2dc5a564d8225f99063afd80bf4067770b0b0291/packages/aws-cdk/lib/api/cxapp/cloud-executable.ts#L81-L83

Can you confirm? You probably need to separate stacks here. Though I actually think an integ test is not really required here, the unit tests seem sufficient - can you elaborate on what additional value the integ tests give you?

Copy link
Contributor Author

@shikha372 shikha372 Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually i was referring for existing VPC lookup test here , I tried to model it with two stacks but seems when we set enableLookups in integ test it is not honoring the stack dependency and fails with the value not found.

// Add explicit dependency to ensure VPC stack is deployed first

clusterStack.addDependency(vpcStack);

new integ.IntegTest(app, 'aws-cdk-eks-cluster-private-endpoint-subnet-filter', {
    testCases: [vpcStack],
    // Test includes assets that are updated weekly. If not disabled, the upgrade PR will fail.
    diffAssets: false,
    enableLookups: true,
    stackUpdateWorkflow: false,
}); 

But i;ve tested the validation with a sample CDK stack application linked to update aws-cdk module with fix;

export class CdkVpcAppStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const vpc = ec2.Vpc.fromLookup(this, 'Vpc', { 
      vpcName: 'aws-cdk-eks-cluster-private-endpoint-test',
      isDefault: false });
     
    new eks.Cluster(this, 'eks-cluster', {
      vpc,
      version: eks.KubernetesVersion.V1_31,
      kubectlLayer: new KubectlV31Layer(this, 'KubectlLayer'),
      endpointAccess: eks.EndpointAccess.PRIVATE,
      vpcSubnets: [
        {
          // subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS,
          subnetFilters: [
            ec2.SubnetFilter.byIds(['subnet-XXXXX']),
          ],
        }
      ]
    });
  }
}

@shikha372 shikha372 force-pushed the shilkagg/fix-eks branch 2 times, most recently from aa1175e to dababbd Compare March 24, 2025 23:08
Copy link
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This review is outdated)

@shikha372 shikha372 added the pr-linter/exempt-integ-test The PR linter will not require integ test changes label Mar 26, 2025
@aws-cdk-automation aws-cdk-automation dismissed their stale review March 26, 2025 17:43

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@godwingrs22 godwingrs22 self-assigned this Mar 26, 2025
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we should also perform below validation here only if there is no pending lookups? Is it applicable?

    if (placeClusterHandlerInVpc && privateSubnets.length === 0) {
      throw new Error('Cannot place cluster handler in the VPC since no private subnets could be selected');
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@godwingrs22 godwingrs22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@shikha372 shikha372 requested a review from iliapolo March 26, 2025 21:14
@shikha372 shikha372 dismissed iliapolo’s stale review March 27, 2025 17:30

addressed requested changes and reviewed with abstractions team for approval

Copy link
Contributor

mergify bot commented Mar 27, 2025

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@shikha372
Copy link
Contributor Author

@Mergifyio update

Copy link
Contributor

mergify bot commented Mar 27, 2025

update

☑️ Nothing to do

  • #commits-behind > 0 [📌 update requirement]
  • queue-position = -1 [📌 update requirement]
  • -closed [📌 update requirement]
  • -conflict [📌 update requirement]

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 89ecdf6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Contributor

mergify bot commented Mar 27, 2025

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit 73744b4 into aws:main Mar 27, 2025
20 checks passed
Copy link
Contributor

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 27, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

contribution/core This is a PR that came from AWS. p2 pr-linter/exempt-integ-test The PR linter will not require integ test changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants