(aws-eks): Neuron device plugin is not installed when instance type is Trainium #29131
Labels
@aws-cdk/aws-eks
bug
effort/medium
p2
Describe the bug
When the instance type is Trainium, the Neuron device plugin is not installed, although it should be.
Expected Behavior
When the instance type is Trainium, the Neuron device plugin is installed.
Current Behavior
When the instance type is Trainium, the Neuron device plugin is NOT installed.
Reproduction Steps
Add a node group with a Trainium instance type to an EKS cluster, for example through addNodegroupCapacity.
Possible Solution
No response
Additional Information/Context
Trainium instance types were recently added here: https://github.com/aws/aws-cdk/blame/main/packages/aws-cdk-lib/aws-ec2/lib/instance-types.ts
However, packages/aws-cdk-lib/aws-eks/lib/instance-types.ts does not include them:
export const INSTANCE_TYPES = {
  gpu: ['p2', 'p3', 'g2', 'g3', 'g4'],
  inferentia: ['inf1', 'inf2'],
  graviton: ['a1'],
  graviton2: ['c6g', 'm6g', 'r6g', 't4g'],
  graviton3: ['c7g'],
};
As a result, the check in packages/aws-cdk-lib/aws-eks/lib/cluster.ts never classifies a Trainium instance type as NodeType.INFERENTIA, so addNodegroupCapacity does not call addNeuronDevicePlugin():
function nodeTypeForInstanceType(instanceType: ec2.InstanceType) {
  return INSTANCE_TYPES.gpu.includes(instanceType.toString().substring(0, 2)) ? NodeType.GPU :
    INSTANCE_TYPES.inferentia.includes(instanceType.toString().substring(0, 4)) ? NodeType.INFERENTIA :
      NodeType.STANDARD;
}

public addNodegroupCapacity(id: string, options?: NodegroupOptions): Nodegroup {
  const hasInferentiaInstanceType = [
    options?.instanceType,
    ...options?.instanceTypes ?? [],
  ].some(i => i && nodeTypeForInstanceType(i) === NodeType.INFERENTIA);
  if (hasInferentiaInstanceType) {
    this.addNeuronDevicePlugin();
  }
  ...
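One way the check could be extended is sketched below. This is only an illustration under stated assumptions, not the actual aws-cdk-lib code: it uses plain strings instead of ec2.InstanceType so that it runs on its own, and the `trainium` list, the `NodeType.TRAINIUM` member, and the `hasPrefix` and `needsNeuronDevicePlugin` helpers are hypothetical names introduced here.

```typescript
// Hypothetical sketch: a simplified stand-in for the aws-eks helpers quoted above.
// Enum values are illustrative only.
enum NodeType {
  STANDARD = 'Standard',
  GPU = 'GPU',
  INFERENTIA = 'INFERENTIA',
  TRAINIUM = 'TRAINIUM', // assumption: this member is not named in the issue
}

const INSTANCE_TYPES = {
  gpu: ['p2', 'p3', 'g2', 'g3', 'g4'],
  inferentia: ['inf1', 'inf2'],
  trainium: ['trn1'], // assumption: matched as a prefix, so it also covers trn1n
};

// True if `value` begins with any of the given prefixes.
function hasPrefix(value: string, prefixes: string[]): boolean {
  return prefixes.some(p => value.slice(0, p.length) === p);
}

// Classify an instance type string such as 'trn1.32xlarge'.
function nodeTypeForInstanceType(instanceType: string): NodeType {
  if (hasPrefix(instanceType, INSTANCE_TYPES.gpu)) return NodeType.GPU;
  if (hasPrefix(instanceType, INSTANCE_TYPES.inferentia)) return NodeType.INFERENTIA;
  if (hasPrefix(instanceType, INSTANCE_TYPES.trainium)) return NodeType.TRAINIUM;
  return NodeType.STANDARD;
}

// The Neuron device plugin is needed for both Inferentia and Trainium nodes.
function needsNeuronDevicePlugin(instanceTypes: string[]): boolean {
  return instanceTypes.some(t => {
    const nodeType = nodeTypeForInstanceType(t);
    return nodeType === NodeType.INFERENTIA || nodeType === NodeType.TRAINIUM;
  });
}

console.log(needsNeuronDevicePlugin(['trn1.32xlarge'])); // true
console.log(needsNeuronDevicePlugin(['m5.large']));      // false
```

In this sketch, addNodegroupCapacity would call addNeuronDevicePlugin() whenever needsNeuronDevicePlugin returns true for the node group's instance types. Prefix matching is used instead of the fixed-length substring in the quoted code so that families with different name lengths can share one helper.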
CDK CLI Version
2.128.0
Framework Version
No response
Node.js Version
v21.6.1
OS
macOS Sonoma 14.3
Language
TypeScript
Language Version
No response
Other information
No response