[Compute] az vmss reimage: Fix the bug that all instances will be reimaged after using --instance-id
#25477
c.argument('instance_id', nargs='+', help='Space-separated list of VM instance ID. If missing, reimage all instances.')
c.argument('instance_id', nargs='+', deprecate_info=c.deprecate(target='--instance-id', redirect='--instance-ids', hide=True),
           help='Space-separated list of VM instance ID. If missing, reimage all instances.')
c.argument('instance_ids', nargs='+', help='Space-separated list of VM instance ID. If missing, reimage all instances.')
Consider making --instance-id an alias of --instance-ids and deprecating it?
Good suggestion! Updated
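For readers unfamiliar with the alias-and-deprecate pattern discussed above, here is a minimal sketch using the standard library's argparse rather than the azure-cli argument registry. The `DeprecatedAlias` class, `build_parser` function, and `reimage-sketch` program name are hypothetical names chosen for illustration; the actual PR relies on `c.deprecate(...)` as shown in the diff.

```python
import argparse
import warnings


class DeprecatedAlias(argparse.Action):
    """Hypothetical action: store the values, warn if the old spelling was used."""

    def __call__(self, parser, namespace, values, option_string=None):
        if option_string == "--instance-id":
            warnings.warn(
                "--instance-id is deprecated; use --instance-ids instead.",
                DeprecationWarning,
                stacklevel=2,
            )
        setattr(namespace, self.dest, values)


def build_parser():
    """Build a parser in which both spellings feed the same destination."""
    parser = argparse.ArgumentParser(prog="reimage-sketch")
    parser.add_argument(
        "--instance-ids", "--instance-id",
        dest="instance_ids",
        nargs="+",
        action=DeprecatedAlias,
        default=None,
        help="Space-separated list of VM instance IDs. "
             "If missing, reimage all instances.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--instance-ids", "0", "1"])
    print(args.instance_ids)
```

Both spellings populate a single `instance_ids` value, so downstream code only has to handle one parameter, which is the maintainability point raised later in this review.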
def reimage_vmss(cmd, resource_group_name, vm_scale_set_name, instance_id=None, no_wait=False):
def reimage_vmss(cmd, resource_group_name, vm_scale_set_name, instance_id=None, instance_ids=None, no_wait=False):
Retaining both instance_id and instance_ids makes the code hard to maintain.
Updated
if instance_ids:
    VirtualMachineScaleSetVMInstanceIDs = cmd.get_models('VirtualMachineScaleSetVMInstanceIDs')
    instance_ids = VirtualMachineScaleSetVMInstanceIDs(instance_ids=instance_ids)
It is better to change the variable's name when its type changes, to make the code easier to read.
I used instance_ids as the variable name only to stay consistent with the existing code style, such as
azure-cli/src/azure-cli/azure/cli/command_modules/vm/custom.py
Lines 3541 to 3558 in 917fba1
def deallocate_vmss(cmd, resource_group_name, vm_scale_set_name, instance_ids=None, no_wait=False):
    client = _compute_client_factory(cmd.cli_ctx)
    if instance_ids and len(instance_ids) == 1:
        return sdk_no_wait(no_wait, client.virtual_machine_scale_set_vms.begin_deallocate,
                           resource_group_name, vm_scale_set_name, instance_ids[0])
    VirtualMachineScaleSetVMInstanceIDs = cmd.get_models('VirtualMachineScaleSetVMInstanceIDs')
    vm_instance_i_ds = VirtualMachineScaleSetVMInstanceIDs(instance_ids=instance_ids)
    return sdk_no_wait(no_wait, client.virtual_machine_scale_sets.begin_deallocate,
                       resource_group_name, vm_scale_set_name, vm_instance_i_ds)


def delete_vmss_instances(cmd, resource_group_name, vm_scale_set_name, instance_ids, no_wait=False):
    client = _compute_client_factory(cmd.cli_ctx)
    VirtualMachineScaleSetVMInstanceRequiredIDs = cmd.get_models('VirtualMachineScaleSetVMInstanceRequiredIDs')
    instance_ids = VirtualMachineScaleSetVMInstanceRequiredIDs(instance_ids=instance_ids)
    return sdk_no_wait(no_wait, client.virtual_machine_scale_sets.begin_delete_instances,
                       resource_group_name, vm_scale_set_name, instance_ids)
Ugh, vm_instance_i_ds is certainly ugly.
self.cmd('vmss create -g {rg} -n {vmss} --image ubuntults --instance-count 2')
self.cmd('vmss reimage -g {rg} -n {vmss} --instance-id 0 1')
I am a little bit curious how the test passed previously.
At present, this is caused by unstable behavior of the REST service.
If the virtual-machine-scale-sets/reimage request is sent immediately after the virtual-machine-scale-sets/create-or-update request completes, the following error occasionally occurs:
(InvalidParameter) The provided instanceId 0 is not an active Virtual Machine Scale Set VM instanceId.
Code: InvalidParameter
Message: The provided instanceId 0 is not an active Virtual Machine Scale Set VM instanceId.
Target: instanceIds
I guess this happens because the VM instances are not yet fully ready when the virtual-machine-scale-sets/create-or-update request completes: the problem can be avoided by leaving a longer interval before the reimage operation.
@grizzlytheodore Could you please take a look at this issue? Or you can ask the right person to look at it?
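As an illustration of the "wait longer" workaround described in this comment, here is a minimal retry sketch. It is not part of the PR: `retry_when_not_active` is a hypothetical helper, `RuntimeError` stands in for whatever exception type the SDK actually raises, and matching on the message text is an assumption made only because that text is the one quoted above.

```python
import time

# Message fragment taken from the error quoted in this thread.
NOT_ACTIVE = "is not an active Virtual Machine Scale Set VM instanceId"


def retry_when_not_active(operation, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Call operation(); retry only when it fails with the 'not active' error.

    RuntimeError is a stand-in for the real SDK exception type (assumption).
    The sleep function is injectable so the helper can be tested without waiting.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RuntimeError as err:
            # Re-raise unrelated errors, and give up after the last attempt.
            if NOT_ACTIVE not in str(err) or attempt == attempts:
                raise
            sleep(delay_seconds)
```

A test could wrap the reimage call in such a helper, although this only masks the race on the client side; it does not make the service report instance readiness.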
- name: Reimage a VM instance within a VMSS.
  text: |
    az vmss reimage --instance-id 1 --name MyScaleSet --resource-group MyResourceGroup --subscription MySubscription
  crafted: true
    az vmss reimage --instance-ids 1 --name MyScaleSet --resource-group MyResourceGroup --subscription MySubscription
- name: Reimage a batch of VM instances within a VMSS.
  text: |
    az vmss reimage --instance-ids 1 2 3 --name MyScaleSet --resource-group MyResourceGroup --subscription MySubscription
Let's provide an example for reimaging all instances.
OK, done
instances = self.cmd('vmss list-instances -g {rg} -n {vmss}').get_output_in_json()
self.kwargs['instance_id1'] = instances[0]['instanceId']
self.kwargs['instance_id2'] = instances[1]['instanceId']
az vmss list-instances can still return 0, 1. If we pass 0, 1 to az vmss reimage, we may still see the failure (#25476).
The correct solution is for the compute service to guarantee that all instances are ready to be reimaged after az vmss create.
CI randomly fails due to this threading issue (#25472). Let's merge this PR ASAP.
…eimaged after using `--instance-id` (Azure#25477)
Related command
az vmss reimage
Description
The loop logic for reimaging instances from #25131 is wrong. We should use the batch operation provided by the Python SDK, virtual_machine_scale_sets.begin_reimage_all. Otherwise, if the function does not return in time after the loop finishes, all instances will be reimaged unexpectedly.
In addition, to make the parameter design conform to the specification, add the parameter --instance-ids to replace the original parameter --instance-id.
Issue Close: #25476
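The fall-through described above can be sketched with stand-in objects. This is not the code from #25131 or from this PR: `FakeScaleSets`, `FakeScaleSetVMs`, `reimage_looping`, and `reimage_batched` are hypothetical names, the fakes only record calls, and a plain list is passed where the real SDK expects a `VirtualMachineScaleSetVMInstanceIDs` model.

```python
class FakeScaleSets:
    """Hypothetical stand-in for client.virtual_machine_scale_sets."""

    def __init__(self):
        self.calls = []

    def begin_reimage_all(self, resource_group, name, vm_instance_i_ds=None):
        # None means "no filter": every instance in the scale set is reimaged.
        self.calls.append(("reimage_all", vm_instance_i_ds))


class FakeScaleSetVMs:
    """Hypothetical stand-in for client.virtual_machine_scale_set_vms."""

    def __init__(self):
        self.calls = []

    def begin_reimage(self, resource_group, name, instance_id):
        self.calls.append(("reimage", instance_id))


def reimage_looping(vms, scale_sets, rg, name, instance_ids=None):
    """One way the bug can arise: the loop never returns, so control falls through."""
    if instance_ids:
        for instance_id in instance_ids:
            vms.begin_reimage(rg, name, instance_id)
        # Missing return: the whole-set call below still runs.
    scale_sets.begin_reimage_all(rg, name)


def reimage_batched(scale_sets, rg, name, instance_ids=None):
    """Single batch call: the ID list (or None for all instances) goes in one request."""
    ids = list(instance_ids) if instance_ids else None
    scale_sets.begin_reimage_all(rg, name, ids)


if __name__ == "__main__":
    vms, sets = FakeScaleSetVMs(), FakeScaleSets()
    reimage_looping(vms, sets, "rg", "vmss", ["0", "1"])
    print(vms.calls, sets.calls)  # the second list shows the unintended whole-set call

    fixed = FakeScaleSets()
    reimage_batched(fixed, "rg", "vmss", ["0", "1"])
    print(fixed.calls)
```

With the batched form there is exactly one code path, so passing instance IDs can no longer trigger a second, unfiltered reimage.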
Testing Guide
History Notes
[Compute] az vmss reimage: Fix the bug that all instances will be reimaged after using --instance-id, and add new parameter --instance-ids to replace --instance-id
This checklist is used to make sure that common guidelines for a pull request are followed.
The PR title and description has followed the guideline in Submitting Pull Requests.
I adhere to the Command Guidelines.
I adhere to the Error Handling Guidelines.