Skip to content

Conversation

@zhoxing-ms
Copy link
Contributor

@zhoxing-ms zhoxing-ms commented Feb 16, 2023

Related command

az vmss reimage

Description

The loop logic of reimaging instance from #25131 is wrong. We should use the batch operation provided by the Python SDK virtual_machine_scale_sets.begin_reimage_all. Otherwise, if the process is not returned in time after the end of the loop logic, and all instances will be re-imaged unexpectedly

In addition, in order to make the design style of parameter conform to the specification, add parameter --instance-ids to replace original parameter --instance-id

Issue Close: #25476

Testing Guide

History Notes

[Compute] az vmss reimage: Fix the bug that all instances will be reimaged after using --instance-id and add new parameter --instance-ids to replace --instance-id


This checklist is used to make sure that common guidelines for a pull request are followed.

@ghost ghost requested a review from yonzhan February 16, 2023 03:54
@ghost ghost added the Auto-Assign Auto assign by bot label Feb 16, 2023
@ghost ghost assigned zhoxing-ms Feb 16, 2023
@ghost ghost added this to the Feb 2023 (2023-03-07) milestone Feb 16, 2023
@ghost ghost added the Compute az vm/vmss/image/disk/snapshot label Feb 16, 2023
@yonzhan
Copy link
Collaborator

yonzhan commented Feb 16, 2023

Compute

yanzhudd
yanzhudd previously approved these changes Feb 16, 2023
c.argument('instance_id', nargs='+', help='Space-separated list of VM instance ID. If missing, reimage all instances.')
c.argument('instance_id', nargs='+', deprecate_info=c.deprecate(target='--instance-id', redirect='--instance-ids', hide=True),
help='Space-separated list of VM instance ID. If missing, reimage all instances.')
c.argument('instance_ids', nargs='+', help='Space-separated list of VM instance ID. If missing, reimage all instances.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making --instance-id an alias of --instance--ids and deprecate it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Updated



def reimage_vmss(cmd, resource_group_name, vm_scale_set_name, instance_id=None, no_wait=False):
def reimage_vmss(cmd, resource_group_name, vm_scale_set_name, instance_id=None, instance_ids=None, no_wait=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Retaining both instance_id and instance_ids makes the code hard to maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


if instance_ids:
VirtualMachineScaleSetVMInstanceIDs = cmd.get_models('VirtualMachineScaleSetVMInstanceIDs')
instance_ids = VirtualMachineScaleSetVMInstanceIDs(instance_ids=instance_ids)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to change the name if its type changes, in order to make the code easier to read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use instance_ids as variable name just to keep consistent with the generally existing code style, such as

def deallocate_vmss(cmd, resource_group_name, vm_scale_set_name, instance_ids=None, no_wait=False):
client = _compute_client_factory(cmd.cli_ctx)
if instance_ids and len(instance_ids) == 1:
return sdk_no_wait(no_wait, client.virtual_machine_scale_set_vms.begin_deallocate,
resource_group_name, vm_scale_set_name, instance_ids[0])
VirtualMachineScaleSetVMInstanceIDs = cmd.get_models('VirtualMachineScaleSetVMInstanceIDs')
vm_instance_i_ds = VirtualMachineScaleSetVMInstanceIDs(instance_ids=instance_ids)
return sdk_no_wait(no_wait, client.virtual_machine_scale_sets.begin_deallocate,
resource_group_name, vm_scale_set_name, vm_instance_i_ds)
def delete_vmss_instances(cmd, resource_group_name, vm_scale_set_name, instance_ids, no_wait=False):
client = _compute_client_factory(cmd.cli_ctx)
VirtualMachineScaleSetVMInstanceRequiredIDs = cmd.get_models('VirtualMachineScaleSetVMInstanceRequiredIDs')
instance_ids = VirtualMachineScaleSetVMInstanceRequiredIDs(instance_ids=instance_ids)
return sdk_no_wait(no_wait, client.virtual_machine_scale_sets.begin_delete_instances,
resource_group_name, vm_scale_set_name, instance_ids)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, vm_instance_i_ds is certainly ugly.

Comment on lines -8733 to -8734
self.cmd('vmss create -g {rg} -n {vmss} --image ubuntults --instance-count 2')
self.cmd('vmss reimage -g {rg} -n {vmss} --instance-id 0 1')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little bit curious how the test passed previously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At present, this is caused by the unstable behavior of REST service.

After the request virtual-machine-scale-sets/create-or-update is completed, if the request virtual-machine-scale-sets/reimage will be sent immediately, and the following error will occasionally occur

(InvalidParameter) The provided instanceId 0 is not an active Virtual Machine Scale Set VM instanceId.
Code: InvalidParameter
Message: The provided instanceId 0 is not an active Virtual Machine Scale Set VM instanceId.
Target: instanceIds

I guess this problem may be caused by the VM instance's state is not completely ready after the request virtual-machine-scale-sets/create-or-update is completed, because this problem can be avoided if the time interval between the reimage operations is longer

@grizzlytheodore Could you please take a look at this issue? Or you can ask the right person to look at it?

Comment on lines +3400 to +3405
- name: Reimage a VM instance within a VMSS.
text: |
az vmss reimage --instance-id 1 --name MyScaleSet --resource-group MyResourceGroup --subscription MySubscription
crafted: true
az vmss reimage --instance-ids 1 --name MyScaleSet --resource-group MyResourceGroup --subscription MySubscription
- name: Reimage a batch of VM instances within a VMSS.
text: |
az vmss reimage --instance-ids 1 2 3 --name MyScaleSet --resource-group MyResourceGroup --subscription MySubscription
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's provide an example for reimaging all instances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, done

Comment on lines +8729 to +8731
instances = self.cmd('vmss list-instances -g {rg} -n {vmss}').get_output_in_json()
self.kwargs['instance_id1'] = instances[0]['instanceId']
self.kwargs['instance_id2'] = instances[1]['instanceId']
Copy link
Member

@jiasli jiasli Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

az vmss list-instances can still return 0, 1. If we pass 0, 1 to az vmss reimage, we may still see the failure (#25476).

The correct solution is for the computer service to guarantee all instances are ready to be reimaged after az vmss create.

@jiasli
Copy link
Member

jiasli commented Feb 21, 2023

CI randomly fails due to this threading issue (#25472). Let's merge this PR ASAP.

@zhoxing-ms zhoxing-ms merged commit 682a4dc into Azure:dev Feb 21, 2023
avgale pushed a commit to avgale/azure-cli that referenced this pull request Aug 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Auto-Assign Auto assign by bot Compute az vm/vmss/image/disk/snapshot

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[vm] test_vmss_reimage_instance_id fails during live run: The provided instanceId 0 is not an active Virtual Machine Scale Set VM instanceId.

4 participants