Skip to content

Conversation

@derekbekoe
Copy link
Member

@derekbekoe derekbekoe commented Oct 11, 2016

Closes #835

This commit implements this for the following modules only:

  • component
  • vm

(other modules to come in later commits)

Although the PR is DO NOT MERGE, it is open for comments before I extend this strategy out to other command modules.
After completing the change for other command modules, I'll remove the DO NOT MERGE.


This change is Reviewable

Copy link
Member

@tjprescott tjprescott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give a brief overview of the main changes and/or update the command authoring doc so we can see what these changes would mean as far as authoring new commands? It seems like these changes trade increases in speed for a lot of small multi-line callables that could make it more difficult for a third party just trying to add their new commands.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't this an optimization to speed up loading times? It seems removing it would slow down a bunch of cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This optimization also put a limitation on how commands are loaded.
It meant that commands from other modules could not contribute to a top-level command because of this shortcut.

If the overhead of loading all modules is not high, it makes sense to do this and that's the aim of this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how I understand it: we load all modules, but they are bare bones loading that pretty much just loads the command names and the module they are in. Then, based on which command we are executing, we load the params file for ONLY the module that that command is in. So while we build the whole command table, most of it is essentially empty. And then we only load the client factory for the one command we are executing, only when we execute it. Is that close?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's right.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we attempting to load only the specific command we are running instead of all modules or a single module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We get commands from all modules.
But only load the params for the module with the command we are about to execute.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why load all commands from all modules? The only time we need multiple commands from any module is if you specify --help/-h on a group.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the user runs az, we need to have loaded the commands from all modules to show the list of available commands and command groups.

Also, as command groups can be made up of commands from multiple packages, we need to load all commands from all modules to know what's available.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#Resolved

Copy link
Member

@tjprescott tjprescott Oct 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this comment means. Don't we already have the command (from line 101)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check at 101 is rudimentary and doesn't do everything that argparse does.
If self.parser.parse_args() returns, we know it parsed successfully so we can get the command from argparse.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this logic now.
#Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we load_params and load_command_table up above and down here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ones above are only for help of if completer is enabled.
I'll change the logic to make it clearer.

We call load_command_table again though because load_params will have loaded some changes that we need to pass to the parser.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here has been changed.
#Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't whether the description needs to be loaded depend on whether help is requested and not whether a completer is active?

Copy link
Member Author

@derekbekoe derekbekoe Oct 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When help is requested, it's too late to dynamically try and load the description as the command table has already been passed to the parser.

With this approach, the description is only loaded when a command is actually executing so the overhead (about 0.1/0.2 secs) is small in comparison to whatever the command is doing.

With tab completion, we don't need the description so can remove this overhead.

That was the thinking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we prescreen for the help flag (at least we do for az -h)? If so we would know ahead of time whether the user is asking for help and if not, we wouldn't need to load the description at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going to link to this issue that was created.
#981

If we want to support that, we need to be able to load the description even without --help / -h specified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but I think you could do that with a table cache:

command type
az vm group
az vm create command

Then as part of that pre-screening, if it is a group or a command with help requested, you load the description stuff.

Of course, this only make sense if the time savings justify the work and you are in the best position to know that, not me. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the command to module map stored?

Copy link
Member Author

@derekbekoe derekbekoe Oct 11, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's dynamically created on each invocation (not persistent).
See azure/cli/core/commands/init.py command_module_map = {}.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we have to load all command modules to build the command to module map? Does it then cache the result to file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result isn't cached to file.
It's used so that when we come to load the params for a command, we know which module to load the params for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to the trouble of restructuring all command modules for speed, why would we not cache this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caching has downsides:

  • cache can become stale
  • how do we keep the cache correct
  • when do we refresh it
  • where do we save the cache

I didn't want to use caching unless we really have to (e.g. if tab completion is still considered too slow after this change).

For users that don't use tab completion, the cache would have a minimal benefit as a network request can take 0.5 seconds anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean network caching. I mean why not save that command table (the bare bones now that basically maps the command names to their module) as a file on their system. If present it can just look up the command, and load that module etc instead of building everything. The mapping would only change for normal users if they updated their modules (which would be the obvious point to wipe that file if it exists) or in a dev scenario (where you could just disable caching entirely).

I do agree though, if rebuilding is "fast enough" then there's no point in doing the extra work. Of course, what is fast enough ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it called *_with_handler? Is it just temporary to distinguish it from the regular cli_command method name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was temporary so the 'old way' will still work as I haven't changed everything over.
Once done, we'll just have cli_command() that behaves in the new way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed cli_command_with_handler and just modified cli_command instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is op_lazy a temporary variable that will go away once the transition is complete?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly. Once the transition is complete, all commands will be loaded in this lazy fashion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the op_lazy param.
#Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a lot of tedious boilerplate code. Could we not specify the operation by a string and then import it lazily within the core execution rather than with a callable for each command?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a prototype that did use strings with imports.
e.g.

cli_command('my command', lambda: import_module('azure.mgmt.compute.operation', 'create_or_update'))

Issue with this is:

  • import syntax changes as you can't have the from X import Y.
  • IntelliSense/PyLint would not be able to run any checks or help as our imports become strings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or alternatively:
cli_command('my command', 'azure.mgmt.compute.operation', 'create_or_update')
That would be cleaner. I get your point about intellisense and... ... pylint. 😐 (although VSCode seems to go crazy with red highlights around imports anyways)

This is the one optimization I think we should brainstorm on because it changes the way people have to onboard commands and makes it more obtuse (the rest of the changes would not be exposed to command authors).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//cc: @johanste

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change this to use the string imports

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now using strings for imports
#Resolved

@tjprescott
Copy link
Member

Also, what are the speed gains with this approach?

@derekbekoe
Copy link
Member Author

Perf improvements once all modules have been converted:

  • Tab completion is faster (0.5 secs -> 0.2 secs)
  • The CLI loads faster overall (1.5 secs -> 0.4 secs)

The best thing about this change is that as we add more command modules, overall performance won't take as much of a hit as we do right now.

Main reason is because we defer the importing/loading of command handlers until when the command is actually going to be invoked. Also, params are loaded until needed and our params use the SDK heavily for choice lists, etc.

@derekbekoe
Copy link
Member Author

I'll update the command authoring docs also once this is finalized.

@tjprescott
Copy link
Member

The command authoring docs changes would help with the review as it is hard to fully understand what onboarding commands used to look like compared to what it would look like now. The new way seems to be a lot more obtuse in the way you add commands.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2nd the is redundant

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, this file should be named as __command_handlers.py or so. The factory usually means something which produces other types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//cc: @johanste

@derekbekoe
Copy link
Member Author

@tjprescott, @yugangw-msft, @johanste
I've made the suggested change to remove _operation_factory.py files.

Now, we use strings to define the handler for a command.
For example:

cli_command(__name__, 'vm show', 'azure.mgmt.compute.operations.virtual_machines_operations#VirtualMachinesOperations.get', cf_vm)

Please take a look.

Copy link
Member

@tjprescott tjprescott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more questions, but overall better I think.

- choice lists
- completers

**Create an __init__.py for your module**
Copy link
Member

@tjprescott tjprescott Oct 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be in the document for creating a new module rather than creating new commands? #Resolved

2. Register your command using the `cli_command` (or similar) function.
3. Write up your command's help entry.
4. Use the `register_cli_argument` function to add the following enhancements to your arguments, as needed:
1. Create an __init__.py file for your command module.
Copy link
Member

@tjprescott tjprescott Oct 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double underline is markup for bold, so __init__.py does not display correctly. #Resolved

- `module_name` - The name of the module that is registering the command (e.g. `azure.cli.command_modules.vm.commands`). Typically this will be `__name__`.
- `name` - String uniquely naming your command and placing it within the command hierachy. It will be the string that you would type at the command line, omitting `az` (ex: access your command at `az mypackage mycommand` using a name of `mypackage mycommand`).
- `operation` - Your function's name.
- `operation` - The handler that will be executed. Format is `<module_to_import>#<attribute_list>`
Copy link
Member

@tjprescott tjprescott Oct 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work the same with custom commands? I imagine many people might pass the method "the old way" so do we check that operation is a string and throw a useful error message if they pass a function? #Resolved

cli_command(__name__, 'vm create', 'azure.cli.command_modules.vm.mgmt_vm.lib.operations.vm_operations#VmOperations.create_or_update', cf_vm_create,
transform=DeploymentOutputLongRunningOperation('Starting vm create'))

cli_command(__name__, 'vm delete', 'azure.mgmt.compute.operations.virtual_machines_operations#VirtualMachinesOperations.delete', cf_vm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if we could clean these up, i.e. extract the common parts of the strings into some kind of helper method. Then again, it might just produce something that isn't noticeably shorter, just different...

Overall I like this much better. If the dev put a typo in the string, would it still error out when they tried to run a command? In other words, would they still get early warning of a problem, or would it only throw if they tried to run THAT specific command? If so, I would have greater concern about the potential of typos in these strings.

Copy link
Member Author

@derekbekoe derekbekoe Nov 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running az will show any errors if there's a mistake so we're good here.

@derekbekoe derekbekoe changed the title [DO NOT MERGE] Faster command module loading & tab completion performance Faster command module loading & tab completion performance Nov 1, 2016
@derekbekoe derekbekoe force-pushed the improve-perf-2 branch 10 times, most recently from e463a13 to 3c50c55 Compare November 8, 2016 22:08
@derekbekoe derekbekoe merged commit d7dec55 into Azure:master Nov 9, 2016
@derekbekoe derekbekoe deleted the improve-perf-2 branch November 9, 2016 22:28
00Kai0 pushed a commit to 00Kai0/azure-cli that referenced this pull request Apr 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants