Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve module specifiers for auto imports in completion list (in incomplete chunks) #44713

Merged

Conversation

andrewbranch
Copy link
Member

@andrewbranch andrewbranch commented Jun 23, 2021

Fixes #42005

Summary

  • Before May 2020: auto imports showed no secondary label in completion lists. They had a different icon, but you had to focus them to really see that they were auto imports at all, much less where they were going to come from.
  • May 2020–present: auto imports showed a workspace-relative path to a module that exports the symbol. If the symbol was re-exported from multiple places, the path shown may not be the path actually imported from. Also, path mappings, node_modules paths, index file paths, etc. were not taken into account—it was literally just a relative path, not a module specifier.
  • This PR: it's too slow to resolve module specifiers unconditionally for every auto import in an arbitrarily long completion list, so when you first trigger completions, we compute up to 100 module specifiers, and try the module specifier cache up to an additional 1000 times, and fill them into the completion list. Entries beyond what we can get module specifiers for will continue to show up with the old module path (though I’m open to tweaking their display in a way that doesn’t impact perf in another PR). Then, as you continue to type characters, if any entries were left with unresolved module specifiers VS Code will re-request completions. We pull the previous response from a cache and fill in another 100/1000 module specifiers, but only for entries that still match the identifier you’re typing—a large chunk will get filtered out with every request. These continuation requests are very fast—58 ms max in my tests—and they usually fully complete the list within one or two requests.

Screen capture 🎥

You can see a smattering of fully resolved module specifiers in the initial list here, but some, like changeDepenciesStateTo0 still say node_modules. But once I’ve type enough characters to find it without scrolling, its label has updated to be its full module specifier.

Kapture.2021-06-24.at.10.41.26.mp4

It occurred to me while making this recording that it would probably be better to try resolving module specifiers in alphabetical order (instead of module traversal order), so all the resolved ones would be more toward the top of the list. However, this may not matter as much as it feels like it would since VS Code by default snaps you to names you recently completed elsewhere. So in the real world, you might be snapped to an essentially random spot in the list instead of near the top, depending on your settings. But I’ll think about whether the order can be smarter in some way as a future optimization.

Significant related changes

  • Ambient modules don’t count against the resolution limit—they always get fully resolved in the first go now, because they’re very cheap to resolve and we notice that they’re ambient while putting them in the export map cache. This applies even to older clients who otherwise can’t use this feature because they don’t signal that they can support incomplete responses/retriggerings.

  • As part of the way we can keep the performance cost of this down, the fuzzy matching logic for auto imports has changed from “the symbol name contains the typed characters, in order, anywhere in it” to a slightly more restrictive version, akin to what was used for import statement completions in Import statement completions #43149. I can’t do better to explain it than I did in the doc comment:

    True if the first character of lowercaseCharacters is the first character of some "word" in identiferString (where the string is split into "words" by camelCase and snake_case segments), then if the remaining characters of lowercaseCharacters appear, in order, in the rest of identifierString.

    True:
    'state' in 'useState'
    'sae' in 'useState'
    'viable' in 'ENVIRONMENT_VARIABLE'

    False:
    'staet' in 'useState'
    'tate' in 'useState'
    'ment' in 'ENVIRONMENT_VARIABLE'

Performance measurements

In a test project containing lots of expensive dependencies (especially aws-sdk):

type 'c' type 'o' delete word type 'c' type 'r'
4.3.2 405 ms - - 136 ms -
PR 415 ms 58 ms - 220 ms 20 ms

I’m not thrilled with 136 → 220 for the cached scenario, but that number is only so low in the first place because of cache improvements that have been made over the last year or two, and it’s still low in absolute terms for a completion list that large. I plan to continue optimizations here as needed, but I’m ok with these numbers considering the value added.

@typescript-bot
Copy link
Collaborator

Thanks for the PR! It looks like you've changed the TSServer protocol in some way. Please ensure that any changes here don't break consumers of the current TSServer API. For some extra review, we'll ping @sheetalkamat, @amcasey, @mjbvz, @minestarks for you. Feel free to loop in other consumers/maintainers if necessary

@typescript-bot typescript-bot added Author: Team For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Jun 23, 2021
@typescript-bot typescript-bot added For Milestone Bug PRs that fix a bug with a specific milestone and removed For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Jun 24, 2021
@andrewbranch andrewbranch marked this pull request as ready for review June 25, 2021 18:02
}
return wrapped;

function replaceTransientSymbols(info: SymbolExportInfo, checker: TypeChecker) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transient symbols should never have been in this cache, but this isn’t new, it’s just moved and renamed. I’m working on a follow-up to restructure this cache to be safer and less kludgy.

@andrewbranch andrewbranch requested a review from sandersn June 25, 2021 20:25
Copy link
Member

@sandersn sandersn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good based on my understanding -- thanks for walking through the code with me.

let computedWithoutCacheCount = 0;
const fixes = flatMap(moduleSymbols, exportInfo => {
const { computedWithoutCache, moduleSpecifiers } = getModuleSpecifiers(exportInfo.moduleSymbol);
computedWithoutCacheCount += Number(computedWithoutCache);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Boolean -> Number is a liiitle too clever for my taste (but might be great in practice I don't know)

if (strChar === testChar || strChar === toUpperCharCode(testChar)) {
matchedFirstCharacter ||=
prevChar === undefined || // Beginning of word
CharacterCodes.a <= prevChar && prevChar <= CharacterCodes.z && CharacterCodes.A <= strChar && strChar <= CharacterCodes.Z || // camelCase transition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

future work: maybe use general unicode categories to detect lower -> uppercase transition

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems important, e.g. for languages with accents (I assume languages without lettercase use snake case?).

@andrewbranch andrewbranch merged commit 328e888 into microsoft:main Jun 25, 2021
@andrewbranch andrewbranch deleted the feature/auto-import-module-specifiers branch June 25, 2021 22:27
Copy link
Member

@amcasey amcasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The protocol change LGTM and the explanation makes sense.

*
* True:
* 'state' in 'useState'
* 'sae' in 'useState'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could have been "sate". 😛

if (strChar === testChar || strChar === toUpperCharCode(testChar)) {
matchedFirstCharacter ||=
prevChar === undefined || // Beginning of word
CharacterCodes.a <= prevChar && prevChar <= CharacterCodes.z && CharacterCodes.A <= strChar && strChar <= CharacterCodes.Z || // camelCase transition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems important, e.g. for languages with accents (I assume languages without lettercase use snake case?).

return false;
}

function toUpperCharCode(charCode: number) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we usually use toLower, though I don't know whether that was to improve the behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Author: Team For Milestone Bug PRs that fix a bug with a specific milestone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve "suggestions show path" for "node_modules"
4 participants