Skip to content

Commit 21d2256

Browse files
committed
gopls/internal/lsp/cache: make type-checking incremental
In this CL type-checked packages are made entirely independent of each other, and package export data and indexes are stored in a file cache. As a result, gopls uses significantly less memory, and (with a warm cache) starts significantly faster. Other benchmarks have regressed slightly due to the additional I/O and export data loading, but not significantly so, and we have some ideas for how to further narrow or even close the performance gap. In the benchmarks below, based on the x/tools repository, we can see that in-use memory was reduced by 88%, and startup time with a warm cache by 65% (this is the best case where nothing has changed). Other benchmarks regressed by 10-50%, much of which can be addressed by improvements to the objectpath package (golang/go#51017), and by making package data serialization asynchronous to type-checking. Notably, we observe larger regressions in implementations, references, and rename because the index implementations (by Alan Donovan) preceded this change to type-checking, and so these benchmark statistics compare in-memory index performance to on-disk index performance. Again, we can optimize these if necessary by keeping certain index information in memory, or by decoding more selectively. name old in_use_bytes new in_use_bytes delta InitialWorkspaceLoad/tools-12 432M ± 2% 50M ± 2% -88.54% (p=0.000 n=10+10) name old time/op new time/op delta StructCompletion/tools-12 27.2ms ± 5% 31.8ms ± 9% +16.99% (p=0.000 n=9+9) ImportCompletion/tools-12 2.07ms ± 8% 2.21ms ± 6% +6.64% (p=0.004 n=9+9) SliceCompletion/tools-12 29.0ms ± 5% 32.7ms ± 5% +12.78% (p=0.000 n=10+9) FuncDeepCompletion/tools-12 39.6ms ± 6% 39.3ms ± 3% ~ (p=0.853 n=10+10) CompletionFollowingEdit/tools-12 72.7ms ± 7% 108.1ms ± 7% +48.59% (p=0.000 n=9+9) Definition/tools-12 525µs ± 6% 601µs ± 2% +14.33% (p=0.000 n=9+10) DidChange/tools-12 6.17ms ± 7% 6.77ms ± 2% +9.64% (p=0.000 n=10+10) Hover/tools-12 2.11ms ± 5% 2.61ms ± 3% +23.87% (p=0.000 n=10+10) Implementations/tools-12 4.04ms ± 3% 60.19ms ± 3% +1389.77% (p=0.000 n=9+10) InitialWorkspaceLoad/tools-12 3.84s ± 4% 1.33s ± 2% -65.47% (p=0.000 n=10+9) References/tools-12 9.72ms ± 6% 24.28ms ± 6% +149.83% (p=0.000 n=10+10) Rename/tools-12 121ms ± 8% 168ms ±12% +38.92% (p=0.000 n=10+10) WorkspaceSymbols/tools-12 14.4ms ± 6% 15.6ms ± 3% +8.76% (p=0.000 n=9+10) This CL is one step closer to the end* of a long journey to reduce memory usage and statefulness in gopls, so that it can be more performant and reliable. Specifically, this CL implements a new type-checking pass that loads and stores export data, cross references, serialized diagnostics, and method set indexes in the file system. Concurrent type-checking passes may share in-progress work, but after type-checking only active packages are kept in memory. Consequently, there can be no global relationship between type-checked packages. The work to break any dependence on global relationships was done over a long time leading up to this CL. In order to approach the previous type-checking performance, the following new optimizations are made: - the global FileSet is completely removed: repeatedly importing from export data resulted in a tremendous amount of unnecessary token.File information, and so FileSets had to be scoped to packages - files are parsed as a batch and stored in the LRU cache implemented in the preceding CL - type-checking is also turned into a batch process, so that overlapping nodes in the package graph may be shared during large type-checking operations such as the initial workspace load This new execution model enables several simplifications: - We no longer need to trim the AST before type-checking: TypeCheckMode and ParseExported are gone. - We no longer need to do careful bookkeeping around parsed files: all parsing uses the LRU parse cache. - It is no longer necessary to estimate cache heap usage in debug information. There is still much more to do. This new model for gopls's execution requires significant testing and experimentation. There may be new bugs in the complicated new algorithms that enable this change, or bugs related to the new reliance on export data (this may be the first time export data for packages with type errors is significantly exercised). There may be new environments where the new execution model does not have the same beneficial effect. (On the other hand, there may be some where it has an even more beneficial effect, such as resource limited environments like dev containers.) There are also a lot of new opportunities for optimization now that we are no longer tied to a rigid structure of in-memory data. *Furthermore, the following planned work is simply not done yet: - Implement precise pruning based on "deep" hash of imports. - Rewrite unimported completion, now that we no longer have cached import paths. For golang/go#57987 Change-Id: Iedfc16656f79e314be448b892b710b9e63f72551 Reviewed-on: https://go-review.googlesource.com/c/tools/+/466975 Run-TryBot: Robert Findley <[email protected]> TryBot-Result: Gopher Robot <[email protected]> gopls-CI: kokoro <[email protected]> Reviewed-by: Alan Donovan <[email protected]>
1 parent ae05609 commit 21d2256

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

57 files changed

+1218
-1652
lines changed

gopls/internal/lsp/cache/analysis.go

+26-7
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ import (
1111
"context"
1212
"crypto/sha256"
1313
"encoding/gob"
14+
"encoding/json"
1415
"errors"
1516
"fmt"
1617
"go/ast"
@@ -1155,14 +1156,18 @@ func mustDecode(data []byte, ptr interface{}) {
11551156
}
11561157
}
11571158

1158-
// -- data types for serialization of analysis.Diagnostic --
1159+
// -- data types for serialization of analysis.Diagnostic and source.Diagnostic --
11591160

11601161
type gobDiagnostic struct {
11611162
Location protocol.Location
1162-
Category string
1163+
Severity protocol.DiagnosticSeverity
1164+
Code string
1165+
CodeHref string
1166+
Source string
11631167
Message string
11641168
SuggestedFixes []gobSuggestedFix
11651169
Related []gobRelatedInformation
1170+
Tags []protocol.DiagnosticTag
11661171
}
11671172

11681173
type gobRelatedInformation struct {
@@ -1171,8 +1176,16 @@ type gobRelatedInformation struct {
11711176
}
11721177

11731178
type gobSuggestedFix struct {
1174-
Message string
1175-
TextEdits []gobTextEdit
1179+
Message string
1180+
TextEdits []gobTextEdit
1181+
Command *gobCommand
1182+
ActionKind protocol.CodeActionKind
1183+
}
1184+
1185+
type gobCommand struct {
1186+
Title string
1187+
Command string
1188+
Arguments []json.RawMessage
11761189
}
11771190

11781191
type gobTextEdit struct {
@@ -1218,11 +1231,17 @@ func toGobDiagnostic(posToLocation func(start, end token.Pos) (protocol.Location
12181231
if err != nil {
12191232
return gobDiagnostic{}, err
12201233
}
1234+
12211235
return gobDiagnostic{
1222-
Location: loc,
1223-
Category: diag.Category,
1236+
Location: loc,
1237+
// Severity for analysis diagnostics is dynamic, based on user
1238+
// configuration per analyzer.
1239+
// Code and CodeHref are unset for Analysis diagnostics,
1240+
// TODO(rfindley): set Code fields if/when golang/go#57906 is accepted.
1241+
Source: diag.Category,
12241242
Message: diag.Message,
1225-
Related: related,
12261243
SuggestedFixes: fixes,
1244+
Related: related,
1245+
// Analysis diagnostics do not contain tags.
12271246
}, nil
12281247
}

gopls/internal/lsp/cache/cache.go

+2-111
Original file line numberDiff line numberDiff line change
@@ -6,13 +6,7 @@ package cache
66

77
import (
88
"context"
9-
"fmt"
10-
"go/ast"
11-
"go/token"
12-
"go/types"
13-
"html/template"
149
"reflect"
15-
"sort"
1610
"strconv"
1711
"sync/atomic"
1812

@@ -29,22 +23,15 @@ import (
2923
// Both the fset and store may be nil, but if store is non-nil so must be fset
3024
// (and they must always be used together), otherwise it may be possible to get
3125
// cached data referencing token.Pos values not mapped by the FileSet.
32-
func New(fset *token.FileSet, store *memoize.Store) *Cache {
26+
func New(store *memoize.Store) *Cache {
3327
index := atomic.AddInt64(&cacheIndex, 1)
3428

35-
if store != nil && fset == nil {
36-
panic("non-nil store with nil fset")
37-
}
38-
if fset == nil {
39-
fset = token.NewFileSet()
40-
}
4129
if store == nil {
4230
store = &memoize.Store{}
4331
}
4432

4533
c := &Cache{
4634
id: strconv.FormatInt(index, 10),
47-
fset: fset,
4835
store: store,
4936
memoizedFS: &memoizedFS{filesByID: map[robustio.FileID][]*DiskFile{}},
5037
}
@@ -56,8 +43,7 @@ func New(fset *token.FileSet, store *memoize.Store) *Cache {
5643
// TODO(rfindley): once fset and store need not be bundled together, the Cache
5744
// type can be eliminated.
5845
type Cache struct {
59-
id string
60-
fset *token.FileSet
46+
id string
6147

6248
store *memoize.Store
6349

@@ -90,98 +76,3 @@ var cacheIndex, sessionIndex, viewIndex int64
9076

9177
func (c *Cache) ID() string { return c.id }
9278
func (c *Cache) MemStats() map[reflect.Type]int { return c.store.Stats() }
93-
94-
type packageStat struct {
95-
id PackageID
96-
mode source.ParseMode
97-
file int64
98-
ast int64
99-
types int64
100-
typesInfo int64
101-
total int64
102-
}
103-
104-
func (c *Cache) PackageStats(withNames bool) template.HTML {
105-
var packageStats []packageStat
106-
c.store.DebugOnlyIterate(func(k, v interface{}) {
107-
switch k.(type) {
108-
case packageHandleKey:
109-
v := v.(typeCheckResult)
110-
if v.pkg == nil {
111-
break
112-
}
113-
typsCost := typesCost(v.pkg.types.Scope())
114-
typInfoCost := typesInfoCost(v.pkg.typesInfo)
115-
stat := packageStat{
116-
id: v.pkg.id,
117-
mode: v.pkg.mode,
118-
types: typsCost,
119-
typesInfo: typInfoCost,
120-
}
121-
for _, f := range v.pkg.compiledGoFiles {
122-
stat.file += int64(len(f.Src))
123-
stat.ast += astCost(f.File)
124-
}
125-
stat.total = stat.file + stat.ast + stat.types + stat.typesInfo
126-
packageStats = append(packageStats, stat)
127-
}
128-
})
129-
var totalCost int64
130-
for _, stat := range packageStats {
131-
totalCost += stat.total
132-
}
133-
sort.Slice(packageStats, func(i, j int) bool {
134-
return packageStats[i].total > packageStats[j].total
135-
})
136-
html := "<table><thead><td>Name</td><td>total = file + ast + types + types info</td></thead>\n"
137-
human := func(n int64) string {
138-
return fmt.Sprintf("%.2f", float64(n)/(1024*1024))
139-
}
140-
var printedCost int64
141-
for _, stat := range packageStats {
142-
name := stat.id
143-
if !withNames {
144-
name = "-"
145-
}
146-
html += fmt.Sprintf("<tr><td>%v (%v)</td><td>%v = %v + %v + %v + %v</td></tr>\n", name, stat.mode,
147-
human(stat.total), human(stat.file), human(stat.ast), human(stat.types), human(stat.typesInfo))
148-
printedCost += stat.total
149-
if float64(printedCost) > float64(totalCost)*.9 {
150-
break
151-
}
152-
}
153-
html += "</table>\n"
154-
return template.HTML(html)
155-
}
156-
157-
func astCost(f *ast.File) int64 {
158-
if f == nil {
159-
return 0
160-
}
161-
var count int64
162-
ast.Inspect(f, func(_ ast.Node) bool {
163-
count += 32 // nodes are pretty small.
164-
return true
165-
})
166-
return count
167-
}
168-
169-
func typesCost(scope *types.Scope) int64 {
170-
cost := 64 + int64(scope.Len())*128 // types.object looks pretty big
171-
for i := 0; i < scope.NumChildren(); i++ {
172-
cost += typesCost(scope.Child(i))
173-
}
174-
return cost
175-
}
176-
177-
func typesInfoCost(info *types.Info) int64 {
178-
// Most of these refer to existing objects, with the exception of InitOrder, Selections, and Types.
179-
cost := 24*len(info.Defs) +
180-
32*len(info.Implicits) +
181-
256*len(info.InitOrder) + // these are big, but there aren't many of them.
182-
32*len(info.Scopes) +
183-
128*len(info.Selections) + // wild guess
184-
128*len(info.Types) + // wild guess
185-
32*len(info.Uses)
186-
return int64(cost)
187-
}

0 commit comments

Comments
 (0)