-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add --cache_mb and --cache_percentage flag #6271
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got some comments.
Reviewable status: 0 of 11 files reviewed, 12 unresolved discussions (waiting on @manishrjain, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
go.mod, line 9 at r1 (raw file):
contrib.go.opencensus.io/exporter/prometheus v0.1.0 github.com/99designs/gqlgen v0.12.2 github.com/AndreasBriese/bbloom v0.0.0-20190825152654-46b345b51c96 // indirect
Run go mod tidy
and it should remove this.
dgraph/cmd/alpha/run.go, line 209 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph")
to be used in alpha
dgraph/cmd/alpha/run.go, line 210 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph") flag.String("cache-percentage", "30:30:20:20", "Cache percentages for various caches (pblocksize:pindexsize:windexsize:postingListsize)")
100 characters.
cache-percentage => cache_percentage.
Add a detailed explanation about the ratios. pblocksize
would make no sense to the user.
dgraph/cmd/alpha/run.go, line 592 at r1 (raw file):
x.AssertTruef(totalCache >= 0, "ERROR: Cache size must be non-negative") if len(cp) != 4 { log.Fatalf("ERROR: cache percentage format is pblocksize:pindexsize:windexsize:postingListsize")
Here as well. Instead of calling it pblockSize
let's call it PstoreBlockCache
and PstoreIndexCache
, etc. We want user to be able to understand what these flags actually mean.
dgraph/cmd/alpha/run.go, line 610 at r1 (raw file):
if percentSum != 100 { log.Fatalf("ERROR: cache percentage does not sum up to 100")
Add the percentages here. w+x+y+z is not equal to 100.
dgraph/cmd/alpha/run.go, line 615 at r1 (raw file):
pIndexSize := (cachePercent[1] * (totalCache << 20)) / 100 wIndexSize := (cachePercent[2] * (totalCache << 20)) / 100 postingListSize := (cachePercent[3] * (totalCache << 20)) / 100
All these are cache sizes. Let's call them cache sizes.
dgraph/cmd/debug/run.go, line 787 at r1 (raw file):
} x.Check(err) posting.Init(db, 0)
Add a comment // Not using posting list cache
dgraph/cmd/zero/run.go, line 112 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph")
to be used in zero
dgraph/cmd/zero/run.go, line 113 at r1 (raw file):
// Add cache flags flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in dgraph") flag.String("cache-percentage", "75:25", "Cache percentages for various caches (blockcache:indexCache)")
cache_percentage
Zero doesn't have an index cache right now. We should set this to 100:0
right now. We don't need caches always.
dgraph/cmd/zero/run.go, line 230 at r1 (raw file):
} cp := strings.Split(opts.cachePercentage, ":")
You could move the code below to x.go
file (or if we have a util file in x
). The code is duplicated.
dgraph/cmd/zero/run.go, line 242 at r1 (raw file):
x, err := strconv.Atoi(percent) if err != nil { log.Fatalf("ERROR: ERROR: unable to parse cache percentages")
Print the percent
here.
dgraph/cmd/zero/run.go, line 245 at r1 (raw file):
} if x < 0 { log.Fatalf("ERROR: cache percentage cannot be negative")
Print the percent
here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got a few comments. Get @jarifibrahim for final look.
Reviewed 2 of 11 files at r1, 10 of 10 files at r2.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @jarifibrahim, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
dgraph/cmd/alpha/run.go, line 210 at r2 (raw file):
flag.Int64("cache_mb", 0, "Total size of cache (in MB) to be used in alpha") flag.String("cache_percentage", "30:30:20:20", `Cache percentages for various caches (PstoreBlockCache:PstoreIndexCache:PstoreIndexCache:PostingListCache)`)
Order it in the biggest to smallest expected caches.
posting list cache : p store block cache : p store index cache : w store block cache : w store index cache
Use commas.
0,65,25,0,10
dgraph/cmd/zero/run.go, line 113 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
cache_percentage
Zero doesn't have an index cache right now. We should set this to
100:0
right now. We don't need caches always.
100 chars. Use commas.
x/x.go, line 1106 at r2 (raw file):
// GetCachePercentages returns the slice of cache percentages given the ":" (colon) separated // cache percentages(integers) string and format of cpString. func GetCachePercentages(cpString string, format string) []int64 {
numExpected int
don't need format.
x/x.go, line 1107 at r2 (raw file):
// cache percentages(integers) string and format of cpString. func GetCachePercentages(cpString string, format string) []int64 { cp := strings.Split(cpString, ":")
comma.
x/x.go, line 1109 at r2 (raw file):
cp := strings.Split(cpString, ":") // Sanity checks if len(cp) != len(strings.Split(format, ":")) {
numExpected int
x/x.go, line 1118 at r2 (raw file):
x, err := strconv.Atoi(percent) if err != nil { log.Fatalf("ERROR: ERROR: unable to parse cache percentage(%s)", percent)
don't need to do fatals here. Return the error and let the caller deal with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 12 files reviewed, 17 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, and @vvbalaji-dgraph)
go.mod, line 9 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Run
go mod tidy
and it should remove this.
Done.
dgraph/cmd/alpha/run.go, line 209 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
to be used in alpha
Done.
dgraph/cmd/alpha/run.go, line 210 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
100 characters.
cache-percentage => cache_percentage.
Add a detailed explanation about the ratios.
pblocksize
would make no sense to the user.
Done.
dgraph/cmd/alpha/run.go, line 592 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Here as well. Instead of calling it
pblockSize
let's call itPstoreBlockCache
andPstoreIndexCache
, etc. We want user to be able to understand what these flags actually mean.
Done.
dgraph/cmd/alpha/run.go, line 610 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Add the percentages here. w+x+y+z is not equal to 100.
Done.
dgraph/cmd/alpha/run.go, line 615 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
All these are cache sizes. Let's call them cache sizes.
Done.
dgraph/cmd/alpha/run.go, line 210 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Order it in the biggest to smallest expected caches.
posting list cache : p store block cache : p store index cache : w store block cache : w store index cache
Use commas.
0,65,25,0,10
Done.
dgraph/cmd/debug/run.go, line 787 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Add a comment
// Not using posting list cache
Done.
dgraph/cmd/zero/run.go, line 112 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
to be used in zero
Done.
dgraph/cmd/zero/run.go, line 113 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
100 chars. Use commas.
Done.
dgraph/cmd/zero/run.go, line 230 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
You could move the code below to
x.go
file (or if we have a util file inx
). The code is duplicated.
Done.
dgraph/cmd/zero/run.go, line 242 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Print the
percent
here.
Done.
dgraph/cmd/zero/run.go, line 245 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Print the
percent
here as well.
Done.
x/x.go, line 1106 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
numExpected int
don't need format.
Done.
x/x.go, line 1107 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
comma.
Done.
x/x.go, line 1109 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
numExpected int
Done.
x/x.go, line 1118 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
don't need to do fatals here. Return the error and let the caller deal with this.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 12 files reviewed, 10 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
dgraph/cmd/bulk/reduce.go, line 133 at r3 (raw file):
opt := badger.DefaultOptions(dir).WithSyncWrites(false). WithTableLoadingMode(bo.MemoryMap).WithValueThreshold(1 << 10 /* 1 KB */). WithLogger(nil).WithBlockCacheSize(1 << 20).
This isn't related to this PR but this cache is too small.
worker/config.go, line 68 at r3 (raw file):
// PIndexCacheSize is the size of index cache for pstore PIndexCacheSize int64 // PBlockCacheSize is the size of block cache for wstore
wblockcachesize
worker/config.go, line 70 at r3 (raw file):
// PBlockCacheSize is the size of block cache for wstore WBlockCacheSize int64 // PIndexCacheSize is the size of index cache for wstore
windexcachesize
x/x.go, line 1109 at r3 (raw file):
// Sanity checks if len(cp) != numExpected { return nil, errors.Errorf("ERROR: expected %d cache percentages, got %d", numExpected, len(cp))
100 chars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 12 files reviewed, 10 unresolved discussions (waiting on @jarifibrahim, @manishrjain, @martinmr, @NamanJain8, and @vvbalaji-dgraph)
worker/config.go, line 68 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
wblockcachesize
Done.
worker/config.go, line 70 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
windexcachesize
Done.
x/x.go, line 1109 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
100 chars
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments. also address comments we discussed on zoom.
LGTM otherwise.
@@ -130,7 +130,7 @@ func (r *reducer) createBadgerInternal(dir string, compression bool) *badger.DB | |||
|
|||
opt := badger.DefaultOptions(dir).WithSyncWrites(false). | |||
WithTableLoadingMode(bo.MemoryMap).WithValueThreshold(1 << 10 /* 1 KB */). | |||
WithLogger(nil).WithMaxCacheSize(1 << 20). | |||
WithLogger(nil).WithBlockCacheSize(1 << 20). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we use the flag configured instead of 1 << 20 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should. That would require adding a flag in dgraph bulk
. @jarifibrahim What do you say?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bulk loader shouldn't even have a cache. I'll create a ticket for this. This Pr doesn't need to block on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that those tmpDBs are being read. We need a cache here but it has to be configurable. I have created DGRAPH-2352
This PR closes DGRAPH-2344.
Badger now has 2 separate caches blockCache and indexCache (see hypermodeinc/badger#1476).
This PR adds --cache_mb and --cache_percentage flags for alpha and zero. The
total cache is split among various caches used by zero and alpha based on
percentages defined. Cache size is in MBs and format of caches is as follows:
For alpha:
PostingListCache,PstoreBlockCache,PstoreIndexCache,WstoreBlockCache,WstoreIndexCache
For zero:
blockCache,indexCache
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)