Use rampMeter for Executor #5503
Merged
+25
−6
ashish-goswami
requested review from
manishrjain and
vvbalaji-dgraph
as code owners
May 22, 2020 13:55
manishrjain
approved these changes
May 22, 2020
Create a separate rampMeter class and use in both.
Reviewed 1 of 1 files at r1.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @vvbalaji-dgraph)
ashish-goswami
changed the title
[WIP] Use rampMeter for Executor
Use rampMeter for Executor
May 26, 2020
Thanks for the PR. Do we have data on memory usage in Ludicrous mode with and without this PR?
vvbalaji-dgraph
approved these changes
Jun 2, 2020
ashish-goswami
added a commit
that referenced
this pull request
Jun 10, 2020
Currently we have a rampMeter while applying proposals. This rampMeter throttles the memory used by the underlying mutations present in the proposals to be applied. However, this works correctly only in normal mode.

In ludicrous mode, we apply mutations for these proposals asynchronously. Hence the above rampMeter doesn't work as expected, which results in high memory usage when there are many proposals to apply.

This PR introduces a rampMeter for the executor as well. The executor is responsible for applying mutations in ludicrous mode.

Testing:
I tested this PR by running live loader on the 21M dataset in ludicrous mode. On this PR, live loader completes in ~5-6 minutes. However, when I changed maxPendingEdgesSize to 64KB (instead of the current 64MB) in this PR, live loader completed in ~10-11 minutes.

Benchmarking:
I benchmarked using data generated by the script below:

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

var (
	total int = 100000000 // approximate number of records to generate
	pred      = 1024      // number of distinct predicates
)

func main() {
	f, err := os.OpenFile("test.rdf", os.O_CREATE|os.O_RDWR, 0755)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	totalPerPred := total / pred
	buf := bytes.NewBuffer(nil)

	count := 1
	for i := 0; i < totalPerPred; i++ {
		for j := 0; j < pred; j++ {
			// One N-Quad per record, cycling through the predicates.
			rec := fmt.Sprintf(`_:record_%d <pred_%d> "value_%d" .`, count, j, count)
			buf.WriteString(rec)
			buf.WriteString("\n")
			count++
			// Flush the buffer to disk periodically to bound memory use.
			if count%100000 == 0 {
				if _, err := buf.WriteTo(f); err != nil {
					panic(err)
				}
				buf.Reset()
			}
		}
	}

	// Write whatever is left in the buffer.
	if _, err := buf.WriteTo(f); err != nil {
		panic(err)
	}
	if err := f.Sync(); err != nil {
		panic(err)
	}
	fmt.Println("Done writing to file: ", count)
}
```

The above script generates ~100M records, with ~97K records per predicate and a total of 1024 predicates.

Master:
Time to finish live loader: 8m44s
Alpha RAM (RES) usage: 10.9 GB

This PR with maxPendingEdgesSize = 64KB:
Time to finish live loader: 8m48s
Alpha RAM (RES) usage: 9.6 GB

This PR with maxPendingEdgesSize = 64MB:
Time to finish live loader: 8m32s
Alpha RAM (RES) usage: 10.5 GB

(cherry picked from commit 4735952)
ashish-goswami
added a commit
that referenced
this pull request
Jun 11, 2020
ashish-goswami
added a commit
that referenced
this pull request
Jun 17, 2020
dna2github
pushed a commit
to dna2fork/dgraph
that referenced
this pull request
Jul 18, 2020
Currently we have a rampMeter while applying proposals. This rampMeter throttles the memory used by the underlying mutations present in the proposals to be applied. However, this works correctly only in normal mode.
In ludicrous mode, we apply mutations for these proposals asynchronously. Hence the above rampMeter doesn't work as expected, which results in high memory usage when there are many proposals to apply.
This PR introduces a rampMeter for the executor as well. The executor is responsible for applying mutations in ludicrous mode.
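To illustrate the idea of throttling on pending memory, here is a minimal sketch. It is not the actual Dgraph rampMeter or executor code: the `pendingLimiter` type, its methods, and the 1 KB mutation size are hypothetical names and values chosen for illustration, and it assumes a positive limit. A producer blocks while the bytes queued for the asynchronous worker are at or above the limit, and the worker releases bytes once a mutation has been applied.

```go
package main

import (
	"fmt"
	"sync"
)

// pendingLimiter is a hypothetical stand-in for a rampMeter-style throttle.
// It tracks bytes handed to an asynchronous worker but not yet applied.
type pendingLimiter struct {
	mu      sync.Mutex
	cond    *sync.Cond
	pending int64
	max     int64 // assumed to be > 0
}

func newPendingLimiter(max int64) *pendingLimiter {
	l := &pendingLimiter{max: max}
	l.cond = sync.NewCond(&l.mu)
	return l
}

// acquire blocks while pending bytes are at or above the limit, then records
// size as pending. A single item larger than the limit is still admitted once
// pending drops below the limit, so an oversized mutation cannot deadlock.
func (l *pendingLimiter) acquire(size int64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for l.pending >= l.max {
		l.cond.Wait()
	}
	l.pending += size
}

// release marks size bytes as applied and wakes any blocked producers.
func (l *pendingLimiter) release(size int64) {
	l.mu.Lock()
	l.pending -= size
	l.mu.Unlock()
	l.cond.Broadcast()
}

// inFlight returns the number of bytes currently pending.
func (l *pendingLimiter) inFlight() int64 {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.pending
}

func main() {
	const maxPending = 64 << 10 // 64 KB, one of the limits tried in this PR
	lim := newPendingLimiter(maxPending)
	work := make(chan int64, 1024)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // asynchronous worker, standing in for the executor
		defer wg.Done()
		for size := range work {
			// A real worker would apply the mutation here.
			lim.release(size)
		}
	}()

	for i := 0; i < 1000; i++ {
		size := int64(1 << 10) // pretend each mutation is 1 KB
		lim.acquire(size)      // blocks if too much is already pending
		work <- size
	}
	close(work)
	wg.Wait()
	fmt.Println("pending after drain:", lim.inFlight())
}
```

A smaller limit makes producers wait more often, which is consistent with the slower 64KB run reported under Testing, though the real trade-off depends on the workload.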
Testing:
I tested this PR by running live loader on the 21M dataset in ludicrous mode. On this PR, live loader completes in ~5-6 minutes. However, when I changed maxPendingEdgesSize to 64KB (instead of the current 64MB) in this PR, live loader completed in ~10-11 minutes.
Benchmarking:
I benchmarked using data generated by the script included in the commit message above.
The script generates ~100M records, with ~97K records per predicate and a total of 1024 predicates.
Master:
Time to finish live loader: 8m44s
Alpha RAM(RES) usage: 10.9 GB
This PR with maxPendingEdgesSize = 64KB:
Time to finish live loader: 8m48s
Alpha RAM(RES) usage: 9.6 GB
This PR with maxPendingEdgesSize = 64MB:
Time to finish live loader: 8m32s
Alpha RAM(RES) usage: 10.5 GB
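For a quick comparison, the figures above can be expressed as relative changes against master. The snippet below is only a convenience calculation over the reported numbers; the `run` type and helper names are made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// run holds one benchmark result as reported above.
type run struct {
	name     string
	duration string  // live loader completion time
	resGB    float64 // Alpha RES memory in GB
}

// pctChange returns the percentage change of v relative to base.
func pctChange(base, v float64) float64 {
	return (v - base) / base * 100
}

// seconds parses a duration string such as "8m44s" into seconds.
func seconds(d string) float64 {
	parsed, err := time.ParseDuration(d)
	if err != nil {
		panic(err)
	}
	return parsed.Seconds()
}

func main() {
	master := run{"master", "8m44s", 10.9}
	runs := []run{
		{"this PR, maxPendingEdgesSize = 64KB", "8m48s", 9.6},
		{"this PR, maxPendingEdgesSize = 64MB", "8m32s", 10.5},
	}
	for _, r := range runs {
		fmt.Printf("%s: time %+.1f%%, RES %+.1f%%\n", r.name,
			pctChange(seconds(master.duration), seconds(r.duration)),
			pctChange(master.resGB, r.resGB))
	}
}
```

This works out to roughly 12% less RES memory at 64KB for under 1% more time, and roughly 4% less memory at 64MB. Each configuration was run once, so small differences may be within run-to-run noise.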