Use rampMeter for Executor #5503

ashish-goswami · 2020-05-22T13:55:36Z

Currently we use rampMeter while applying proposals. It throttles the memory used by the
mutations contained in the proposals waiting to be applied. However, this works correctly
only in normal mode.
In ludicrous mode, the mutations for these proposals are applied asynchronously. As a result,
the existing rampMeter does not work as expected, and memory usage grows large when there
are many proposals to apply.
This PR introduces a rampMeter for the executor as well. The executor is responsible for
applying mutations in ludicrous mode.
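
To make the throttling idea concrete, here is a minimal Go sketch of a byte-budget throttle placed in front of an asynchronous executor. It is not the code from this PR: the `throttle` type, the `simulate` function, the sleep intervals, and the limits passed in `main` are hypothetical and chosen only for illustration. Only the name rampMeter and the idea of a pending-size limit (maxPendingEdgesSize) come from the description above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// throttle tracks the bytes of mutations that are queued but not yet applied.
// Hypothetical type for illustration; not Dgraph's actual data structure.
type throttle struct {
	pending int64 // bytes currently queued, accessed atomically
	limit   int64 // budget, playing the role of maxPendingEdgesSize
}

// rampMeter blocks the caller until the pending bytes are within the limit.
func (t *throttle) rampMeter() {
	for atomic.LoadInt64(&t.pending) > t.limit {
		time.Sleep(time.Millisecond)
	}
}

// simulate pushes n batches of the given size through an asynchronous
// executor. It returns the largest pending size the producer observed and
// the pending size after everything has been applied.
func simulate(limit, size int64, n int) (maxSeen, final int64) {
	t := &throttle{limit: limit}
	work := make(chan int64, n)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for sz := range work {
			time.Sleep(100 * time.Microsecond) // pretend to apply the mutation
			atomic.AddInt64(&t.pending, -sz)   // release the budget once applied
		}
	}()

	for i := 0; i < n; i++ {
		t.rampMeter() // wait for room before queueing more work
		if p := atomic.AddInt64(&t.pending, size); p > maxSeen {
			maxSeen = p
		}
		work <- size
	}
	close(work)
	wg.Wait()
	return maxSeen, atomic.LoadInt64(&t.pending)
}

func main() {
	maxSeen, final := simulate(1024, 256, 200)
	fmt.Println("max pending bytes:", maxSeen, "final pending bytes:", final)
}
```

With a single producer, the pending size can exceed the limit by at most one batch, because the check happens before the batch is added. A smaller limit makes the producer wait more often, which matches the trade-off between memory and load time reported in the testing notes.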

Testing:
I tested this PR by running live loader on a 21M dataset in ludicrous mode. With this PR, live loader completes in ~5-6 minutes. However, when I changed maxPendingEdgesSize to 64KB (instead of the current 64MB) in this PR, live loader took ~10-11 minutes.

Benchmarking:
I benchmarked using data generated by the script below:

package main

import (
  "bytes"
  "fmt"
  "os"
)

var (
  total int = 100000000 // target number of records
  pred      = 1024      // number of distinct predicates
)

func main() {
  f, err := os.OpenFile("test.rdf", os.O_CREATE|os.O_RDWR, 0755)
  if err != nil {
    panic(err)
  }
  defer f.Close()

  totalPerPred := total / pred

  buf := bytes.NewBuffer(nil)
  count := 1
  for i := 0; i < totalPerPred; i++ {
    for j := 0; j < pred; j++ {
      rec := fmt.Sprintf(`_:record_%d <pred_%d> "value_%d" .`, count, j, count)
      buf.WriteString(rec)
      buf.WriteString("\n")
      count++
      // Flush the buffer to disk every 100000 records.
      if count%100000 == 0 {
        if _, err := buf.WriteTo(f); err != nil {
          panic(err)
        }
        buf.Reset()
      }
    }
  }

  // Flush any records still in the buffer.
  if _, err := buf.WriteTo(f); err != nil {
    panic(err)
  }
  if err := f.Sync(); err != nil {
    panic(err)
  }
  fmt.Println("Done writing to file: ", count)
}

The script above generates ~100M records across 1024 predicates, with ~97K records per predicate.
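
The record counts follow from the integer division in the script (`total / pred`). As a quick check, the sketch below repeats that arithmetic; `datasetShape` is a hypothetical helper written for this note and is not part of the PR.

```go
package main

import "fmt"

// datasetShape mirrors the arithmetic of the generator script: records per
// predicate come from integer division, so the number of records actually
// written is slightly below the requested total.
func datasetShape(total, pred int) (perPred, written int) {
	perPred = total / pred
	return perPred, perPred * pred
}

func main() {
	perPred, written := datasetShape(100000000, 1024)
	fmt.Println("records per predicate:", perPred) // 97656
	fmt.Println("records written:", written)       // 99999744
}
```

Because `count` in the script starts at 1 and is incremented after each record, the value it prints at the end is one more than the number of records written.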

Master:
Time to finish live loader: 8m44s
Alpha RAM(RES) usage: 10.9 GB

This PR with maxPendingEdgesSize = 64KB:
Time to finish live loader: 8m48s
Alpha RAM(RES) usage: 9.6 GB

This PR with maxPendingEdgesSize = 64MB:
Time to finish live loader: 8m32s
Alpha RAM(RES) usage: 10.5 GB

manishrjain

Create a separate rampMeter class and use in both.

Reviewed 1 of 1 files at r1.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @vvbalaji-dgraph)

vvbalaji-dgraph · 2020-05-28T13:49:53Z

Thanks for the PR. Do we have data on memory usage in Ludicrous mode with and without this PR?

Add pending in executor

daa892f

ashish-goswami requested review from manishrjain and vvbalaji-dgraph as code owners May 22, 2020 13:55

Have separate rampMeter for executor

3f9669c

manishrjain approved these changes May 22, 2020

Have common rampMeter()

2f19bb5

ashish-goswami changed the title ~~[WIP] Use rampMeter for Executor~~ Use rampMeter for Executor May 26, 2020

Minor comment addition

28763d9

Merge remote-tracking branch 'origin/master' into ashish/exe-ramp

222d094

vvbalaji-dgraph approved these changes Jun 2, 2020

ashish-goswami merged commit 4735952 into master Jun 2, 2020

ashish-goswami deleted the ashish/exe-ramp branch June 2, 2020 09:39

ashish-goswami mentioned this pull request Jun 10, 2020

Cherry pick ludicrous mode changes #5629

Closed
