Use rampMeter for Executor #5503

Merged
merged 5 commits into from
Jun 2, 2020

Conversation

ashish-goswami

@ashish-goswami ashish-goswami commented May 22, 2020

We currently use a rampMeter while applying proposals. It throttles the memory used by the
mutations in the proposals being applied. However, this works correctly only in normal mode.
In ludicrous mode, the mutations for these proposals are applied asynchronously, so the existing
rampMeter does not work as expected. This results in high memory usage when there are many
proposals to apply.
This PR introduces a rampMeter for the executor as well. The executor is responsible for applying
mutations in ludicrous mode.
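
To make the idea concrete, here is a minimal sketch of a byte-based throttle of the kind described above. It is not the rampMeter code from this PR: the `byteThrottle` type and its `Acquire`, `Release`, and `Pending` methods are hypothetical names chosen for illustration, and only the 64MB limit comes from the description. The producer reserves bytes before handing a mutation to an asynchronous worker, and blocks while too many bytes are still pending.

```go
package main

import (
	"fmt"
	"sync"
)

// byteThrottle is a hypothetical limiter (not Dgraph's rampMeter):
// it tracks the bytes of work that were handed off but not yet applied.
type byteThrottle struct {
	mu      sync.Mutex
	cond    *sync.Cond
	pending int64 // bytes currently reserved
	max     int64 // soft upper bound on pending bytes
}

func newByteThrottle(max int64) *byteThrottle {
	t := &byteThrottle{max: max}
	t.cond = sync.NewCond(&t.mu)
	return t
}

// Acquire reserves n bytes. It blocks while other work is pending and
// adding n would exceed max. A request larger than max is admitted once
// nothing else is pending, so it cannot block forever.
func (t *byteThrottle) Acquire(n int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for t.pending > 0 && t.pending+n > t.max {
		t.cond.Wait()
	}
	t.pending += n
}

// Release returns n bytes and wakes any blocked producers.
func (t *byteThrottle) Release(n int64) {
	t.mu.Lock()
	t.pending -= n
	t.mu.Unlock()
	t.cond.Broadcast()
}

// Pending reports the bytes currently reserved.
func (t *byteThrottle) Pending() int64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.pending
}

func main() {
	const maxPending = 64 << 20 // 64MB, the limit mentioned in the description
	throttle := newByteThrottle(maxPending)

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		size := int64(16 << 20) // assumed size of one batch of mutations
		throttle.Acquire(size)  // the producer blocks here when too much is queued
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The mutations would be applied asynchronously here.
			throttle.Release(size)
		}()
	}
	wg.Wait()
	fmt.Println("pending bytes after drain:", throttle.Pending())
}
```

The key design point in this sketch is that the reservation happens on the producer side, before the asynchronous hand-off, so back-pressure still applies even though the work itself runs later.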

Testing:
I tested this PR by running the live loader on the 21M dataset in ludicrous mode. With this PR, the live loader completes in ~5-6 minutes. However, when I changed maxPendingEdgesSize to 64KB (instead of the current 64MB), the live loader took ~10-11 minutes.

Benchmarking:
I benchmarked using data generated by the script below:

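package main

import (
  "bytes"
  "fmt"
  "os"
)

const (
  total = 100000000 // target number of records
  pred  = 1024      // number of distinct predicates
)

func main() {
  // O_TRUNC ensures a rerun starts from an empty file.
  f, err := os.OpenFile("test.rdf", os.O_CREATE|os.O_RDWR|os.O_TRUNC, 0644)
  if err != nil {
    panic(err)
  }
  defer f.Close()

  // Integer division: each predicate gets total/pred records.
  totalPerPred := total / pred

  buf := bytes.NewBuffer(nil)
  count := 0 // number of records written so far
  for i := 0; i < totalPerPred; i++ {
    for j := 0; j < pred; j++ {
      count++
      rec := fmt.Sprintf(`_:record_%d <pred_%d> "value_%d" .`, count, j, count)
      buf.WriteString(rec)
      buf.WriteString("\n")
      // Flush to disk every 100000 records to bound the buffer size.
      if count%100000 == 0 {
        if _, err := buf.WriteTo(f); err != nil {
          panic(err)
        }
        buf.Reset()
      }
    }
  }

  // Write whatever is left in the buffer.
  if _, err := buf.WriteTo(f); err != nil {
    panic(err)
  }
  if err := f.Sync(); err != nil {
    panic(err)
  }
  fmt.Println("Done writing to file: ", count)
}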

The script above generates ~100M records across 1024 predicates, with ~97K records per predicate.

Master:
Time to finish live loader: 8m44s
Alpha RAM(RES) usage: 10.9 GB

This PR with maxPendingEdgesSize = 64KB:
Time to finish live loader: 8m48s
Alpha RAM(RES) usage: 9.6 GB

This PR with maxPendingEdgesSize = 64MB:
Time to finish live loader: 8m32s
Alpha RAM(RES) usage: 10.5 GB
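
For a quick comparison, the figures above can be expressed as changes relative to master. The short program below is only a convenience sketch: the `run` type and `pctChange` helper are names introduced here, and the inputs are copied from the three results above.

```go
package main

import (
	"fmt"
	"time"
)

// run holds one benchmark result as reported above.
type run struct {
	name  string
	dur   time.Duration // time to finish live loader
	resGB float64       // Alpha RAM (RES) usage in GB
}

// pctChange returns the change of v relative to base, in percent.
func pctChange(base, v float64) float64 {
	return (v - base) / base * 100
}

func main() {
	master := run{"master", 8*time.Minute + 44*time.Second, 10.9}
	runs := []run{
		{"PR, maxPendingEdgesSize=64KB", 8*time.Minute + 48*time.Second, 9.6},
		{"PR, maxPendingEdgesSize=64MB", 8*time.Minute + 32*time.Second, 10.5},
	}
	for _, r := range runs {
		fmt.Printf("%-30s time %+.1f%%  RES %+.1f%%\n",
			r.name,
			pctChange(master.dur.Seconds(), r.dur.Seconds()),
			pctChange(master.resGB, r.resGB))
	}
}
```

On these numbers, the 64KB setting uses roughly 12% less memory than master at about the same load time, while the 64MB setting uses roughly 4% less memory and finishes about 2% faster. These are single runs, so small differences may be noise.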



@manishrjain manishrjain left a comment


Create a separate rampMeter class and use in both.

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @vvbalaji-dgraph)

@ashish-goswami ashish-goswami changed the title [WIP] Use rampMeter for Executor Use rampMeter for Executor May 26, 2020
@vvbalaji-dgraph

Thanks for the PR. Do we have data on memory usage in Ludicrous mode with and without this PR?

@ashish-goswami ashish-goswami merged commit 4735952 into master Jun 2, 2020
@ashish-goswami ashish-goswami deleted the ashish/exe-ramp branch June 2, 2020 09:39
ashish-goswami added a commit that referenced this pull request Jun 10, 2020
(cherry picked from commit 4735952)
ashish-goswami added a commit that referenced this pull request Jun 11, 2020
(cherry picked from commit 4735952)
ashish-goswami added a commit that referenced this pull request Jun 17, 2020
(cherry picked from commit 4735952)
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 18, 2020