Copy xid string to reduce memory usage in bulk loader (#4287)
In the bulk loader, we read raw data from the input file line by line and
use substrings of each line in various places. All of these substring
references are dropped soon enough, except the reference to the xid.
The xid reference is stored in the AssignUid map, so the whole line
(string) stays alive throughout the process. We now make a copy of the
xid before storing it in the map.
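
The retention problem described above can be illustrated with a minimal, standalone sketch. It is not part of the commit: the helper names `copyString` and `sameStart` and the sample line are assumptions made for illustration, and `unsafe.StringData` requires Go 1.20 or later. The sketch shows that a substring points into the memory of the line it was sliced from, while a copy built with `strings.Builder` gets its own allocation.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// copyString returns a copy of s backed by newly allocated memory.
// It mirrors the strings.Builder approach used in the commit.
// (Hypothetical helper, not a function from the Dgraph code base.)
func copyString(s string) string {
	sb := strings.Builder{}
	sb.WriteString(s)
	return sb.String()
}

// sameStart reports whether a and b begin at the same memory address.
// Both strings must be non-empty. (Hypothetical helper for illustration.)
func sameStart(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	// Build the line at run time so that it lives on the heap,
	// as a line read from an input file would.
	line := strings.Join([]string{"<_:alice>", "<name>", `"Alice"`, "."}, " ")

	// Slicing does not copy: xid refers to the bytes of line,
	// so holding xid keeps the whole line reachable.
	xid := line[1:8]

	// The copy owns its bytes, so line can be collected once
	// nothing else refers to it.
	copied := copyString(xid)

	fmt.Println("xid:", xid)
	fmt.Println("substring shares memory with line:", sameStart(xid, line[1:]))
	fmt.Println("copy shares memory with line:", sameStart(copied, line[1:]))
}
```

On Go 1.18 and later, `strings.Clone(xid)` should give the same effect in a single call. That function did not exist when this commit was written.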
mangalaman93 authored Nov 21, 2019
1 parent e45bbc8 commit 240f8e2
Showing 1 changed file with 10 additions and 1 deletion.
11 changes: 10 additions & 1 deletion dgraph/cmd/bulk/mapper.go
@@ -241,7 +241,16 @@ func (m *mapper) uid(xid string) uint64 {
 }

 func (m *mapper) lookupUid(xid string) uint64 {
-	uid, isNew := m.xids.AssignUid(xid)
+	// We create a copy of xid string here because it is stored in
+	// the map in AssignUid and going to be around throughout the process.
+	// We don't want to keep the whole line that we read from file alive.
+	// xid is a substring of the line that we read from the file and if
+	// xid is alive, the whole line is going to be alive and won't be GC'd.
+	// Also, checked that sb goes on the stack whereas sb.String() goes on
+	// heap. Note that the calls to the strings.Builder.* are inlined.
+	sb := strings.Builder{}
+	sb.WriteString(xid)
+	uid, isNew := m.xids.AssignUid(sb.String())
 	if !m.opt.StoreXids || !isNew {
 		return uid
 	}