HDDS-2467. Allow running Freon validators with limited memory #152

adoroszlai · 2019-11-13T15:32:28Z

What changes were proposed in this pull request?

1. Freon validators read each item to be validated completely into a byte[] buffer. This way the measured time covers only the read (and buffer allocation), not the subsequent digest calculation. However, it also means that the memory required for running the validators is proportional to key size. I propose adding a command-line flag (-s / --stream) which, when specified, makes Freon calculate the digest while reading the input stream. This changes timing results a bit, since values will include the time required for digest calculation. On the other hand, Freon will be able to validate huge keys with limited memory.
2. Reduce the memory requirement of the non-stream version by allocating a buffer of exactly the size of the key. This adds a bit of time overhead, since key info needs to be fetched, too. But it eliminates ByteArrayOutputStream, which allocates incrementally larger and larger buffers. The latter can lead to a memory requirement of twice the actual key size in the worst case (since 2^n > 2^(n-1) + 2^(n-2) + ...).
3. Get rid of code duplication between SameKeyReader and OzoneClientKeyValidator.
4. Allow OzoneClientKeyGenerator to create > 2GB keys.
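
The difference between the two validation modes can be sketched as follows. This is a minimal illustration, not the code from the patch: the class and method names (`DigestModes`, `digestBuffered`, `digestStreaming`), the 4 KB chunk size, and the choice of MD5 are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class DigestModes {

  // Non-stream mode: read the whole item into a buffer of exactly keySize
  // bytes, then digest it. Memory use is proportional to the key size.
  static byte[] digestBuffered(InputStream in, int keySize)
      throws IOException, NoSuchAlgorithmException {
    byte[] data = new byte[keySize];
    // A timer placed around this call would cover only the read.
    new DataInputStream(in).readFully(data);
    return MessageDigest.getInstance("MD5").digest(data);
  }

  // Stream mode: update the digest chunk by chunk while reading.
  // Memory use is bounded by the chunk size, regardless of key size,
  // but read time and digest time can no longer be separated.
  static byte[] digestStreaming(InputStream in)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("MD5");
    byte[] chunk = new byte[4096]; // hypothetical chunk size
    int n;
    while ((n = in.read(chunk)) != -1) {
      md.update(chunk, 0, n);
    }
    return md.digest();
  }

  public static void main(String[] args) throws Exception {
    byte[] key = new byte[10240]; // in-memory stand-in for a 10KB key
    Arrays.fill(key, (byte) 7);
    byte[] buffered = digestBuffered(new ByteArrayInputStream(key), key.length);
    byte[] streamed = digestStreaming(new ByteArrayInputStream(key));
    System.out.println(Arrays.equals(buffered, streamed));
  }
}
```

Both methods produce the same digest for the same input; they differ only in how much memory they hold at once and in what a timer around the read would measure.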

https://issues.apache.org/jira/browse/HDDS-2467

How was this patch tested?

Created and validated keys using Freon. Verified that even a 2.5GB key can be created and validated with --stream. Verified that streaming is forced for such a large key, since it won't fit in any array. Verified that smaller keys can be validated both ways.

export HADOOP_OPTS='-Xmx1024M -XX:+HeapDumpOnOutOfMemoryError'
ozone freon ockg -t 1 -F ONE -n 1 -p 2_5GB -s 2684354560
ozone freon ockg -t 1 -F ONE -n 1 -p 256MB -s  268435456
ozone freon ockg -t 1 -F ONE -n 1 -p 128MB -s  134217728
ozone freon ockg -t 1 -F ONE -n 1 -p  64MB -s   67108864
ozone freon ockg -t 1 -F ONE -n 1 -p  10KB -s      10240

export HADOOP_OPTS='-Xmx128M -XX:+HeapDumpOnOutOfMemoryError'
ozone freon ockv -t 1 -n 1 -p  10KB
ozone freon ockv -t 1 -n 1 -p  64MB

export HADOOP_OPTS='-Xmx64M -XX:+HeapDumpOnOutOfMemoryError'
ozone freon ockv -t 1 -n 1 -p  10KB -s
ozone freon ockv -t 1 -n 1 -p  64MB -s
ozone freon ockv -t 1 -n 1 -p 128MB -s
ozone freon ockv -t 1 -n 1 -p 256MB -s
ozone freon ockv -t 1 -n 1 -p 2_5GB -s
ozone freon ockv -t 1 -n 1 -p 2_5GB

ozone freon ockg -t 1 -F ONE -n 100 -p 1KB -s 1024
ozone freon ockv -n 100 -p 1KB

ozone freon ocokr -t 4 -k  '64MB/0' -n 32 -s
ozone freon ocokr -t 8 -k '256MB/0' -n 16 -s

export HADOOP_OPTS='-Xmx1024M -XX:+HeapDumpOnOutOfMemoryError'
ozone freon ocokr -t 2 -k '256MB/0' -n 16

anuengineer · 2019-11-14T17:33:37Z

hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/freon/OzoneClientKeyValidator.java

+
+      // force stream if key is too large for byte[]
+      // (limit taken from ByteArrayOutputStream)
+      if (referenceKeySize > Integer.MAX_VALUE - 8) {


You can force this much sooner, say at 1 GB, or even 100 MB. Otherwise, the virtual memory reservation would be huge.
And practically, does it matter?

adoroszlai · 2019-11-14

Any lower limit would be arbitrary. Actual memory requirement depends on the number of threads. One can always add the flag explicitly. I just wanted to preserve existing behavior as a courtesy.

Hope that makes sense :)
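
To make the trade-off in this thread concrete, here is a rough sketch of the check from the diff above together with a back-of-the-envelope estimate of buffer memory in non-stream mode. The class and method names (`StreamDecision`, `mustStream`, `estimatedBufferBytes`) are hypothetical, and the estimate assumes that each validator thread holds one full-size key buffer at a time and ignores all other heap usage.

```java
public class StreamDecision {

  // Same limit as in the diff above: the largest size assumed safe for a byte[].
  static final long MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

  // Streaming is used when requested explicitly, and forced only when the
  // key cannot fit into a single byte[].
  static boolean mustStream(long keySize, boolean streamRequested) {
    return streamRequested || keySize > MAX_ARRAY_SIZE;
  }

  // Assumption for illustration: one key-sized buffer per validator thread.
  static long estimatedBufferBytes(long keySize, int threads) {
    return keySize * threads;
  }

  public static void main(String[] args) {
    long key2_5GB = 2684354560L; // size used for the 2_5GB test key
    long key256MB = 268435456L;  // size used for the 256MB test key

    System.out.println(mustStream(key2_5GB, false)); // true: exceeds the array limit
    System.out.println(mustStream(key256MB, false)); // false: existing behavior kept
    System.out.println(estimatedBufferBytes(key256MB, 2)); // 536870912
  }
}
```

Under these assumptions, two threads reading 256MB keys need about 512MB of buffers, which is in line with the non-stream `ocokr -t 2` run above using `-Xmx1024M`. A fixed lower cutoff on key size alone could not account for the thread count.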

anuengineer

+1. LGTM. I have a minor comment; you can fix that in some other JIRA too, if I commit this patch before you get a chance to see my comment.

dineshchitlangia

@adoroszlai thanks for the contribution, @anuengineer thanks for the review.

adoroszlai · 2019-11-20T06:07:06Z

Thanks @anuengineer for the review and @dineshchitlangia for the review/merge.


HDDS-2467. Allow running Freon validators with limited memory

218d992

anuengineer reviewed Nov 14, 2019


anuengineer approved these changes Nov 14, 2019


Merge remote-tracking branch 'origin/master' into HDDS-2467

9546a22

dineshchitlangia approved these changes Nov 20, 2019


dineshchitlangia merged commit 2fea0af into apache:master Nov 20, 2019

adoroszlai deleted the HDDS-2467 branch November 20, 2019 06:06
