Skip to content

Add tracking of memory allocation per operator#14191

Merged
mbasmanova merged 8 commits intoprestodb:masterfrom
mbasmanova:memory-allocation-tracking
Mar 5, 2020
Merged

Add tracking of memory allocation per operator#14191
mbasmanova merged 8 commits intoprestodb:masterfrom
mbasmanova:memory-allocation-tracking

Conversation

@mbasmanova
Copy link
Copy Markdown
Contributor

Here is a snippet of the query JSON fetched from the coordinator featuring new fields: addInput/getOutput/finishAllocation

  }, {
      "stageId" : 1,
      "stageExecutionId" : 0,
      "pipelineId" : 1,
      "operatorId" : 1,
      "planNodeId" : "4",
      "operatorType" : "HashAggregationOperator",
      "totalDrivers" : 408,
      "addInputCalls" : 1,
      "addInputWall" : "23.92ms",
      "addInputCpu" : "22.66ms",
      "addInputAllocation" : "3.64MB",
      "rawInputDataSize" : "0B",
      "rawInputPositions" : 0,
      "inputDataSize" : "115B",
      "inputPositions" : 1,
      "sumSquaredInputPositions" : 1.0,
      "getOutputCalls" : 820,
      "getOutputWall" : "33.97ms",
      "getOutputCpu" : "31.83ms",
      "getOutputAllocation" : "376.24kB",
      "outputDataSize" : "115B",
      "outputPositions" : 1,
      "physicalWrittenDataSize" : "0B",
      "blockedWall" : "0.00ns",
      "finishCalls" : 411,
      "finishWall" : "16.55ms",
      "finishCpu" : "15.69ms",
      "finishAllocation" : "94.47kB",
      "userMemoryReservation" : "0B",
      "revocableMemoryReservation" : "0B",
      "systemMemoryReservation" : "0B",
      "peakUserMemoryReservation" : "387.73kB",
      "peakSystemMemoryReservation" : "0B",
      "peakTotalMemoryReservation" : "387.73kB",
      "spilledDataSize" : "0B"
    }, {
== NO RELEASE NOTE ==

@wenleix
Copy link
Copy Markdown
Contributor

wenleix commented Mar 2, 2020

cc @viczhang861 : Can this deprecate the tagged memory pool? 😃 (discussed in #14117)

@viczhang861
Copy link
Copy Markdown
Contributor

viczhang861 commented Mar 3, 2020

@wenleix Briefly went through change in this PR, it updates allocation when an operator completes. Tagged memory pool is updated whenever some bytes are reserved, which is more up-to-date (that is why tagged memory update is expensive but useful for debugging purpose).

Comment on lines 65 to 66
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • This is simpler to understand :)
    !(trackOverallAllocation && THREAD_MX_BEAN_EX == null)
    !(trackOperationAllocation && !trackOverallAllocation)
  • Move second check up

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@viczhang861 Changed to

        this.trackOverallAllocation = trackOverallAllocation;
        this.trackOperationAllocation = trackOperationAllocation;
        if (trackOverallAllocation) {
            checkArgument(THREAD_MX_BEAN_EX != null, "tracking allocation is not supported by this JVM");
        }
        else {
            checkArgument(!trackOperationCpuTime, "tracking operation cpu time without tracking overall cpu time is not supported");
        }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better. I restarted a failed test.

@mbasmanova mbasmanova force-pushed the memory-allocation-tracking branch 2 times, most recently from 0bf4e22 to 99bfadc Compare March 3, 2020 15:11
Copy link
Copy Markdown
Contributor

@aweisberg aweisberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good except for a nit regarding the naming "allocation timer enabled" and "per operator allocation timer enabled". The CPU version of this is a timer because it's measuring time.

This is tracking allocation bytes. I would suggest something like "tracking" or "measurement". Those aren't perfect either, but I can't think of anything better.

@viczhang861
Copy link
Copy Markdown
Contributor

Everything looks good except for a nit regarding the naming "allocation timer enabled" and "per operator allocation timer enabled". The CPU version of this is a timer because it's measuring time.

I think timer refers to an interval, not the specific measurement type. For example, OperationTiming also tracks calls.

Copy link
Copy Markdown
Member

@arhimondr arhimondr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % nits

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe better allocationTracker? The word timer is used for the CPU because the CPU utilization is measured in "time" units (e.g.: milliseconds)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently Presto does not support any other VM's than Oracle JVM. How about just trying to cast ((com.sun.management.ThreadMXBean) THREAD_MX_BEAN) unconditionally inplace. In will throw a meaningful ClassCastException if for some very weird reson some other then com.sun.management.ThreadMXBean implementation is supplied.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tracking operator allocations without ...

@mbasmanova mbasmanova force-pushed the memory-allocation-tracking branch from 99bfadc to 8529410 Compare March 3, 2020 22:19
@mbasmanova mbasmanova force-pushed the memory-allocation-tracking branch from 8529410 to 6321132 Compare March 3, 2020 22:21
@mbasmanova
Copy link
Copy Markdown
Contributor Author

@aweisberg @arhimondr Ariel, Andrii, thank you for reviews. Comments addressed. CC: @rschlussel

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still says timer

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timer here as well

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And these two

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last one!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timer

Copy link
Copy Markdown
Contributor

@aweisberg aweisberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a few more timer instances. Otherwise LGTM.

@mbasmanova mbasmanova force-pushed the memory-allocation-tracking branch from 6321132 to 7caf111 Compare March 5, 2020 10:55
@mbasmanova
Copy link
Copy Markdown
Contributor Author

@aweisberg Ariel, thank you for review. I renamed the remaining timers.

@aweisberg
Copy link
Copy Markdown
Contributor

LGTM! 🚢 it.

@mbasmanova mbasmanova merged commit 0e144a6 into prestodb:master Mar 5, 2020
@mbasmanova mbasmanova deleted the memory-allocation-tracking branch March 5, 2020 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants