Closed
@@ -328,8 +328,9 @@ private ProcessBatchResult processBatch(int batchSize)
                 }
                 else {
                     if (pageProjectWork == null) {
+                        Page loadedPage = projection.getInputChannels().getInputChannels(page);
Member


Where will it be accounted for?

Member Author

@raunaqmorarka, Sep 13, 2022


I'm not yet sure that this needs to be accounted for anywhere.
My thought was that if ExpressionProfiler is about detecting time-consuming expression evaluation, then adding page loading time, which includes reading ORC/Parquet data from the filesystem and decoding it, is probably not right.
But I wasn't certain about it.

Member


> I'm not yet sure that this needs to be accounted for anywhere.

I would bet that it needs to be.

> My thought was that if ExpressionProfiler is about detecting time-consuming expression evaluation

It sounds so.

Member


Member Author


I did some benchmarking and found no change in the TPC benchmarks and a small improvement in the scan operator benchmarks.
Screenshot 2022-09-15 at 12 32 31 AM
Screenshot 2022-09-15 at 12 32 17 AM

The job of ExpressionProfiler appears to be to reduce projection batch sizes for expensive expressions so that they don't hog the CPU for too long. Any change to the projection batch size due to this heuristic does not affect the size of reads or the pages produced by the ORC/Parquet readers (those are based on different criteria, such as locality of reads and memory consumption). So it seems that ExpressionProfiler should ignore the time taken to load the page from the page source.
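To make the argument concrete, here is a minimal sketch of the heuristic as I understand it. It is not the actual ExpressionProfiler code: the class `ProfilerSketch`, the per-position threshold, the halving rule, and the injected clock are all assumptions made for illustration. It shows that the same cheap projection is classified as expensive only when the page load happens inside the timed region.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in for a profiler; names and thresholds are assumptions.
final class ProfilerSketch
{
    private final LongSupplier nanoClock;             // injected so the demo is deterministic
    private final long thresholdNanosPerPosition;     // assumed "expensive" cutoff
    private long startNanos;
    private long totalNanos;
    private long totalPositions;

    ProfilerSketch(LongSupplier nanoClock, long thresholdNanosPerPosition)
    {
        this.nanoClock = nanoClock;
        this.thresholdNanosPerPosition = thresholdNanosPerPosition;
    }

    void start()
    {
        startNanos = nanoClock.getAsLong();
    }

    void stop(int positions)
    {
        totalNanos += nanoClock.getAsLong() - startNanos;
        totalPositions += positions;
    }

    // True when the average measured time per position exceeds the cutoff.
    boolean isExpensive()
    {
        return totalPositions > 0 && totalNanos / totalPositions > thresholdNanosPerPosition;
    }

    // Assumed policy: halve the batch size for expensive expressions, never below 1.
    int nextBatchSize(int currentBatchSize)
    {
        return isExpensive() ? Math.max(1, currentBatchSize / 2) : currentBatchSize;
    }

    public static void main(String[] args)
    {
        long[] now = {0};                              // fake clock, advanced by simulated work
        long loadNanos = 1_000;                        // simulated page load (I/O and decoding)
        long projectNanos = 100;                       // simulated projection of 10 positions

        // Page loaded before start(): only the projection is timed.
        ProfilerSketch loadOutside = new ProfilerSketch(() -> now[0], 50);
        now[0] += loadNanos;
        loadOutside.start();
        now[0] += projectNanos;
        loadOutside.stop(10);

        // Page loaded lazily inside the timed region: load time is charged to the expression.
        ProfilerSketch loadInside = new ProfilerSketch(() -> now[0], 50);
        loadInside.start();
        now[0] += loadNanos + projectNanos;
        loadInside.stop(10);

        System.out.println(loadOutside.isExpensive()); // false
        System.out.println(loadInside.isExpensive());  // true
    }
}
```

In this sketch the second profiler would shrink the projection batch even though the expression itself is cheap, which is the misattribution that moving the page load ahead of `start()` is meant to avoid.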

                         expressionProfiler.start();
-                        pageProjectWork = projection.project(session, yieldSignal, projection.getInputChannels().getInputChannels(page), positionsBatch);
+                        pageProjectWork = projection.project(session, yieldSignal, loadedPage, positionsBatch);
                         expressionProfiler.stop(positionsBatch.size());
                     }
                     if (!pageProjectWork.process()) {