Skip to content

Fix raw input stats for ScanFilterAndProject#13873

Merged
mbasmanova merged 2 commits intoprestodb:masterfrom
pettyjamesm:lazy-block-stats-fix
Jan 7, 2020
Merged

Fix raw input stats for ScanFilterAndProject#13873
mbasmanova merged 2 commits intoprestodb:masterfrom
pettyjamesm:lazy-block-stats-fix

Conversation

@pettyjamesm
Copy link
Contributor

Previously, LazyBlocks loaded from a PageSource would update the
processed input bytes upon being loaded, but would not recompute the raw input
bytes. The next call to getOutput would update the raw input stats, but if that never occurs then the reads were left uncounted.

== RELEASE NOTES ==
General Changes
* Fix ScanFilterAndProjectOperator raw input bytes accounting for LazyBlocks

@mbasmanova mbasmanova self-assigned this Dec 18, 2019
@mbasmanova mbasmanova requested a review from a team December 18, 2019 16:43
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pettyjamesm James, thanks for fixing this issue.

nit: comment title shouldn't capitalize all words: Fix raw input stats for ScanFilterAndProject

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand why this call is needed. If LazyBlocks are not loaded, the data is not read, hence no stats to update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a last chance update that shouldn't be necessary if everything else is working as expected, but attempts to harden the bytes read accounting against future refactors. I can remove it if you'd prefer, but I would note that the previous omission of accounting wouldn't have yielded incorrect results if this fallback logic in finish were there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pettyjamesm I would prefer to remove this call (as long as it is not necessary).

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: commit title shouldn't capitalize all words: Refactor stats recording in TableScanOperator

@mbasmanova
Copy link
Contributor

Looks good % one question about calling recordPageSourceRawInputStats from finish.

@mbasmanova mbasmanova added the aria Presto Aria performance improvements label Dec 20, 2019
@mbasmanova mbasmanova requested a review from a team December 20, 2019 02:18
@pettyjamesm pettyjamesm changed the title Fix LazyBlock Raw Input Bytes for ScanFilterAndProjectOperator Fix raw input stats for ScanFilterAndProject Dec 27, 2019
@pettyjamesm pettyjamesm force-pushed the lazy-block-stats-fix branch 2 times, most recently from 0783e90 to d8312b0 Compare December 27, 2019 19:22
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pettyjamesm Looks good % unnecessary recordPageSourceRawInputStats call in ScanFilterAndProject.

@mbasmanova mbasmanova requested a review from a team January 6, 2020 15:52
Previously, LazyBlocks loaded from a PageSource would update the
processed input stat, but would not inherently trigger the raw input
bytes to be recomputed. This wasn't necessarily a problem, but could
result in raw input data being left uncounted until the next call to
getOutput (which might never occur).
@pettyjamesm pettyjamesm force-pushed the lazy-block-stats-fix branch from d8312b0 to a49865b Compare January 6, 2020 21:59
@pettyjamesm
Copy link
Contributor Author

Removed

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pettyjamesm Thanks.

Reorganizes TableScanOperator stats collection to match the changes
made to ScanFilterAndProjectOperator.
@pettyjamesm pettyjamesm force-pushed the lazy-block-stats-fix branch from a49865b to 1fa774b Compare January 6, 2020 23:22
@pettyjamesm
Copy link
Contributor Author

Different test failures across two source identical builds are in different modules and seem spurious/unrelated.

@mbasmanova mbasmanova merged commit 072f645 into prestodb:master Jan 7, 2020
@pettyjamesm pettyjamesm deleted the lazy-block-stats-fix branch January 16, 2020 17:59
@aweisberg aweisberg mentioned this pull request Jan 17, 2020
7 tasks
@caithagoras caithagoras mentioned this pull request Jan 22, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

aria Presto Aria performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants