Skip to content

Verifier: Improve determinism analysis and checksum query recording#13758

Merged
mbasmanova merged 9 commits intoprestodb:masterfrom
caithagoras:s5
Dec 4, 2019
Merged

Verifier: Improve determinism analysis and checksum query recording#13758
mbasmanova merged 9 commits intoprestodb:masterfrom
caithagoras:s5

Conversation

@caithagoras
Copy link
Contributor

@caithagoras caithagoras commented Nov 26, 2019

== RELEASE NOTES ==

Verifier Changes
-----------------
* Fix an issue where checksum query text and ID are not recorded if the checksum query fails.
* Add new columns ``control_session_properties`` and `test_session_properties` to ``verifier_queries``, and remove column ``session_properties_json``. The value of the removed column can be copied to the two new columns for the schema change.
* Add details of determinism analysis runs to the output.
* Add configuration property ``max-determinism-analysis-runs`` to control maximum number of determinism analysis runs in case of column mismatch.
* Add configuration property ``run-teardown-for-determinism-analysis`` to allow disabling teardown for determinism analysis runs.

@caithagoras caithagoras force-pushed the s5 branch 21 times, most recently from 5ba3880 to e669988 Compare December 2, 2019 06:40
@caithagoras caithagoras changed the title [WIP] Record details about determinism analysis Record details about determinism analysis Dec 2, 2019
@caithagoras caithagoras changed the title Record details about determinism analysis Verifier Improvemetns Dec 2, 2019
@caithagoras caithagoras changed the title Verifier Improvemetns Verifier Improvements Dec 2, 2019
@caithagoras caithagoras changed the title Verifier Improvements Improve determinism analysis and checksum query recording in Verifier Dec 2, 2019
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras What is Limit here stands for? Why not just QueryDeterminismAnalysis or better yet QueryDeterminismAnalysisResult?

Copy link
Contributor Author

@caithagoras caithagoras Dec 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do an additional determinism check for queries with LIMIT clause on the outer layer if all previous determinism analysis runs say DETERMINISTIC, and that's the LimitQueryDeterminismAnalysis.

Select/Insert/CreateTableAsSelect queries with LIMIT clause at the same level of the select query are eligible, expect for queries with ORDER BY clause at the same level.

  • SELECT ... LIMIT ...
  • SELECT ... UNION ALL SELECT ... LIMIT ...
  • INSERT INTO ... SELECT ... LIMIT ...
  • CREATE TABLE ... AS SELECT ... LIMIT ...

What we do:

  • We run a count(1) query with the LIMIT clause removed to get the total row count.
  • If total row count > limit count, treat as non-deterministic.

Copy link
Contributor

@mbasmanova mbasmanova Dec 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras Thanks for explaining? Would you add this to the docs if it is not there already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add this to a separate PR: #12686

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras That PR seems to be stuck for a few months now. What's blocking it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to address the comments. There is no blocker.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras Export checksum query and query id even if the query fails looks good.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras Export determinism analysis details looks good.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras Make determinism analysis run count configurable looks good.

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras Fix variable name in VerifierConfig looks good

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@caithagoras Add an option to disable teardown for determinism analysis runs looks good too.

Overall, very clean PR. Thanks for breaking things up into small commits.

@mbasmanova mbasmanova changed the title Improve determinism analysis and checksum query recording in Verifier Verifier: Improve determinism analysis and checksum query recording Dec 4, 2019
@mbasmanova mbasmanova merged commit d838c9b into prestodb:master Dec 4, 2019
@aweisberg aweisberg mentioned this pull request Jan 17, 2020
7 tasks
@caithagoras caithagoras mentioned this pull request Jan 22, 2020
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants