Detect materialized views with non-deterministic functions as stale#28682
Detect materialized views with non-deterministic functions as stale#28682
Conversation
d7249e7 to
fd3e300
Compare
This is not necessarily a bug. There is some usefulness to having e.g. If we consider every MV with such functions to be stale, there is no point in producing materialization. It should never be used1. So we should rather forbid creation of such MV. cc @martint Footnotes
|
|
that might make sense, I think at a minimum the 2nd commit needs to be there to avoid incorrectly computing an incremental refresh. |
That matches my intuition too, but I'd rather clarify desired semantics first, before making the final call. |
|
Sure, I'll discuss with @martint |
|
@martint Looks like a pre-existing bug too about session-scoped things. I see that current_user (and other session-scoped expressions) are stored as literal SQL text in the MV definition ( replaced with constants at creation time. This creates an inconsistency in what they resolve to depending on the code path:
The refresh path is the wrong one. It probably should also create a view session with the owner's identity (like the inline path does). |
but still doesn't close both, per #28682 (comment) |
|
I extracted the REFRESH and session-scoped issues into #28738, WILL NOT ADDRESS them as part of this PR. This PR is focused on the correctness issue + staleness logic. |
d5ca28c to
5404a6e
Compare
|
The planner changes look good. I haven't reviewed the tests. |
Materialized views using non-deterministic scalar functions like current_timestamp, current_date, or random() were not detected as stale because the refresh only tracks table function dependencies. Add a hasNonDeterministicFunctions flag that propagates through the refresh. When set the MV freshness is reported as UNKNOWN, which causes re-execution after the grace period expires to match the existing behavior for table functions. It covers both current-time AST nodes (CurrentDate, CurrentTime, CurrentTimestamp, LocalTime, LocalTimestamp) and non-deterministic functions via analysis.getResolvedFunctions().
Incremental refresh only scans newly appended rows from the source table. For MVs with non-deterministic functions like current_timestamp in time-based predicates this means old rows that fall outside a shifted time window are never removed. We now force full refresh when non-deterministic functions are detected.
5404a6e to
081e637
Compare
findepi
left a comment
There was a problem hiding this comment.
Skimmed the tests, lgtm.
Solid test coverage, good job! I have not verified every single assertion though.
| // TODO https://github.com/trinodb/trino/issues/28738 session-scoped expressions should resolve using | ||
| // the MV owner's identity during refresh (like the stale inline path does via analyzeView), but currently | ||
| // the refresh path resolves them from the refreshing user's session. | ||
| // When fixed, this should be: VALUES ('user'), ('user') | ||
| assertQuery("SELECT created_by FROM " + mvName, "VALUES ('user'), ('other_user')"); |
There was a problem hiding this comment.
|
hit #23537 before retries |
Description
Materialized views using non-deterministic scalar functions like current_timestamp, current_date, or random() were not detected as stale because the refresh only tracks table function dependencies.
Add a hasNonDeterministicFunctions flag that propagates through the refresh. When set the MV freshness is reported as UNKNOWN, which causes re-execution after the grace period expires to match the existing behavior for table functions.
It covers both current-time AST nodes (CurrentDate, CurrentTime, CurrentTimestamp, LocalTime, LocalTimestamp) and non-deterministic functions via analysis.getResolvedFunctions().
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:
Submitted from repo to ensure all tests are run.