Robust join implementation #307

frankmcsherry · 2021-02-19T01:02:38Z

This PR changes the implementation of join to be more robust in the presence of pre-compacted arrangements.

In particular, https://github.com/MaterializeInc/database-issues/issues/1780 was at its core about join receiving pre-compacted arrangements, misunderstanding the physical compaction frontier, mis-SETTING the physical compaction frontier, and then believing it was getting results that it was not correctly receiving. A giant tire fire, really.

The problem was that it was not carefully tracking the relationship between received batches and the physical compaction frontier. This PR attempts to fix that by being substantially more clear about the invariants that should be checked at operator construction time, and then maintained as the operator executes. They are not very complicated, and boil down to:

For each trace, the physical compaction frontier should be less than or equal to the completed trace frontier.

That is, we shouldn't physically compact past the timestamps we have completed, which makes some sense because a handle to such a trace is not especially useful (you cannot subset out any of the existing data, under the current rules for extracting subsets of data). Join in particular wants to be able to subset out from a trace the data that it has received in batches, and if it cannot reliably do that ... well, the implementation we have now doesn't work, nor will this new one (it will just more aggressively not work).
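As a rough illustration of the invariant above, here is a minimal sketch using a toy model rather than the differential-dataflow API. The names `Time`, `time_le`, `frontier_le`, and `invariant_holds` are hypothetical, and frontiers are assumed to be plain slices of two-dimensional timestamps under the product order.

```rust
/// Toy two-dimensional timestamp, compared with the product partial order.
/// (Hypothetical stand-in; not a type from timely or differential-dataflow.)
type Time = (u64, u64);

/// `a` is less than or equal to `b` when it is so in both coordinates.
fn time_le(a: &Time, b: &Time) -> bool {
    a.0 <= b.0 && a.1 <= b.1
}

/// One frontier is less than or equal to another when every element of
/// `upper` is greater than or equal to some element of `lower`.
/// An empty `upper` (nothing more can arrive) dominates every frontier.
fn frontier_le(lower: &[Time], upper: &[Time]) -> bool {
    upper.iter().all(|u| lower.iter().any(|l| time_le(l, u)))
}

/// The per-trace invariant: physical compaction must not have advanced
/// beyond the frontier of completed (received) batches.
fn invariant_holds(physical_compaction: &[Time], completed: &[Time]) -> bool {
    frontier_le(physical_compaction, completed)
}

fn main() {
    // Compaction is behind the completed frontier: the handle is usable.
    assert!(invariant_holds(&[(1, 0), (0, 2)], &[(2, 3)]));
    // Compaction ran past what has been completed: the invariant fails.
    assert!(!invariant_holds(&[(5, 5)], &[(2, 3)]));
    // Incomparable frontiers also fail the check.
    assert!(!invariant_holds(&[(3, 0)], &[(0, 3)]));
}
```

In the PR itself the check happens at operator construction time and is then maintained as the operator runs; the sketch only shows what the comparison means for partially ordered times.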

cc @ruchirK

frankmcsherry · 2021-02-19T02:04:59Z

@ryzhyk if you happen to have a test suite hanging around to point at this branch I'd be interested! We'll hammer on it at Materialize too, but you do different things than we do (e.g. multi-dimensional timestamps in your joins). The only intended change is that on pre-compacted arrangements it should be less wrong, and it may be that this hasn't been an issue for you, but we'll probably aim to land this or something like it.

ryzhyk · 2021-02-19T02:44:47Z

> @ryzhyk if you happen to have a test suite hanging around to point at this branch I'd be interested! We'll hammer on it at Materialize too, but you do different things than we do (e.g. multi-dimensional timestamps in your joins). The only intended change is that on pre-compacted arrangements it should be less wrong, and it may be that this hasn't been an issue for you, but we'll probably aim to land this or something like it.

Sure, I'll get on it. Is it possible to characterize the scenarios where the bug shows up and how it manifests outside of DD?

frankmcsherry · 2021-02-19T02:59:46Z

Yeah, the main thing is that if you import an Arranged into a dataflow where the underlying trace has undergone physical compaction, then, under circumstances that now seem easy to reproduce (in Materialize), the compacted representation of the trace confuses join, which produces double the output it should (it takes in inputs A and B, and produces A x B in response to A and A x B again in response to B).

It's a bit tricky to isolate, as it seems to reliably happen in the second execution of a Materialize query, and I haven't exactly figured out how to make it happen, but I can detail what does happen once it gets in that state.
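To make the doubled output concrete, here is a toy sketch of the symptom only. It does not model the actual operator: `Row`, `join`, `intended`, and `confused` are hypothetical names, and the assumption is simply that A arrives before B.

```rust
/// A (key, value) row. Hypothetical stand-in for real update types.
type Row = (u32, char);

/// Plain nested-loop equi-join on the key.
fn join(left: &[Row], right: &[Row]) -> Vec<(u32, char, char)> {
    let mut out = Vec::new();
    for &(k1, v1) in left {
        for &(k2, v2) in right {
            if k1 == k2 {
                out.push((k1, v1, v2));
            }
        }
    }
    out
}

/// Intended behavior: A arrives first and sees none of B; when B arrives
/// it sees all of A. Each matching pair is produced exactly once.
fn intended(a: &[Row], b: &[Row]) -> Vec<(u32, char, char)> {
    let mut out = join(a, &[]); // response to A: B not yet accepted
    out.extend(join(a, b)); // response to B: all of A accepted
    out
}

/// Failure mode described above: the confused trace also hands back all
/// of B in response to A, so every pair is produced twice.
fn confused(a: &[Row], b: &[Row]) -> Vec<(u32, char, char)> {
    let mut out = join(a, b); // response to A wrongly sees all of B
    out.extend(join(a, b)); // response to B sees all of A
    out
}

fn main() {
    let a = [(1, 'x'), (2, 'y')];
    let b = [(1, 'p'), (1, 'q')];
    assert_eq!(intended(&a, &b).len(), 2);
    assert_eq!(confused(&a, &b).len(), 4);
}
```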

frankmcsherry · 2021-02-19T03:04:14Z

In maybe more detail: join starts up with the compacted trace, and in the first execution finds no inputs, but nonetheless learns that [0, x) is empty. It "accepts" this range and advances its internal bounds, and indicates to the trace that it will only need to be able to get access to the trace from x onward. However, the trace is perhaps compacted past x at this point. That should be an error there (and in this PR is now a debug_assert!), as you should not be able to roll back the physical compaction frontier. Instead, join overwrote the compaction frontier (it succeeded in rolling it back), and the next time around the loop it asked for data up to that point and got all the data instead (as the confused trace had more data, and yet saw that the request lined up with its new physical compaction frontier).

It's hard to explain without wincing at the sketchiness of it all. The new code is meant to be a lot more clear about stating, checking, and maintaining invariants.
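The rollback described above can be sketched with a toy handle. This is not the differential-dataflow API: `ToyTrace` and both setters are hypothetical, timestamps are assumed to be totally ordered `u64` values, and the checked setter returns a `Result` (where the PR uses a `debug_assert!`) so the refusal can be observed.

```rust
/// Hypothetical stand-in for a trace handle with a physical compaction
/// frontier, simplified to a single totally ordered timestamp.
struct ToyTrace {
    /// Batches before this time may already have been merged together.
    physical_compaction: u64,
}

impl ToyTrace {
    /// Old behavior as described above: overwrite the frontier blindly,
    /// even if that moves it backwards.
    fn set_physical_compaction_unchecked(&mut self, frontier: u64) {
        self.physical_compaction = frontier;
    }

    /// Checked variant: refuse any request that would roll the frontier back.
    fn set_physical_compaction(&mut self, frontier: u64) -> Result<(), String> {
        if frontier < self.physical_compaction {
            return Err(format!(
                "cannot roll back physical compaction from {} to {}",
                self.physical_compaction, frontier
            ));
        }
        self.physical_compaction = frontier;
        Ok(())
    }
}

fn main() {
    // The trace arrives pre-compacted to 10; join has only accepted [0, 3).
    let mut old = ToyTrace { physical_compaction: 10 };
    old.set_physical_compaction_unchecked(3);
    assert_eq!(old.physical_compaction, 3); // silently rolled back

    let mut checked = ToyTrace { physical_compaction: 10 };
    assert!(checked.set_physical_compaction(3).is_err());
    assert_eq!(checked.physical_compaction, 10); // frontier is preserved
    assert!(checked.set_physical_compaction(12).is_ok());
}
```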

ryzhyk · 2021-02-19T04:29:09Z

Our tests are passing, but then they did not catch the bug in the first place...

frankmcsherry · 2021-02-20T01:35:44Z

This has passed some amount of randomized stress testing at Materialize, so once we get a review on it I think we can land it (not erroring is good, but it should also land in a form that is easier to understand than before).

ruchirK

This seems like it makes sense! I don't fully understand everything this code is doing, and I didn't look closely enough to establish that the old and new versions of the code do exactly the same thing, but I think the invariants we are trying to establish make sense. I left a few comments around places that seemed confusing.

ruchirK · 2021-02-21T15:48:46Z

src/operators/join.rs

-                        trace1.advance_upper(acknowledged1);
-                        trace1.set_physical_compaction(acknowledged1.borrow());
-                    }
+                    trace1.advance_upper(&mut acknowledged1);


I don't quite understand how set_logical_compaction, advance_upper and set_physical_compaction interact (specifically the first two seem very overlapping?). Haven't read the docs on these so this is a bit of a lazy question - this could be a nice place for a few more comments

I'll add some more comments!

frankmcsherry · 2021-02-21T17:03:03Z

I've restructured the code a bit, with substantially more comments, which I think resolve all the issues except perhaps the last comment about "how all these things interact". The comments are meant to address that, but I'd be up for any feedback on whether they do or do not do this.

ruchirK · 2021-02-21T19:24:30Z

src/operators/join.rs

+                        // Allow `trace1` to physically compact up to the upper bound of batches we
+                        // have received in its input (`input1`). We will not require a cursor that
+                        // is not beyond this bound.
+                        trace1.set_physical_compaction(acknowledged1.borrow());


Does acknowledged1 need to be <= input2.frontier(), or is that always the case because of some other invariant?

No, they are generally unrelated.

robust join implementation

6512b8b

frankmcsherry force-pushed the robustify_join branch from 65c24a1 to 6512b8b on February 21, 2021 02:47

ruchirK approved these changes Feb 21, 2021

improve organization and comments

a57546f

ruchirK reviewed Feb 21, 2021

frankmcsherry merged commit e229d54 into master Feb 23, 2021

This was referenced Oct 29, 2024

chore: release #532

Closed

chore: release #534

Merged
