VSCopy: Resume the copy phase consistently from given GTID and lastpk #11103
mattlord merged 20 commits into vitessio:main
mattlord left a comment:
I think that we need to find a way to implement this high level feature w/o changing the basic copy phase event filtering for all VStreams in uvstreamer (the unified vstreamer abstraction/primitive that everything else uses) as this changes key behavior for VReplication workflows and other things.
I also don't think that change should be necessary as we should be able to mirror the VReplication MoveTables/Reshard behavior here. @rohit-nayak-ps and I can discuss further.
@yoheimuta, I made the changes I referred to above. Please review and make any eventual changes when you get a chance. We can then request @mattlord for a PR review.
@rohit-nayak-ps Thank you for your updates!
mattlord left a comment:
Looks good and makes sense to me! I only had some very minor questions/comments/suggestions.
I'll push a commit now to mention this new feature in the 16.0 release notes.
We should also add a new VStream page in the 16.0 VReplication docs that can reference the existing copy table design doc. There we can touch on what it is, what use cases it serves, how it works and how to use it. We don't have any docs on the VTGate VStream API feature today (only the somewhat dated design doc), so many people don't even know that it exists. I can get a PR started for that and add a reference in the description.
mattlord left a comment:
My review comments were extremely minor -- really only comment related -- so I'm going to approve now that I pushed an update to the release notes and started on the docs work (which is now linked): vitessio/website#1216
Thank you so much for the time you spent on this, @yoheimuta!
systay left a comment:
some nit-picks, but they are not stopping this PR
* Document VStream. And the new vstream copy resume work added in: vitessio/vitess#11103
* Add more info
* Add VStreamFlags link
* Address review comments
* Document all RPC parameters and flags
* Various improvements after self review
Signed-off-by: Matt Lord <mattalord@gmail.com>
…vitessio#11103)
* VSCopy: Demonstrate to fail a test case on which the vstream API is supposed to resume the copy phase consistently
* VSCopy: Resume the copy phase consistently from given GTID and lastpk
* Build out the unit test some more
* Update tests for new behavior
* Improve comments
* Limit uvstreamer changes and update test
* Revert uvstreamer test changes
* Revert all uvstream changes
* VCopy: Revert the last three commits
* VCopy: Add a new vstream type that allows picking up where we left off
* VCopy: Revert the unit test change
* VCopy: Fix the end-to-end CI test
* Update logic for setting up uvstreamer based on input vgtid/tablepks. Add more catchup events to test
* Refactor logic to decide if event is to be sent. Enhance unit and e2e tests.
* Don't send events for tables which we can identify as ones we haven't started copy for
* Minor changes after self-review
* Add vstream copy resume to release notes
* Address review comments
Signed-off-by: yoheimuta <yoheimuta@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>
* add vtgate flag that explicitly allows vstream copy (#125)
* fix fs.BoolVar
* VSCopy: Resume the copy phase consistently from given GTID and lastpk (vitessio#11103)
* VSCopy: Send COPY_COMPLETED events when the copy operation is done (vitessio#11740)
  * VSCopy: Demonstrate to fail a test case on which the vstream API sends new events showing copy completed
  * VSCopy: Send new events when the copy operation is done
  * VSCopy: Fix typo
  * Initialize new map for the 'vstream * from' vtgate sql interface. Make vtadmin web protos
  * VSCopy: Make TestVStreamCopyBasic fail fast to avoid the end2end timeout out
  * VSCopy: stop sharing the 't1' table among multiple test cases running concurrently
  * VSCopy: refactor the function signature to be clearer
  * VSCopy: refactor the VEvents sorter to be simpler
  * VSCopy: refactor to stop the sorter from including a fully copied event
* VSCopy: Enable to copy from all shards in either a specified keyspace or all keyspaces (vitessio#11909)
  * VSCopy: Demonstrate to fail a test case on which the vstream API request doesn't include keyspace and shard
  * VSCopy: Copy from all shards in all keyspaces by specifying only an empty gtid
  * tests: Make TestRowCount stable regardless of the number of keyspaces
  * tests: Cleanup TestCreateAndDropDatabase correctly to stop TestVStreamCopyWithoutKeyspaceShard from failing when running tests together
  * tests: Tweak to fix a comment
  * VSCopy: fix the unit tests when the input vgtid with an empty gtid lacks either keyspace or shard
  * VSCopy: Keyspace wildcard selection lines up with the table wildcard selection
  * VSCopy: Tests the VCopy with multiple keyspaces and resharding
  * VSCopy: Make TestVStreamCopyMultiKeyspaceReshard clearer to check if the streaming two keyspaces works even after reshard
  * VSCopy: Return an invalid argument error if shards are unspecified and gtid is neither 'current' nor empty
  * VSCopy: Add a test description about its purpose and target
  * VSCopy: Remove duplicate literals in the test
  * VSCopy: Retain defaultReplicas variable in the test
  * VSCopy: Explain why we are setting Match to 'customer.*' in the test
  * VSCopy: Remove an unused VStreamFlag for the test
  * VSCopy: Use sentence capitalization in the test
  * VSCopy: Verify that we didn't lose any events or get duplicates of the keyspace being reshareded in the test
  * VSCopy: Return a value instead of a pointer because there is no need to modify the value
  * VSCopy: Add a comment describing what TestVStreamCopyFromAllKeyspacesAndAllShards is doing and why
  * VSCopy: Add a comment describing why we expect these specific numbers of events from VStream API
  * VSCopy: Tweak the test case name
  * VSCopy: Make a utility function to sort COPY_COMPLETED events in the test
  * VSCopy: Replace the matcher with a simpler one in the test
  * VSCopy: Move the print debug call to the FailNow section in the test
  * VSCopy: Use require.NoError in new tests
  * VSCopy: Use require instead of t.Fatalf in the test
  * VSCopy: Apply the reviewer's suggestion to make the error message easier to read
  * VSCopy: Add a comment noting what we're actually testing
  * VSCopy: Correct the test comment and elaborate the special-case
  * VSCopy: Tweak an error message and a comment
  * VSCopy: Adjust to a change in the signature of a test function that was introduced in the main repository
* attempt unit test fix
* update test error expected
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: yoheimuta <yoheimuta@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: pbibra <pbibra@slack-corp.com>
Co-authored-by: yohei yoshimuta <yoheimuta@gmail.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>
Description
The current vstream API doesn't resume a copy phase given the below VGTID. Instead, it starts replicating the table from the given GTID while ignoring the lastpk information.
This PR allows vstream consumers to deal with the VGTID event as an opaque offset.
Given the non-empty GTID and the lastpk, the API is supposed to do the following:
Related Issue(s)
Discussion: https://vitess.slack.com/archives/C0PQY0PTK/p1661155731380559
Implementation
Modify uvstreamer Init
Allow a non-empty position when tablePKs are specified. In this case an initial catchup + fast-forward phase will run before restarting the copy phase on incompletely copied tables. (reference: https://vitess.io/docs/design-docs/vreplication/life-of-a-stream/)
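The phase ordering described above can be sketched roughly as follows. This is a minimal illustration, not the uvstreamer code: `tableProgress`, `streamPlan`, `planStream`, the phase names, and the error for the case where both inputs are empty are all hypothetical and chosen only to show the decision.

```go
package main

import "errors"

// tableProgress is a hypothetical stand-in for the per-table lastpk
// information a client passes back when resuming a stream.
type tableProgress struct {
	Table  string
	LastPK string // hypothetical encoding; empty means no row copied yet
}

// streamPlan lists the phases a stream would run, in order.
type streamPlan struct {
	Phases []string
}

// planStream picks the phase order from the start position and tablePKs.
// Only the last case (both inputs present) is the behavior this PR adds;
// the other cases are assumptions included to make the sketch complete.
func planStream(position string, tablePKs []tableProgress) (streamPlan, error) {
	switch {
	case position == "" && len(tablePKs) == 0:
		// Assumption: with neither input there is nothing to anchor on.
		return streamPlan{}, errors.New("need a position, tables to copy, or both")
	case position == "":
		// Copy only: the copy itself establishes the position.
		return streamPlan{Phases: []string{"copy", "replicate"}}, nil
	case len(tablePKs) == 0:
		// Position only: plain replication from that position.
		return streamPlan{Phases: []string{"replicate"}}, nil
	default:
		// Both given: catch up and fast-forward first, then restart the
		// copy on the incompletely copied tables.
		return streamPlan{Phases: []string{"catchup", "fastforward", "copy", "replicate"}}, nil
	}
}
```

Calling `planStream` with a non-empty position and one `tableProgress` entry yields the four-phase plan, which is the case that was previously rejected.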
Change how we filter events
Since we now run a catchup phase when we restart table copies, duplicate events may be sent for rows that have not yet been copied, because we don't yet have logic to compare primary keys. To minimise this, we filter out events for tables whose copy phase has not yet started.
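A rough sketch of that filter, under stated assumptions: `copyState`, `rowEvent`, and `filterCatchupEvents` are hypothetical names, and passing through events for tables that are not tracked at all is an assumption made for the example rather than something the PR states.

```go
package main

// copyState is a hypothetical per-table marker of copy progress.
type copyState int

const (
	copyNotStarted copyState = iota
	copyInProgress
	copyCompleted
)

// rowEvent is a hypothetical, simplified row change seen during catchup.
type rowEvent struct {
	Table string
	PK    int
}

// filterCatchupEvents drops events for tables whose copy has not started.
// Events for in-progress tables are kept as-is, so duplicates for rows
// beyond the lastpk remain possible: no primary key comparison is done.
func filterCatchupEvents(events []rowEvent, states map[string]copyState) []rowEvent {
	var kept []rowEvent
	for _, ev := range events {
		state, tracked := states[ev.Table]
		if tracked && state == copyNotStarted {
			continue // the copy phase will deliver these rows later
		}
		kept = append(kept, ev)
	}
	return kept
}
```

The filter works at table granularity only, which is why it reduces duplicates rather than eliminating them.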
Test changes
New test TestVStreamCopyResume: passes both tablePKs and a specific position to validate new logic
https://github.com/vitessio/vitess/pull/11103/files?diff=unified&w=1#diff-c98b8f318aac471272bf650c7824713c9664b46895380acf322ec3498a2fe578R236
Updated test TestVStreamCopyCompleteFlow to account for change in logic for selecting events to send
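As a toy model of the copy-resume half of this behavior (not the test code itself), the sketch below returns only the rows that come after a given lastpk. It assumes a single integer primary key; `row` and `rowsAfter` are hypothetical names.

```go
package main

import "sort"

// row is a hypothetical table row with a single integer primary key.
type row struct {
	PK    int
	Value string
}

// rowsAfter returns the rows whose PK is strictly greater than lastPK,
// in primary key order, i.e. the rows a resumed copy still has to send.
func rowsAfter(rows []row, lastPK int) []row {
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]row(nil), rows...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].PK < sorted[j].PK })
	// Find the first row past lastPK using binary search.
	start := sort.Search(len(sorted), func(i int) bool { return sorted[i].PK > lastPK })
	return sorted[start:]
}
```

Rows at or before lastPK are assumed to have been delivered already, so changes to them have to arrive through the catchup phase instead.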
Checklist
Pending issues