Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RCORE-2232 Actually check for unuploaded changes in no_pending_local_changes() #7967

Merged
merged 1 commit into from
Aug 9, 2024

Conversation

tgoyne
Copy link
Member

@tgoyne tgoyne commented Aug 7, 2024

We can have local changesets stored which have already been uploaded and acknowledged by the server, so checking all of the changesets is incorrect. We need to instead only check changesets for versions after the current position of the upload cursor.

I wrote a basic test since we didn't have any tests at all which hit the case where we threw this exception, but I couldn't figure out how to write a standalone test for this scenario (one of the tests I wrote for #7921 hit it very indirectly). For things to break we need to have the server send us a DOWNLOAD with upload_client_version > download_client_version (i.e. it's acknowledged the upload of some version, but needs us to store some older version still for merging purposes) and then try to write a copy before we receive a DOWNLOAD message which bumps download_client_version. In practice this requires that we be uploading while receiving non-bootstrap sync downloads, and even then there's a pretty small window where things are broken.

@tgoyne tgoyne self-assigned this Aug 7, 2024
@cla-bot cla-bot bot added the cla: yes label Aug 7, 2024
@tgoyne tgoyne force-pushed the tg/unuploaded-changesets branch from 77c9cbd to 9e43725 Compare August 7, 2024 23:59
Copy link

coveralls-official bot commented Aug 8, 2024

Pull Request Test Coverage Report for Build thomas.goyne_491

Details

  • 37 of 38 (97.37%) changed or added relevant lines in 2 files are covered.
  • 42 unchanged lines in 8 files lost coverage.
  • Overall coverage decreased (-1.5%) to 89.577%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/realm/sync/noinst/client_history_impl.cpp 5 6 83.33%
Files with Coverage Reduction New Missed Lines %
src/realm/util/serializer.cpp 1 90.43%
src/realm/table_view.cpp 2 92.99%
test/object-store/util/test_file.cpp 2 86.47%
src/realm/mixed.cpp 3 86.46%
src/realm/unicode.cpp 3 83.83%
src/realm/bplustree.cpp 6 72.55%
src/realm/table.cpp 7 90.42%
src/realm/sync/noinst/server/server.cpp 18 74.02%
Totals Coverage Status
Change from base Build 2554: -1.5%
Covered Lines: 145956
Relevant Lines: 162940

💛 - Coveralls

@tgoyne tgoyne requested a review from danieltabacaru August 8, 2024 01:00
size_t base_version = 0;
auto upload_client_version =
version_type(m_arrays->root.get_as_ref_or_tagged(s_progress_upload_client_version_iip).get_as_int());
if (upload_client_version > m_sync_history_base_version)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: size_t base_version = std::max(size_t(upload_client_version - m_sync_history_base_version), 0);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version_type is unsigned so upload_client_version - m_sync_history_base_version will never be smaller than 0.

@danieltabacaru
Copy link
Collaborator

For things to break we need to have the server send us a DOWNLOAD with upload_client_version > download_client_version

I think you mean upload_client_version > last_integrated_client_version

couldn't figure out how to write a standalone test for this scenario

I wrote a test that I think can be used as a base. You can use on_sync_client_event_hook to inspect download messages and write copy at the right time.

TEST_CASE("flx: bundled realm", "[app][flx][baas]") {
    FLXSyncTestHarness harness("flx_bundled_realm");
    auto config = harness.make_test_file();
    auto [promise, future] = util::make_promise_future<void>();
    config.sync_config->on_sync_client_event_hook =
        [promise = util::CopyablePromiseHolder<void>(std::move(promise))](std::weak_ptr<SyncSession>,
                                                                          const SyncClientHookData& data) mutable {
            if (data.event != SyncClientHookEvent::SessionSuspended) {
                return SyncClientHookAction::NoAction;
            }
            promise.get_promise().emplace_value();
            return SyncClientHookAction::NoAction;
        };

    auto realm = Realm::get_shared_realm(config);
    auto table = realm->read_group().get_table("class_TopLevel");
    auto new_subs = realm->get_latest_subscription_set().make_mutable_copy();
    new_subs.insert_or_assign(Query(table));
    auto subs = new_subs.commit();
    subs.get_state_change_notification(sync::SubscriptionSet::State::Complete).get();
    wait_for_upload(*realm);

    nlohmann::json error_body = {
        {"tryAgain", true},           {"message", "fake error"},
        {"shouldClientReset", false}, {"isRecoveryModeDisabled", false},
        {"action", "Transient"},      {"backoffIntervalSec", 900},
        {"backoffMaxDelaySec", 900},  {"backoffMultiplier", 1},
    };
    nlohmann::json test_command = {{"command", "ECHO_ERROR"},
                                   {"args", nlohmann::json{{"errorCode", 229}, {"errorBody", error_body}}}};

    // First we trigger a retryable transient error that should cause the client to try to resume the
    // session in 5 minutes.
    auto test_cmd_res =
        wait_for_future(SyncSession::OnlyForTesting::send_test_command(*realm->sync_session(), test_command.dump()))
            .get();
    REQUIRE(test_cmd_res == "{}");
    future.get();

    realm->begin_transaction();
    CppContext c(realm);
    Object::create(c, realm, "TopLevel",
                   util::Any(AnyDict{
                       {"_id", ObjectId::gen()},
                       {"queryable_str_field", std::string{"foo"}},
                   }));
    realm->commit_transaction();

    harness.load_initial_data([&](SharedRealm realm) {
        CppContext c(realm);
        Object::create(c, realm, "TopLevel",
                       std::any(AnyDict{{"_id", ObjectId::gen()}, {"queryable_str_field", "bar"s}}));
    });

    // Once we're suspended, immediately tell the sync client to resume the session. This should cancel the
    // timer that would have auto-resumed the session.
    realm->sync_session()->handle_reconnect();
    wait_for_download(*realm);
}

Copy link
Collaborator

@danieltabacaru danieltabacaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

We can have local changesets stored which have already been uploaded and
acknoledged by the server, so checking all of the changesets is incorrect. We
need to instead only check changesets for versions after the current position
of the upload cursor.
@tgoyne
Copy link
Member Author

tgoyne commented Aug 9, 2024

You can use on_sync_client_event_hook to inspect download messages and write copy at the right time.

The problem is getting the server to produce the DOWNLOAD message that causes problems in the first place. It requires concurrent uploads and downloads so that the client receives a non-empty download which acknowledges a write but has a changeset based on the version before the write. This seems like it shouldn't be hard to do but every time I try to reduce a complex scenario that causes it to something understandable it stops happening.

@tgoyne tgoyne force-pushed the tg/unuploaded-changesets branch from 9e43725 to 9252ccc Compare August 9, 2024 17:56
@tgoyne tgoyne merged commit 21aa4d5 into master Aug 9, 2024
45 checks passed
@tgoyne tgoyne deleted the tg/unuploaded-changesets branch August 9, 2024 18:49
@github-actions github-actions bot mentioned this pull request Aug 9, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants