Skip to content

Comments

Rewriting SO id during migration#97222

Merged
mshustov merged 40 commits intoelastic:masterfrom
mshustov:migration-id-gen
Apr 26, 2021
Merged

Rewriting SO id during migration#97222
mshustov merged 40 commits intoelastic:masterfrom
mshustov:migration-id-gen

Conversation

@mshustov
Copy link
Contributor

@mshustov mshustov commented Apr 15, 2021

Summary

Implementation is based on algorithm changes described in #93002
TLDR: Kibana doesn't reindex SO in ES, but instead loads them from Elasticsearch server, transforms, rewrites id for SO registered with namespace: multiple and namespace: multiple-isolated, writes SO back to Elasticsearch server.

Performance concerns.

Tested on the migration of 100 000 SavedObjects https://github.com/elastic/kibana/blob/6712bc07279c9582094d31eb22ebf72bdfaba2f7/src/core/server/saved_objects/migrationsv2/integration_tests/migration_7.7.2_xpack_100k.test.ts

Performance seems to be better in the worst-case scenario when we have a lot of outdated documents, as readWithPit seems to be more performant than searchForOutdatedDocuments. See logs:

Master

batchSize 1_000: 154sec
migration_test_kibana.log

batchSize 10_000: 104sec

Branch

batchSize 1_000: 97sec
migration_test_kibana.log
batchSize 10_000: 105sec

Checklist

Delete any items that are not applicable to this PR.

@mshustov mshustov added v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.13.0 labels Apr 15, 2021
jest.clearAllMocks();
});

it('creates a V2 migrator that initializes a new index and migrates an existing index', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the implementation is tested in the integration tests

transformRawDocs: TransformRawDocs,
outdatedDocuments: SavedObjectsRawDoc[],
index: string,
refresh: estypes.Refresh
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move to named arguments instead, but let's do it in a separate PR for all the actions at once

Copy link
Contributor

@rudolf rudolf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick first pass and the implementation looks very solid!

import _ from 'lodash';
import { estypes } from '@elastic/elasticsearch';
import { MigrationEsClient } from './migration_es_client';
import { CountResponse, SearchResponse } from '../../../elasticsearch';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no changes in the file, a simple cleanup

await new Promise((resolve) => setTimeout(resolve, 10000));
});

it('rewrites id deterministically for SO with namespaceType: "multiple" and "multiple-isolated"', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"title": "test_index*"
},
"type": "test_index*"
"type": "index-pattern"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalid type that used to fail migration.

Copy link
Contributor

@jportner jportner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry 😅

@mshustov
Copy link
Contributor Author

@jportner reverted, I will skip the test on 7.x branch then

@mshustov mshustov requested a review from jportner April 26, 2021 07:14
Copy link
Contributor

@jportner jportner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@mshustov mshustov merged commit e6ba8cc into elastic:master Apr 26, 2021
@mshustov mshustov deleted the migration-id-gen branch April 26, 2021 13:57
mshustov added a commit to mshustov/kibana that referenced this pull request Apr 26, 2021
* some typos

* implement an alternative client-side migration algorithm

required to enforce idempotent id generation for SO

* update tests

* lol

* remove unnecessary param from request generic

* remove unused parameter

* optimize search when quierying SO for migration

* fix wrong type in fixtures

* try shard_doc asc

* add an integration test

* cleanup

* track_total_hits: false to improve perf

* add happy path test for transformDocs action

* remove unused types

* fix wrong typing

* add cleanup phase

* add an integration test for cleanup phase

* add unit-tests for cleanup function

* address comments

* Fix functional test

* set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase

* sourceIndex: Option.some<> for consistency

* Revert "set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase"

This reverts commit a128d7b.

* address comments from Pierre

* fix test

* Revert "fix test"

This reverts commit 97315b6.

* revert min convert version back to 8.0

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>
mshustov added a commit that referenced this pull request Apr 26, 2021
* Rewriting SO id during migration (#97222)

* some typos

* implement an alternative client-side migration algorithm

required to enforce idempotent id generation for SO

* update tests

* lol

* remove unnecessary param from request generic

* remove unused parameter

* optimize search when quierying SO for migration

* fix wrong type in fixtures

* try shard_doc asc

* add an integration test

* cleanup

* track_total_hits: false to improve perf

* add happy path test for transformDocs action

* remove unused types

* fix wrong typing

* add cleanup phase

* add an integration test for cleanup phase

* add unit-tests for cleanup function

* address comments

* Fix functional test

* set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase

* sourceIndex: Option.some<> for consistency

* Revert "set defaultIndex before each test. otherwise it is deleted in the first test file during cleanup phase"

This reverts commit a128d7b.

* address comments from Pierre

* fix test

* Revert "fix test"

This reverts commit 97315b6.

* revert min convert version back to 8.0

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>

* skip id rewriting test

Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release_note:skip Skip the PR/issue when compiling release notes v7.14.0 v8.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants