Add built-in Migration Assistant field data type transformation documentation #10649
Merged: natebower merged 27 commits into opensearch-project:main from AndreKurait:MABuiltInTransformations on Aug 11, 2025 (+551 −0)
796318e
Add flattened type handling documentation for Migration Assistant
AndreKurait 6cc2bab
Add flattened type conversion link for MA
AndreKurait 0988a72
Add string type deprecation page for MA
AndreKurait 399c0ba
Ad dense_vector to knn transformation
AndreKurait e1944b1
Update for dense_vector support
AndreKurait b440e8d
Cleanup vale comments
AndreKurait 623f8ec
Update for vale
AndreKurait 9abc539
Add section on identifying if cluster has field type
AndreKurait 2620163
Update _migration-assistant/migration-phases/migrate-metadata/index.md
AndreKurait a336464
Update _migration-assistant/migration-phases/migrate-metadata/index.md
AndreKurait dd09f7f
Update _migration-assistant/migration-phases/migrate-metadata/transfo…
AndreKurait 28611f5
Update _migration-assistant/migration-phases/migrate-metadata/transfo…
AndreKurait bd33295
Update _migration-assistant/migration-phases/migrate-metadata/transfo…
AndreKurait 9cf77f6
Update _migration-assistant/migration-phases/migrate-metadata/transfo…
AndreKurait 426ba5c
Update _migration-assistant/migration-phases/migrate-metadata/index.md
AndreKurait 0f14446
Update _migration-assistant/migration-phases/migrate-metadata/index.md
AndreKurait d8946b0
Update _migration-assistant/migration-phases/migrate-metadata/index.md
AndreKurait 20ba536
Update _migration-assistant/migration-phases/migrate-metadata/index.md
AndreKurait a5beb69
Update _migration-assistant/migration-phases/migrate-metadata/transfo…
AndreKurait 24b86dd
Apply suggestions from code review
AndreKurait 9491875
Apply suggestions from code review
AndreKurait 6b3dfb5
Improve documentation on KNN plugin for MA Transform
AndreKurait 6caace0
Improve MA documentation by linking to field type documentation in tr…
AndreKurait db419f1
Apply suggestions from code review
AndreKurait c5aa44e
Adjust wording on metadata migrations
AndreKurait b8b8cd4
Update title Transform string fields to text/keyword
AndreKurait 5257f39
Cleanup string to text logic
AndreKurait
210 changes: 210 additions & 0 deletions
...ssistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,210 @@ | ||
| --- | ||
| layout: default | ||
| title: Transform dense_vector fields to knn_vector | ||
| nav_order: 5 | ||
| parent: Migrate metadata | ||
| grand_parent: Migration phases | ||
| permalink: /migration-assistant/migration-phases/migrate-metadata/transform-dense-vector-knn-vector/ | ||
| --- | ||
|
|
||
| # Transform dense_vector fields to knn_vector | ||
|
|
||
|
|
||
| This guide explains how Migration Assistant automatically handles the transformation of Elasticsearch's `dense_vector` field type to OpenSearch's `knn_vector` field type during migration. | ||
|
|
||
| ## Overview | ||
|
|
||
| The `dense_vector` field type was introduced in Elasticsearch 7.x for storing dense vectors used in machine learning and similarity search applications. When migrating from Elasticsearch 7.x to OpenSearch, Migration Assistant automatically converts `dense_vector` fields to OpenSearch's equivalent `knn_vector` type. | ||
|
|
||
| This transformation includes mapping the vector configuration parameters and enabling the necessary OpenSearch k-NN plugin settings. | ||
|
|
||
| To determine whether an Elasticsearch cluster uses `dense_vector` field types, make a call to your source cluster's `GET /_mapping` API. In the migration console, run `console clusters curl source_cluster "/_mapping"`. If you see `"type":"dense_vector"`, then this transformation is applicable and these fields will be automatically transformed during migration. | ||
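
The check described above can be scripted once the `_mapping` response has been saved as JSON. The following is a minimal sketch, not part of Migration Assistant: the function names are hypothetical, and it assumes the typeless Elasticsearch 7.x response shape (`{index: {"mappings": {"properties": {...}}}}`).

```python
from typing import Any, Dict, Iterator, List


def find_fields_of_type(
    properties: Dict[str, Any], field_type: str, prefix: str = ""
) -> Iterator[str]:
    """Yield dotted paths of fields whose mapping type equals field_type."""
    for name, definition in properties.items():
        path = f"{prefix}{name}"
        if definition.get("type") == field_type:
            yield path
        # Object and nested fields keep children under "properties";
        # multi-fields keep them under "fields".
        for key in ("properties", "fields"):
            if key in definition:
                yield from find_fields_of_type(
                    definition[key], field_type, f"{path}."
                )


def dense_vector_fields(mapping_response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each index name to the dense_vector field paths it contains."""
    result: Dict[str, List[str]] = {}
    for index_name, body in mapping_response.items():
        properties = body.get("mappings", {}).get("properties", {})
        paths = sorted(find_fields_of_type(properties, "dense_vector"))
        if paths:
            result[index_name] = paths
    return result
```

An empty result suggests that no index in the response uses `dense_vector`, so this transformation would not apply.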
|
|
||
| ## Compatibility | ||
|
|
||
| The `dense_vector` to `knn_vector` transformation applies to: | ||
| - **Source clusters**: Elasticsearch 7.x+ | ||
| - **Target clusters**: OpenSearch 1.x+ | ||
| - **Automatic conversion**: No configuration required | ||
|
||
|
|
||
| ## Automatic conversion logic | ||
|
|
||
| Migration Assistant performs the following transformations when converting `dense_vector` to `knn_vector` fields. | ||
|
|
||
| ### Field type transformation | ||
| - Changes `type: "dense_vector"` to `type: "knn_vector"` | ||
| - Maps `dims` parameter to `dimension` | ||
| - Converts similarity metrics to OpenSearch space types | ||
| - Configures the Hierarchical Navigable Small World (HNSW) algorithm with the Lucene engine | ||
|
|
||
| ### Similarity mapping | ||
| The transformation maps Elasticsearch similarity functions to OpenSearch space types: | ||
| - `cosine` → `cosinesimil` | ||
| - `dot_product` → `innerproduct` | ||
| - `l2` (default) → `l2` | ||
|
|
||
| ### Index settings | ||
| When `dense_vector` fields are converted, Migration Assistant automatically performs the following operations: | ||
| - Enables the k-NN plugin by setting `index.knn: true` | ||
| - Ensures proper index configuration for vector search | ||
|
|
||
| ## Migration output | ||
|
|
||
| During the migration process, you'll see this transformation in the output: | ||
|
|
||
| ``` | ||
| Transformations: | ||
| dense_vector to knn_vector: | ||
| Convert field data type dense_vector to OpenSearch knn_vector | ||
| ``` | ||
|
|
||
| ## Transformation behavior | ||
|
|
||
| <table style="border-collapse: collapse; border: 1px solid #ddd;"> | ||
| <thead> | ||
| <tr> | ||
| <th style="border: 1px solid #ddd; padding: 8px;">Source field type</th> | ||
| <th style="border: 1px solid #ddd; padding: 8px;">Target field type</th> | ||
| </tr> | ||
| </thead> | ||
| <tbody> | ||
| <tr> | ||
| <td style="border: 1px solid #ddd; padding: 8px;"> | ||
| <pre><code>{ | ||
| "properties": { | ||
| "embedding": { | ||
| "type": "dense_vector", | ||
| "dims": 128, | ||
| "similarity": "cosine" | ||
| } | ||
| } | ||
| }</code></pre> | ||
| </td> | ||
| <td style="border: 1px solid #ddd; padding: 8px;"> | ||
| <pre><code>{ | ||
| "properties": { | ||
| "embedding": { | ||
| "type": "knn_vector", | ||
| "dimension": 128, | ||
| "method": { | ||
| "name": "hnsw", | ||
| "engine": "lucene", | ||
| "space_type": "cosinesimil", | ||
| "parameters": { | ||
| "encoder": { | ||
| "name": "sq" | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }</code></pre> | ||
| </td> | ||
| </tr> | ||
| </tbody> | ||
| </table> | ||
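
The source and target mappings in the table can be reproduced with a short function. This is an illustrative sketch of the rules listed in this guide, not Migration Assistant's actual implementation: `to_knn_vector` and `SIMILARITY_TO_SPACE_TYPE` are hypothetical names, and the fallback to `l2` follows the similarity list above.

```python
from typing import Any, Dict

# Similarity names as listed in this guide, mapped to OpenSearch space types.
SIMILARITY_TO_SPACE_TYPE = {
    "cosine": "cosinesimil",
    "dot_product": "innerproduct",
    "l2": "l2",
}


def to_knn_vector(field: Dict[str, Any]) -> Dict[str, Any]:
    """Return a knn_vector definition for a dense_vector field.

    Fields of any other type are returned unchanged. An unlisted
    similarity raises KeyError rather than guessing a space type.
    """
    if field.get("type") != "dense_vector":
        return field
    similarity = field.get("similarity", "l2")  # assumed default, per this guide
    return {
        "type": "knn_vector",
        "dimension": field["dims"],  # dims -> dimension
        "method": {
            "name": "hnsw",
            "engine": "lucene",
            "space_type": SIMILARITY_TO_SPACE_TYPE[similarity],
            "parameters": {"encoder": {"name": "sq"}},
        },
    }
```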
|
|
||
| ### HNSW algorithm parameters | ||
|
|
||
| The transformation automatically configures the HNSW algorithm with the following options: | ||
| - `engine`: `lucene` (OpenSearch default) | ||
| - `encoder`: `sq` (scalar quantization for memory efficiency) | ||
| - `method`: `hnsw` (approximate nearest neighbor search) | ||
|
|
||
| ### Index options mapping | ||
|
|
||
| Elasticsearch `index_options` are mapped to OpenSearch HNSW parameters: | ||
| - `m` → `m` (maximum number of connections per node) | ||
| - `ef_construction` → `ef_construction` (size of dynamic candidate list) | ||
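
As a rough illustration of this pass-through, the sketch below copies `m` and `ef_construction` from an Elasticsearch `index_options` object into the HNSW `parameters` block. The function name is hypothetical, and leaving out values that are absent from the source is an assumption rather than documented Migration Assistant behavior.

```python
from typing import Any, Dict


def hnsw_parameters(index_options: Dict[str, Any]) -> Dict[str, Any]:
    """Build the knn_vector method parameters from dense_vector index_options."""
    # The sq encoder is always configured, per the list in this guide.
    parameters: Dict[str, Any] = {"encoder": {"name": "sq"}}
    for key in ("m", "ef_construction"):
        if key in index_options:
            parameters[key] = index_options[key]  # same name on both sides
    return parameters
```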
|
|
||
| ### Index settings | ||
|
|
||
| When any `dense_vector` fields are converted, the following index setting is automatically added: | ||
|
|
||
| ```json | ||
| { | ||
| "settings": { | ||
| "index.knn": true | ||
| } | ||
| } | ||
| ``` | ||
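
The rule above amounts to "if any converted field exists, add `index.knn`". A minimal sketch follows, assuming an index body with top-level `mappings` and `settings` keys. The helper names are hypothetical and only object-style `properties` nesting is traversed.

```python
import copy
from typing import Any, Dict


def has_field_type(properties: Dict[str, Any], field_type: str) -> bool:
    """Return True if any field, at any object depth, has the given type."""
    for definition in properties.values():
        if definition.get("type") == field_type:
            return True
        if has_field_type(definition.get("properties", {}), field_type):
            return True
    return False


def enable_knn_if_needed(index_body: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of index_body with index.knn set when knn_vector fields exist."""
    updated = copy.deepcopy(index_body)  # leave the caller's object untouched
    properties = updated.get("mappings", {}).get("properties", {})
    if has_field_type(properties, "knn_vector"):
        updated.setdefault("settings", {})["index.knn"] = True
    return updated
```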
|
|
||
| ## Behavior differences | ||
|
|
||
| Migration Assistant automatically transforms all `dense_vector` fields during metadata migration. The k-NN plugin must be installed and enabled on the target OpenSearch cluster. Most OpenSearch distributions include the k-NN plugin, in which case no action is needed. | ||
|
|
||
| ### Query compatibility | ||
|
|
||
| After migration, vector search queries need to be updated: | ||
| - Elasticsearch uses `script_score` queries with vector functions. | ||
| - OpenSearch uses native `knn` query syntax. | ||
|
|
||
| **Elasticsearch query example**: | ||
| ```json | ||
| { | ||
| "query": { | ||
| "script_score": { | ||
| "query": {"match_all": {}}, | ||
| "script": { | ||
| "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0", | ||
| "params": {"query_vector": [0.1, 0.2, 0.3]} | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| **OpenSearch query example**: | ||
| ```json | ||
| { | ||
| "query": { | ||
| "knn": { | ||
| "embedding": { | ||
| "vector": [0.1, 0.2, 0.3], | ||
| "k": 10 | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
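
For applications that build queries in code, the rewrite can be sketched as a small helper that lifts the query vector out of a `script_score` body and places it in a `knn` clause. This is a hypothetical helper, not part of Migration Assistant. It assumes the vector is passed as `params.query_vector`, as in the example above, and that the caller supplies the field name and `k`.

```python
from typing import Any, Dict


def knn_query_from_script_score(
    es_body: Dict[str, Any], field: str, k: int = 10
) -> Dict[str, Any]:
    """Build an OpenSearch knn query from an Elasticsearch script_score body."""
    params = es_body["query"]["script_score"]["script"]["params"]
    return {
        "query": {
            "knn": {
                field: {
                    "vector": params["query_vector"],
                    "k": k,  # number of nearest neighbors to return
                }
            }
        }
    }
```

The two queries are not equivalent in behavior. `script_score` over `match_all` scores every document exactly, whereas `knn` performs an approximate search and returns the `k` nearest neighbors, so result sets and scores can differ.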
|
|
||
| ## Troubleshooting | ||
|
|
||
| If you encounter issues with `dense_vector` conversion: | ||
|
|
||
| 1. **Verify the k-NN plugin** -- Ensure the k-NN plugin is installed and enabled on your target OpenSearch cluster: | ||
| ```json | ||
| GET /_cat/plugins | ||
| ``` | ||
|
|
||
| 2. **Check migration logs** -- Review the detailed migration logs for any warnings or errors: | ||
| ```bash | ||
| tail /shared-logs-output/migration-console-default/*/metadata/*.log | ||
| ``` | ||
|
|
||
| 3. **Validate mappings** -- After migration, verify that the field types have been correctly converted: | ||
| ```json | ||
| GET /your-index/_mapping | ||
| ``` | ||
|
|
||
| 4. **Test vector search** -- Verify that vector search functionality works with sample queries: | ||
| ```json | ||
| POST /your-index/_search | ||
| { | ||
| "query": { | ||
| "knn": { | ||
| "embedding": { | ||
| "vector": [0.1, 0.2, 0.3], | ||
| "k": 5 | ||
| } | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| 5. **Monitor performance** -- Vector search performance may differ between Elasticsearch and OpenSearch. Monitor query performance and adjust HNSW parameters if needed. | ||
|
|
||
| ## Related documentation | ||
|
|
||
| - [Transform field types documentation]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) -- Configure custom field type transformations. | ||
| - [k-NN documentation]({{site.url}}{{site.baseurl}}/vector-search/vector-search-techniques/approximate-knn/) -- Approximate k-NN search documentation. | ||
116 changes: 116 additions & 0 deletions
...-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,116 @@ | ||
| --- | ||
| layout: default | ||
| title: Transform flattened fields to flat_object | ||
| nav_order: 3 | ||
| parent: Migrate metadata | ||
| grand_parent: Migration phases | ||
| permalink: /migration-assistant/migration-phases/migrate-metadata/transform-flattened-flat-object/ | ||
| --- | ||
|
|
||
| # Transform flattened fields to flat_object | ||
|
|
||
| This guide explains how Migration Assistant automatically transforms the `flattened` field type during migration to OpenSearch. | ||
|
|
||
| ## Overview | ||
|
|
||
| The `flattened` field type was introduced in Elasticsearch 7.3 as an X-Pack feature. It allows you to store an entire JSON object as a single field value, which can be useful for objects with a large or unknown number of unique keys. | ||
|
|
||
| When migrating to OpenSearch 2.7 or later, Migration Assistant automatically converts `flattened` field types to OpenSearch's equivalent `flat_object` type. This transformation requires no configuration or user intervention. | ||
|
|
||
| To determine whether an Elasticsearch cluster uses `flattened` field types, make a call to your source cluster's `GET /_mapping` API. In the migration console, run `console clusters curl source_cluster "/_mapping"`. If you see `"type":"flattened"`, then this transformation is applicable and these fields will be automatically transformed during migration. | ||
|
|
||
| ## Compatibility | ||
|
|
||
| The `flattened` to `flat_object` field type transformation applies to: | ||
| - **Source clusters**: Elasticsearch 7.3+ | ||
| - **Target clusters**: OpenSearch 2.7+ | ||
| - **Automatic conversion**: No configuration required during metadata migration | ||
|
||
|
|
||
| ## Automatic migration | ||
|
|
||
| When migrating to OpenSearch 2.7 or later, Migration Assistant automatically detects `flattened` field types and converts them to `flat_object` fields. During the migration process, you'll see this transformation in the output: | ||
|
|
||
| ``` | ||
| Transformations: | ||
| flattened to flat_object: | ||
| Convert field data type flattened to OpenSearch flat_object | ||
| ``` | ||
|
|
||
| ### Example transformation | ||
|
|
||
| <table style="border-collapse: collapse; border: 1px solid #ddd;"> | ||
| <thead> | ||
| <tr> | ||
| <th style="border: 1px solid #ddd; padding: 8px;">Source field type</th> | ||
| <th style="border: 1px solid #ddd; padding: 8px;">Target field type</th> | ||
| </tr> | ||
| </thead> | ||
| <tbody> | ||
| <tr> | ||
| <td style="border: 1px solid #ddd; padding: 8px;"> | ||
| <pre><code>{ | ||
| "properties": { | ||
| "labels": { | ||
| "type": "flattened" | ||
| }, | ||
| "title": { | ||
| "type": "text" | ||
| } | ||
| } | ||
| }</code></pre> | ||
| </td> | ||
| <td style="border: 1px solid #ddd; padding: 8px;"> | ||
| <pre><code>{ | ||
| "properties": { | ||
| "labels": { | ||
| "type": "flat_object" | ||
| }, | ||
| "title": { | ||
| "type": "text" | ||
| } | ||
| } | ||
| }</code></pre> | ||
| </td> | ||
| </tr> | ||
| </tbody> | ||
| </table> | ||
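
The conversion in the table is a type rename applied to every matching field. The sketch below illustrates that rule on a `properties` object. The function name is hypothetical, and descending into object fields through `properties` is an assumption about how nested mappings would be handled.

```python
from typing import Any, Dict


def convert_flattened(properties: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of properties with flattened fields renamed to flat_object."""
    converted: Dict[str, Any] = {}
    for name, definition in properties.items():
        new_definition = dict(definition)  # shallow copy; input is not mutated
        if new_definition.get("type") == "flattened":
            new_definition["type"] = "flat_object"
        if "properties" in new_definition:
            new_definition["properties"] = convert_flattened(
                new_definition["properties"]
            )
        converted[name] = new_definition
    return converted
```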
|
|
||
| ## Transformation behavior across versions | ||
|
|
||
| Migration Assistant automatically converts all `flattened` fields to `flat_object` fields. No additional configuration is required. | ||
|
|
||
| If you're migrating to OpenSearch versions earlier than 2.7, indexes containing `flattened` field types will fail to migrate. You have two options: | ||
|
|
||
| 1. **Upgrade target cluster**: Upgrade your target OpenSearch cluster to version 2.7 or later to support the automatic conversion. | ||
|
|
||
| 2. **Custom transformation**: Use the [field type transformation framework]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) to convert `flattened` to another supported type (for example, `object` or `nested`). | ||
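
Before choosing between these options, it can help to confirm the target version programmatically. The following is a minimal sketch with hypothetical function names. It assumes you have already obtained a version string such as `2.11.0` (for example, from the `version.number` field that the cluster's root endpoint returns).

```python
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Parse 'major.minor[.patch]' into a (major, minor) tuple of ints."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def supports_flat_object(target_version: str) -> bool:
    """Return True if the target is OpenSearch 2.7 or later."""
    # Tuple comparison avoids the string-ordering trap where "2.10" < "2.7".
    return parse_major_minor(target_version) >= (2, 7)
```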
|
|
||
| ## Differences between flattened and flat_object | ||
|
|
||
| `flat_object` in OpenSearch provides functionality similar to Elasticsearch's `flattened` type. The two types compare as follows: | ||
|
|
||
| - **Query syntax**: Both support dot notation for accessing nested fields. | ||
| - **Performance**: Similar performance characteristics for indexing and searching. | ||
| - **Storage**: Both store the entire object as a single Lucene field. | ||
| - **Limitations**: Both have similar limitations on aggregations and sorting. | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| If you encounter issues with `flattened` field migration: | ||
|
|
||
| 1. **Verify target version** -- Ensure your target OpenSearch cluster is running version 2.7 or later. | ||
|
|
||
| 2. **Check migration logs** -- Review the detailed migration logs for any warnings or errors: | ||
| ```bash | ||
| cat /shared-logs-output/migration-console-default/*/metadata/*.log | ||
| ``` | ||
|
|
||
| 3. **Validate mappings** -- After migration, verify that the field types have been correctly converted: | ||
| ```bash | ||
| console clusters curl target_cluster "/your-index/_mapping" | ||
| ``` | ||
|
|
||
| ## Related documentation | ||
|
|
||
| - [Transform field types documentation]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrate-metadata/handling-field-type-breaking-changes/) -- Configure custom field type transformations. | ||
| - [flat_object field type documentation]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/flat-object/) -- Learn about the `flat_object` field type. | ||