Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the correct type to widen the sort fields when merging top field docs #16860

Closed
bugmakerrrrrr opened this issue Dec 16, 2024 · 2 comments · Fixed by #16881
Closed

Use the correct type to widen the sort fields when merging top field docs #16860

bugmakerrrrrr opened this issue Dec 16, 2024 · 2 comments · Fixed by #16881
Labels
bug Something isn't working Search Search query, autocomplete ...etc v2.19.0 Issues and PRs related to version 2.19.0 v3.0.0 Issues and PRs related to version 3.0.0

Comments

@bugmakerrrrrr
Copy link
Contributor

Describe the bug

To enable numeric sort optimization support for all numeric types, we use double to widen the sort fields in #6424. However, when the sort value exceeds the MAX_SAFE_INTEGER (2^53 – 1), it may result in incorrect sorting.

Related component

Search

To Reproduce

public class FieldSortIT extends ParameterizedDynamicSettingsOpenSearchIntegTestCase {
    public void testSortMixedNumericField() throws Exception {
            internalCluster().ensureAtLeastNumDataNodes(3);
            index("long", Long.MAX_VALUE);
            index("integer", Integer.MAX_VALUE);
            SearchResponse searchResponse = client().prepareSearch("long", "integer")
                .setQuery(matchAllQuery())
                .setSize(10)
                .addSort(SortBuilders.fieldSort("field").order(SortOrder.ASC).sortMode(SortMode.MAX))
                .get();
            assertNoFailures(searchResponse);
            long[] sortValues = new long[10];
            for (int i = 0; i < 10; i++) {
                sortValues[i] = ((Number) searchResponse.getHits().getAt(i).getSortValues()[0]).longValue();
            }
            for (int i = 1; i < 10; i++) {
                assertThat(Arrays.toString(sortValues), sortValues[i - 1], lessThan(sortValues[i]));
            }
        }
    
        private void index(String type, long end) throws Exception {
            assertAcked(
                prepareCreate(type).setMapping(
                    XContentFactory.jsonBuilder()
                        .startObject()
                        .startObject("properties")
                        .startObject("field")
                        .field("type", type)
                        .endObject()
                        .endObject()
                        .endObject()
                ).setSettings(Settings.builder().put("index.number_of_shards", 3).put("index.number_of_replicas", 0))
            );
            ensureGreen(type);
            for (int i = 0; i < 5; i++) {
                client().prepareIndex(type).setId(Integer.toString(i)).setSource("{\"field\" : " + (end - i) + " }", XContentType.JSON).get();
            }
            client().admin().indices().prepareRefresh(type).get();
        }
}

Expected behavior

When the sort values are all integer values, we can use the Long type to widen the sort fields.

To be discussed: 1. When sort values contain both integer values and float values, should we throw an exception or use double to widen sort fields? 2. Should we check the compatibility between sort types when merging top docs?

Additional Details

No response

@bugmakerrrrrr bugmakerrrrrr added bug Something isn't working untriaged labels Dec 16, 2024
@github-actions github-actions bot added the Search Search query, autocomplete ...etc label Dec 16, 2024
@reta
Copy link
Collaborator

reta commented Dec 16, 2024

@bugmakerrrrrr I believe this is listed as the limitation of this data type and is explicitly documented [1]

[1] https://opensearch.org/docs/latest/field-types/supported-field-types/unsigned-long/#limitations

@bugmakerrrrrr
Copy link
Contributor Author

@bugmakerrrrrr I believe this is listed as the limitation of this data type and is explicitly documented [1]

[1] https://opensearch.org/docs/latest/field-types/supported-field-types/unsigned-long/#limitations

@reta Thanks for the update. Actually, this is not only related to unsigned_long. The example I gave above only contains long and integer. Also, as I explained in this comment, the results can be wrong even when sorting only for unsigned_long fields in a single index.

@reta reta added v3.0.0 Issues and PRs related to version 3.0.0 v2.19.0 Issues and PRs related to version 2.19.0 labels Jan 9, 2025
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in Search Project Board Jan 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Search Search query, autocomplete ...etc v2.19.0 Issues and PRs related to version 2.19.0 v3.0.0 Issues and PRs related to version 3.0.0
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants