Description
Today, the sort values used to rank each hit are exposed in the response as an array of raw values (response.hits.hits.0.sort).
These values are meant to be copied into the search_after parameter of the next request in order to paginate efficiently over a set of results.
By default, the sort value for date and date_nanos fields is represented as a long, which is the internal representation that we use for these fields. Leaking this internal representation is problematic because the returned value cannot be interpreted without context: date returns the number of milliseconds since the epoch, while date_nanos returns the number of nanoseconds. To fix this discrepancy, we'd like to gradually introduce formatted sort values.
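To make the ambiguity concrete, here is a minimal Python sketch (not Elasticsearch code) that decodes a raw long under each of the two interpretations. The helper names `interpret_as_millis` and `interpret_as_nanos` are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def interpret_as_millis(raw: int) -> datetime:
    """Decode a raw sort value as milliseconds since the epoch (date)."""
    return EPOCH + timedelta(microseconds=raw * 1000)


def interpret_as_nanos(raw: int) -> datetime:
    """Decode a raw sort value as nanoseconds since the epoch (date_nanos).

    Python's datetime only has microsecond resolution, so the last three
    digits are dropped here.
    """
    return EPOCH + timedelta(microseconds=raw // 1000)


# The same instant, 2015-01-01T12:10:30.123Z, as each field type would
# expose it in the sort array.
millis_value = 1420114230123
nanos_value = 1420114230123000000

print(interpret_as_millis(millis_value))  # correct reading of a date value
print(interpret_as_nanos(nanos_value))    # correct reading of a date_nanos value
print(interpret_as_nanos(millis_value))   # wrong unit: lands in January 1970
```

Nothing in the raw number says which unit applies, so a client that guesses wrong silently gets a date decades off.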
As a first step, we'd like to add a format option to each sort entry in a search request. Setting a format there would ensure that the corresponding sort values in the response are formatted accordingly:
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "field": "timestamp",
      "format": "strict_date_optional_time_nanos"
    }
  ]
}
The same format would also be used to parse the search_after value, so that copying the sort values directly into search_after continues to work:
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "field": "timestamp",
      "format": "strict_date_optional_time_nanos"
    }
  ],
  "search_after": [
    "2015-01-01T12:10:30.123456789Z"
  ]
}
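The round trip this relies on can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the Elasticsearch implementation: the function names `format_sort_value` and `parse_search_after` are hypothetical, and the parser only handles UTC timestamps of the form shown above (with an optional fractional part), which is a small subset of what strict_date_optional_time_nanos accepts.

```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_sort_value(epoch_nanos: int) -> str:
    """Render nanoseconds since the epoch as an ISO-8601 UTC string."""
    seconds, nanos = divmod(epoch_nanos, NANOS_PER_SECOND)
    base = EPOCH + timedelta(seconds=seconds)
    return base.strftime("%Y-%m-%dT%H:%M:%S") + f".{nanos:09d}Z"


def parse_search_after(value: str) -> int:
    """Parse the string produced above back into nanoseconds since the epoch."""
    date_part, _, fraction = value.rstrip("Z").partition(".")
    base = datetime.strptime(date_part, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    seconds = (base - EPOCH) // timedelta(seconds=1)
    # Right-pad the fraction so "123" means 123 ms, not 123 ns.
    return seconds * NANOS_PER_SECOND + int(fraction.ljust(9, "0"))


raw = 1420114230123456789
formatted = format_sort_value(raw)
print(formatted)
print(parse_search_after(formatted) == raw)
```

Because formatting and parsing use the same format, the client can treat the sort value as an opaque token and no nanosecond precision is lost between pages.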
It would also be nice to apply the field's own formatter by default when no format is specified. That would eliminate the leak of the internal representation entirely, but it would have a larger impact on existing users.