Merged
77 changes: 74 additions & 3 deletions docs/user/interfaces/endpoint.rst
@@ -208,13 +208,13 @@ Explain::
}
}

Cursor
======
Cursor (SQL)
============

Description
-----------

To get paginated response for a query, user needs to provide `fetch_size` parameter as part of normal query. The value of `fetch_size` should be greater than `0`. In absence of `fetch_size` or a value of `0`, it will fallback to non-paginated response. This feature is only available over `jdbc` format for now.
To get a paginated response for a SQL query, provide the `fetch_size` parameter as part of a normal query. The value of `fetch_size` should be greater than `0`. If `fetch_size` is absent or set to `0`, the query falls back to a non-paginated response. This feature is currently available only for the `jdbc` format.

Example
-------
@@ -266,3 +266,74 @@ Result set::
"size": 5,
"status": 200
}

Fetch Size (PPL) [Experimental]
================================

Description
-----------

The ``fetch_size`` parameter limits the number of rows returned in a PPL query response. The value of ``fetch_size`` should be greater than ``0``. If ``fetch_size`` is absent or set to ``0``, the result size is governed by the ``plugins.query.size_limit`` cluster setting.

``fetch_size`` can be specified either as a URL parameter or in the JSON request body. If both are provided, the JSON body value takes precedence.
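
The precedence rule can be pictured with a small Java sketch. It is only an illustration of the documented behavior, not the plugin's code: the class ``FetchSizeResolver`` and its ``resolve`` method are hypothetical names, and treating "absent" as ``null`` and returning ``0`` for "not set" are assumptions made for the example.

```java
/** Hypothetical helper illustrating the documented precedence; not the plugin's code. */
final class FetchSizeResolver {
  private FetchSizeResolver() {}

  /**
   * @param bodyValue fetch_size from the JSON request body, or null if absent
   * @param urlValue raw fetch_size URL parameter, or null if absent
   * @return the value to use; 0 means "not set" (assumption for this sketch)
   */
  static int resolve(Integer bodyValue, String urlValue) {
    if (bodyValue != null) {
      return bodyValue; // the JSON body value takes precedence when both are given
    }
    if (urlValue != null) {
      return Integer.parseInt(urlValue.trim()); // fall back to the URL parameter
    }
    return 0; // neither source supplied a value
  }
}
```

With both sources present, ``FetchSizeResolver.resolve(5, "10")`` yields the body value ``5``.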

If ``fetch_size`` is larger than ``plugins.query.size_limit``, the result is capped at ``plugins.query.size_limit``. The effective number of rows returned is always ``min(fetch_size, plugins.query.size_limit)``.
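
As a worked example of the capping rule, here is a minimal Java sketch. The class ``EffectiveLimit`` and its method are hypothetical, the value ``10000`` used below is only an illustrative ``plugins.query.size_limit``, and rejecting negative values is an assumption, since the text only states that ``fetch_size`` should be greater than ``0``.

```java
/** Hypothetical illustration of min(fetch_size, plugins.query.size_limit). */
final class EffectiveLimit {
  private EffectiveLimit() {}

  /**
   * @param fetchSize requested fetch_size; 0 stands for "absent or 0"
   * @param sizeLimit value of plugins.query.size_limit
   * @return upper bound on the number of rows in the response
   */
  static int effectiveRows(int fetchSize, int sizeLimit) {
    if (fetchSize < 0) {
      // Assumption: how negative values are handled is not described in the source.
      throw new IllegalArgumentException("fetch_size must not be negative");
    }
    if (fetchSize == 0) {
      return sizeLimit; // no fetch_size: the cluster setting governs
    }
    return Math.min(fetchSize, sizeLimit); // fetch_size is capped by the setting
  }
}
```

For instance, ``effectiveRows(5, 10000)`` gives ``5``, while ``effectiveRows(20000, 10000)`` is capped at ``10000``. The result is an upper bound: a query that matches fewer rows returns fewer.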

Note
----

Unlike SQL's ``fetch_size`` which enables cursor-based pagination, PPL's ``fetch_size`` does not return a cursor and does not support fetching additional pages. The response is always complete and final.

+--------------------+-------------------------------------+------------------------------------+
| Aspect             | SQL ``fetch_size``                  | PPL ``fetch_size``                 |
+====================+=====================================+====================================+
| Purpose            | Cursor-based pagination             | Response size limiting             |
+--------------------+-------------------------------------+------------------------------------+
| Returns cursor?    | Yes                                 | No                                 |
+--------------------+-------------------------------------+------------------------------------+
| Can fetch more?    | Yes (with cursor)                   | No (single response)               |
+--------------------+-------------------------------------+------------------------------------+

Example 1: JSON body
--------------------

PPL query::

    >> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_ppl -d '{
      "fetch_size" : 5,
      "query" : "source = accounts | fields firstname, lastname | where age > 20"
    }'

Review comment (Collaborator), on the "fetch_size" : 5 line:

What if fetch_size is larger than plugins.query.size_limit? Which one takes effect?

Reply (Contributor Author):

Our implementation of the PPL fetch_size API essentially appends a HEAD command to the end of the query. If fetch_size is larger than plugins.query.size_limit, it behaves the same as a HEAD command larger than plugins.query.size_limit, so the result follows the cap set by plugins.query.size_limit.

Example 2: URL parameter
------------------------

PPL query::

    >> curl -H 'Content-Type: application/json' -X POST localhost:9200/_plugins/_ppl?fetch_size=5 -d '{
      "query" : "source = accounts | fields firstname, lastname | where age > 20"
    }'

Result set::

    {
      "schema": [
        {
          "name": "firstname",
          "type": "text"
        },
        {
          "name": "lastname",
          "type": "text"
        }
      ],
      "total": 5,
      "datarows": [
        ["Cherry", "Carey"],
        ["Lindsey", "Hawkins"],
        ["Sargent", "Powers"],
        ["Campos", "Olsen"],
        ["Savannah", "Kirby"]
      ],
      "size": 5,
      "status": 200
    }
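
For readers calling the endpoint from Java rather than curl, the URL-parameter form of Example 2 could be built roughly as follows with the JDK's ``java.net.http`` API (Java 11+). This is a sketch under assumptions: the class ``PplFetchSizeRequest`` is a hypothetical name, the base URL is the ``localhost:9200`` used in the examples with an ``http`` scheme assumed, and the JSON escaping covers only backslashes and double quotes.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for a PPL request that passes fetch_size as a URL parameter. */
final class PplFetchSizeRequest {
  private PplFetchSizeRequest() {}

  static HttpRequest build(String baseUrl, String pplQuery, int fetchSize) {
    // Minimal escaping so the query can sit inside a JSON string literal.
    String escaped = pplQuery.replace("\\", "\\\\").replace("\"", "\\\"");
    String body = "{\"query\": \"" + escaped + "\"}";
    URI uri = URI.create(baseUrl + "/_plugins/_ppl?fetch_size=" + fetchSize);
    return HttpRequest.newBuilder(uri)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
  }
}
```

The request only describes the call. Sending it requires a running cluster and something like ``HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())``.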
2 changes: 1 addition & 1 deletion docs/user/ppl/limitations/limitations.md
@@ -62,7 +62,7 @@ For the following functionalities, the query will be forwarded to the V2 query e
* ML
* Kmeans
* `show datasources` and command
* Commands with `fetch_size` parameter
* SQL queries with `fetch_size` parameter (cursor-based pagination). Note: PPL's `fetch_size` (response size limiting, no cursor) is supported in Calcite Engine.


## Malformed Field Names in Object Fields
@@ -5,6 +5,7 @@

package org.opensearch.sql.calcite.remote;

import static org.opensearch.sql.legacy.TestUtils.getResponseBody;
import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_ACCOUNT;
import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_ALIAS;
import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_BANK;
@@ -25,8 +26,12 @@

import java.io.IOException;
import java.util.Locale;
import org.junit.Assert;
import org.junit.Ignore;
import org.junit.Test;
import org.opensearch.client.Request;
import org.opensearch.client.RequestOptions;
import org.opensearch.client.Response;
import org.opensearch.sql.ast.statement.ExplainMode;
import org.opensearch.sql.common.setting.Settings;
import org.opensearch.sql.common.setting.Settings.Key;
@@ -2497,4 +2502,68 @@ public void testExplainMvCombine() throws IOException {
    String expected = loadExpectedPlan("explain_mvcombine.yaml");
    assertYamlEqualsIgnoreId(expected, actual);
  }

  // ==================== fetch_size explain tests ====================

  @Test
  public void testExplainFetchSizePushDown() throws IOException {
    // fetch_size=5 injects Head(5, 0) on top of the plan
    // Logical plan: LogicalSort(fetch=[5]) wraps the Project
    String expected = loadExpectedPlan("explain_fetch_size_push.yaml");
    assertYamlEqualsIgnoreId(
        expected,
        explainQueryWithFetchSizeYaml(
            String.format("source=%s | fields age", TEST_INDEX_ACCOUNT), 5));
  }

  @Test
  public void testExplainFetchSizeWithSmallerHead() throws IOException {
    // fetch_size=10 with user's | head 3
    // Two LogicalSort nodes: inner fetch=[3] from user head, outer fetch=[10] from fetch_size
    // Effective limit = min(3, 10) = 3
    String expected = loadExpectedPlan("explain_fetch_size_with_head_push.yaml");
    assertYamlEqualsIgnoreId(
        expected,
        explainQueryWithFetchSizeYaml(
            String.format("source=%s | head 3 | fields age", TEST_INDEX_ACCOUNT), 10));
  }

  @Test
  public void testExplainFetchSizeSmallerThanHead() throws IOException {
    // fetch_size=5 with user's | head 100
    // Two LogicalSort nodes: inner fetch=[100] from user head, outer fetch=[5] from fetch_size
    // Effective limit = min(100, 5) = 5
    String expected = loadExpectedPlan("explain_fetch_size_smaller_than_head_push.yaml");
    assertYamlEqualsIgnoreId(
        expected,
        explainQueryWithFetchSizeYaml(
            String.format("source=%s | head 100 | fields age", TEST_INDEX_ACCOUNT), 5));
  }

  /**
   * Send an explain request with fetch_size in the JSON body and return YAML output.
   *
   * @param query the PPL query string
   * @param fetchSize the fetch_size parameter value
   * @return the explain output as YAML string
   */
  private String explainQueryWithFetchSizeYaml(String query, int fetchSize) throws IOException {
    Request request =
        new Request(
            "POST",
            String.format(
                "/_plugins/_ppl/_explain?format=%s&mode=%s", Format.YAML, ExplainMode.STANDARD));
    String jsonBody =
        String.format(
            Locale.ROOT, "{\n  \"query\": \"%s\",\n  \"fetch_size\": %d\n}", query, fetchSize);
    request.setJsonEntity(jsonBody);

    RequestOptions.Builder restOptionsBuilder = RequestOptions.DEFAULT.toBuilder();
    restOptionsBuilder.addHeader("Content-Type", "application/json");
    request.setOptions(restOptionsBuilder);

    Response response = client().performRequest(request);
    Assert.assertEquals(200, response.getStatusLine().getStatusCode());
    return getResponseBody(response, true);
  }
}
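
The test comments above state that a user's `head N` and the injected fetch_size limit stack, so the effective limit is the smaller of the two. The same arithmetic can be checked with a small analogy using `java.util.stream`. This is not Calcite or plugin code: the class `StackedLimits` is a hypothetical name, and a stream of integers stands in for the index rows.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical analogy: two stacked limits behave like min(headLimit, fetchSize). */
final class StackedLimits {
  private StackedLimits() {}

  static List<Integer> apply(int totalRows, int headLimit, int fetchSize) {
    return IntStream.range(0, totalRows)
        .boxed()
        .limit(headLimit) // inner limit, analogous to the user's "| head N"
        .limit(fetchSize) // outer limit, analogous to the limit injected for fetch_size
        .collect(Collectors.toList());
  }
}
```

With 1000 input rows, `apply(1000, 3, 10)` keeps 3 rows and `apply(1000, 100, 5)` keeps 5, matching the `min(3, 10)` and `min(100, 5)` cases in the tests.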