Skip to content

[BUG] GROUP BY Missing field returns error #743

@deathjoin

Description

@deathjoin

What is the bug?
When doing GROUP BY field that is missing in some documents OS returns error

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Create index
PUT test-nulls
  1. Put some docs one with missing field
POST _bulk
{ "create": { "_index": "test-nulls", "_id": "1" } }
{ "type": "first", "message": "Hello" }
{ "create": { "_index": "test-nulls", "_id": "2" } }
{ "type": "first", "message": "Hello" }
{ "create": { "_index": "test-nulls", "_id": "3" } }
{ "type": "second", "message": "World" }
{ "create": { "_index": "test-nulls", "_id": "4" } }
{ "message": "Whatever" }
  1. Run SQL (fetch_size isn't important)
POST /_plugins/_sql
{
    "query": "SELECT type, COUNT(*) FROM test-nulls GROUP BY type"
    "fetch_size": 2000
}
  1. See error
{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "invalid value operation on MISSING_VALUE",
    "type": "IllegalStateException"
  },
  "status": 500
}

What is the expected behavior?
I expect missing field would be treated like null and response would be

{
  "columns": [
    { "name": "type", "type": "text" },
    { "name": "COUNT(*)", "type": "long" }
  ],
  "rows": [
    [null, 1],
    ["first", 2],
    ["second", 1]
  ]
}

What is your host/environment?

  • OS: MacOS (Docker compose example from official docs)
  • Version Opensearch 2.1.0
  • Plugins default

Do you have any screenshots?
No

Do you have any additional context?
When selecting without group by value returned as null:

POST /_plugins/_sql
{
    "query": """
SELECT type,message FROM test-nulls
    """
}

Response:

{
  "schema": [
    {
      "name": "type",
      "type": "text"
    },
    {
      "name": "message",
      "type": "text"
    }
  ],
  "total": 4,
  "datarows": [
    [
      "first",
      "Hello"
    ],
    [
      "first",
      "Hello"
    ],
    [
      "second",
      "World"
    ],
    [
      null,
      "Whatever"
    ]
  ],
  "size": 4,
  "status": 200
}

Also node logs:

[2022-08-10T13:22:45,770][INFO ][o.o.s.l.p.RestSqlAction  ] [opensearch-node1] [c71f8fb0-9e9e-45f6-9a84-28ee0a5c4ab2] Incoming request /_plugins/_sql?pretty=true: ( SELECT identifier, COUNT(*) FROM table GROUP BY identifier )
[2022-08-10T13:22:45,818][WARN ][stderr                   ] [opensearch-node1] line 2:7 mismatched input 'type' expecting {'ALL', 'CASE', 'CAST', 'DISTINCT', 'FALSE', 'FIRST', 'LAST', 'LEFT', 'MATCH', 'NOT', 'NULL', 'RIGHT', 'TRUE', 'AVG', 'COUNT', 'MAX', 'MIN', 'SUM', 'VAR_POP', 'VAR_SAMP', 'VARIANCE', 'STD', 'STDDEV', 'STDDEV_POP', 'STDDEV_SAMP', 'SUBSTRING', 'TRIM', 'FULL', 'INTERVAL', 'MICROSECOND', 'SECOND', 'MINUTE', 'HOUR', 'DAY', 'WEEK', 'MONTH', 'QUARTER', 'YEAR', 'ABS', 'ACOS', 'ASCII', 'ASIN', 'ATAN', 'ATAN2', 'CEIL', 'CEILING', 'CONCAT', 'CONCAT_WS', 'CONV', 'COS', 'COT', 'CRC32', 'DATE', 'DATE_FORMAT', 'DATE_ADD', 'DATE_SUB', 'DAYOFMONTH', 'DAYOFWEEK', 'DAYOFYEAR', 'DAYNAME', 'DEGREES', 'E', 'EXP', 'FLOOR', 'FROM_DAYS', 'IF', 'IFNULL', 'ISNULL', 'LENGTH', 'LN', 'LOCATE', 'LOG', 'LOG10', 'LOG2', 'LOWER', 'LTRIM', 'MONTHNAME', 'NULLIF', 'PI', 'POW', 'POWER', 'RADIANS', 'RAND', 'REPLACE', 'ROUND', 'RTRIM', 'SIGN', 'SIN', 'SQRT', 'SUBDATE', 'TAN', 'TIME', 'TIME_TO_SEC', 'TIMESTAMP', 'TRUNCATE', 'TO_DAYS', 'UPPER', 'D', 'T', 'TS', 'DENSE_RANK', 'RANK', 'ROW_NUMBER', 'FIELD', 'MATCHPHRASE', 'MATCH_PHRASE', 'SIMPLE_QUERY_STRING', 'MULTI_MATCH', 'SUBSTR', 'STRCMP', 'ADDDATE', '*', '+', '-', 'MOD', '.', '(', '0', '1', '2', STRING_LITERAL, DECIMAL_LITERAL, REAL_LITERAL, ID, DOUBLE_QUOTE_ID, BACKTICK_QUOTE_ID}
[2022-08-10T13:22:45,833][WARN ][o.o.s.l.e.f.PrettyFormatRestExecutor] [opensearch-node1] Error happened in pretty formatter
java.lang.IllegalStateException: invalid value operation on MISSING_VALUE
	at org.opensearch.sql.legacy.expression.model.ExprValue.value(ExprValue.java:14) ~[legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.format.BindingTupleResultSet.lambda$buildDataRows$1(BindingTupleResultSet.java:49) ~[legacy-2.1.0.0.jar:?]
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[?:?]
	at org.opensearch.sql.legacy.executor.format.BindingTupleResultSet.buildDataRows(BindingTupleResultSet.java:56) ~[legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.format.BindingTupleResultSet.<init>(BindingTupleResultSet.java:28) ~[legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.format.Protocol.loadResultSet(Protocol.java:87) ~[legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.format.Protocol.<init>(Protocol.java:62) ~[legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.format.PrettyFormatRestExecutor.execute(PrettyFormatRestExecutor.java:70) [legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.format.PrettyFormatRestExecutor.execute(PrettyFormatRestExecutor.java:43) [legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.AsyncRestExecutor.doExecuteWithTimeMeasured(AsyncRestExecutor.java:150) [legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.executor.AsyncRestExecutor.lambda$async$1(AsyncRestExecutor.java:110) [legacy-2.1.0.0.jar:?]
	at org.opensearch.sql.legacy.utils.LogUtils.lambda$withCurrentContext$1(LogUtils.java:59) [legacy-2.1.0.0.jar:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:739) [opensearch-2.1.0.jar:2.1.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]

Metadata

Metadata

Assignees

Labels

SQLenhancementNew feature or requestlegacyIssues related to legacy query engine to be deprecatedv2.5.0'Issues and PRs related to version v2.5.0'

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions