Reduce level of ESQL results warnings#142112
Conversation
Failing to store the results of an ESQL search does not in itself indicate that the system is running in a degraded state nor that an operator needs to be paged, so this commit reduces these log messages to level `WARN` instead.
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
alex-spies
left a comment
There was a problem hiding this comment.
I had a look at the logs in Serverless that we emitted here, and in the last 7 days saw only CBEs and results too big for storage resulting in the error logs here. (Examples below)
Based on this, I agree that logging at ERROR gives us false positives and logging at WARN is more appropriate.
That said, we'll also lose visibility on potential problems that are not expected.
Maybe we should only WARN for CBEs and RemoteTransportException (casued by EsRejectedExecutionException), but keep the ERROR level otherwise?
@nik9000 , what do you think?
Examples:
org.elasticsearch.transport.RemoteTransportException: [es-es-index-...][...][indices:data/write/bulk[s]]
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: Request contains an operation of size [337859328] bytes, which exceeds the maximum allowed limit of [209715200] bytes
"org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<reused_arrays>] would be [4044065696/3.7gb], which is larger than the limit of [4032403865/3.7gb], real usage: [4044049312/3.7gb], new bytes reserved: [16384/16kb], usages [inflight_requests=0/0b, model_inference=0/0b, eql_sequence=0/0b, fielddata=337/337b, request=630765612/601.5mb]; for more information, see https://www.elastic.co/docs/troubleshoot/elasticsearch/circuit-breaker-errors?version=current
at org.elasticsearch.server@9.4.0/org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:434)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:126)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.util.BigArrays.adjustBreaker(BigArrays.java:526)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.util.BigArrays.resizeInPlace(BigArrays.java:576)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.util.BigArrays.resize(BigArrays.java:637)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.util.BigArrays.grow(BigArrays.java:655)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.BytesStreamOutput.ensureCapacity(BytesStreamOutput.java:215)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.BytesStreamOutput.writeBytes(BytesStreamOutput.java:105)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.write(StreamOutput.java:501)
at org.elasticsearch.base@9.4.0/org.elasticsearch.core.Streams$1.write(Streams.java:150)
at java.base/java.util.Base64$EncOutputStream.write(Base64.java:977)
at org.elasticsearch.base@9.4.0/org.elasticsearch.core.Streams$1.write(Streams.java:150)
at java.base/java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:342)
at java.base/java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:273)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.BufferedStreamOutput.writeBytesOverflow(BufferedStreamOutput.java:121)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.BufferedStreamOutput.writeBytes(BufferedStreamOutput.java:106)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.write(StreamOutput.java:501)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.writeBytesRef(StreamOutput.java:211)
at org.elasticsearch.compute.data.BytesRefVector.writeTo(BytesRefVector.java:127)
at org.elasticsearch.compute.data.BytesRefBlock.writeTo(BytesRefBlock.java:133)
at org.elasticsearch.compute.data.Block.writeTypedBlock(Block.java:494)
at org.elasticsearch.compute.data.Page.writeTo(Page.java:123)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.writeWriteable(StreamOutput.java:1050)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.writeCollection(StreamOutput.java:1131)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.writeCollection(StreamOutput.java:1121)
at org.elasticsearch.xpack.esql.action.EsqlQueryResponse.writeTo(EsqlQueryResponse.java:195)
at org.elasticsearch.server@9.4.0/org.elasticsearch.common.io.stream.StreamOutput.writeOptionalWriteable(StreamOutput.java:1024)
at org.elasticsearch.xcore@9.4.0/org.elasticsearch.xpack.core.async.StoredAsyncResponse.writeTo(StoredAsyncResponse.java:64)
at org.elasticsearch.xcore@9.4.0/org.elasticsearch.xpack.core.async.AsyncTaskIndexService.lambda$addResultFieldAndFinish$3(AsyncTaskIndexService.java:331)
at org.elasticsearch.xcontent@9.4.0/org.elasticsearch.xcontent.XContentBuilder.lambda$directFieldAsBase64$24(XContentBuilder.java:1266)
at org.elasticsearch.xcontent.impl@9.4.0/org.elasticsearch.xcontent.provider.json.JsonXContentGenerator.writeDirectField(JsonXContentGenerator.java:583)
at org.elasticsearch.xcontent@9.4.0/org.elasticsearch.xcontent.XContentBuilder.directFieldAsBase64(XContentBuilder.java:1260)
at org.elasticsearch.xcore@9.4.0/org.elasticsearch.xpack.core.async.AsyncTaskIndexService.addResultFieldAndFinish(AsyncTaskIndexService.java:324)
at org.elasticsearch.xcore@9.4.0/org.elasticsearch.xpack.core.async.AsyncTaskIndexService.indexResponse(AsyncTaskIndexService.java:265)
at org.elasticsearch.xcore@9.4.0/org.elasticsearch.xpack.core.async.AsyncTaskIndexService.createResponseForEQL(AsyncTaskIndexService.java:227)
at org.elasticsearch.xpack.esql.core.async.AsyncTaskManagementService.storeResults(AsyncTaskManagementService.java:273)
at org.elasticsearch.xpack.esql.core.async.AsyncTaskManagementService.lambda$wrapStoringListener$3(AsyncTaskManagementService.java:238)
at org.elasticsearch.server@9.4.0/org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:258)
at org.elasticsearch.xpack.esql.plugin.TransportEsqlQueryAction.lambda$innerExecute$8(TransportEsqlQueryAction.java:298)
at org.elasticsearch.server@9.4.0/org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:258)
at org.elasticsearch.xpack.esql.execution.PlanExecutor.onQuerySuccess(PlanExecutor.java:121)
at org.elasticsearch.xpack.esql.execution.PlanExecutor.lambda$esql$0(PlanExecutor.java:105)
at org.elasticsearch.server@9.4.0/org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:258)
...
|
It's a fair point Alex, we could indeed just reduce the logging level for expected exceptions. IMO that's ones which unwrap to a Your stack trace suggested another change in this area, not sure how one might go about benchmarking it or anything but I'm throwing it out there anyway: #142290 |
alex-spies
left a comment
There was a problem hiding this comment.
Ok, let's get this in as-is to reduce the noise we see in Serverless.
I'll create a follow-up issue to refine the logging here based on the status. We're doing the same in #142330 for logging genuine errors thrown in the ESQL Driver class.
|
Follow-up issue: #142374 |
I feel much better about quieting down things we understand rather than the catch all.
That seems like a wonderful list. |
Failing to store the results of an ESQL search does not in itself
indicate that the system is running in a degraded state nor that an
operator needs to be paged, so this commit reduces these log messages to
level
WARNinstead.