Trim the JSON source in indexing slow logs#38081
Conversation
|
Pinging @elastic/es-core-infra |
danielmitterdorfer
left a comment
There was a problem hiding this comment.
I left a comment. Based on the PR description I also could not grasp the purpose of this. It only became apparent after reading the related issue. Could you please reword the PR description as well so this becomes clearer?
| try { | ||
| String source = XContentHelper.convertToJson(doc.source(), reformat, doc.getXContentType()); | ||
| sb.append(", source[").append(Strings.cleanTruncate(source, maxSourceCharsToLog)).append("]"); | ||
| sb.append(", source[").append(Strings.cleanTruncate(source, maxSourceCharsToLog).trim()).append("]"); |
There was a problem hiding this comment.
I wonder whether it is possible to create source without any leading line breaks so we can avoid the trim() at all?
There was a problem hiding this comment.
once you mentioned that I started to investigate how this could even happen.. It turns out that only when reformat=false then the code is just logging whatever data user provided.
So the trim is necessary.
Also I confirmed that this can only happen for the IndexingSLowLog. For search slow log we repackage user's request into our own object SearchRequest.
danielmitterdorfer
left a comment
There was a problem hiding this comment.
The changes themselves look fine but I think we should not reformat unrelated changes here so I left a bunch of comments around that.
| .put(IndexMetaData.SETTING_NUMBER_OF_REPLICAS, 1) | ||
| .put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1) | ||
| .put(indexSettings) | ||
| .build(); |
There was a problem hiding this comment.
That looks a bit odd to me and also unrelated to this change? Can you please restore the old formatting?
There was a problem hiding this comment.
I reverted back to previous formatting. must have done this by accident. sorry
| final IllegalArgumentException cause = (IllegalArgumentException) e.getCause(); | ||
| final String causeExpected = | ||
| "failed to parse setting [" + key + "] with value [NOT A TIME VALUE] as a time value: unit is missing or unrecognized"; | ||
| "failed to parse setting [" + key + "] with value [NOT A TIME VALUE] as a time value: unit is missing or unrecognized"; |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| settings.updateIndexMetaData(newIndexMeta("index", | ||
| Settings.builder().put(IndexingSlowLog.INDEX_INDEXING_SLOWLOG_THRESHOLD_INDEX_WARN_SETTING.getKey(), "NOT A TIME VALUE") | ||
| .build())); | ||
| .build())); |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| settings.updateIndexMetaData(newIndexMeta("index", | ||
| Settings.builder().put(IndexingSlowLog.INDEX_INDEXING_SLOWLOG_THRESHOLD_INDEX_INFO_SETTING.getKey(), "NOT A TIME VALUE") | ||
| .build())); | ||
| .build())); |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| settings.updateIndexMetaData(newIndexMeta("index", | ||
| Settings.builder().put(IndexingSlowLog.INDEX_INDEXING_SLOWLOG_THRESHOLD_INDEX_DEBUG_SETTING.getKey(), "NOT A TIME VALUE") | ||
| .build())); | ||
| .build())); |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| metaData = newIndexMeta("index", Settings.builder() | ||
| .put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT) | ||
| .build()); | ||
| .put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT) |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| .put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT) | ||
| .put(IndexingSlowLog.INDEX_INDEXING_SLOWLOG_REFORMAT_SETTING.getKey(), false) | ||
| .build()); | ||
| .put(IndexMetaData.SETTING_VERSION_CREATED, Version.CURRENT) |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| assertThat(e.getCause(), hasToString(containsString("Unrecognized token 'invalid':" | ||
| + " was expecting ('true', 'false' or 'null')\n" | ||
| + " at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper"))); | ||
| + " was expecting ('true', 'false' or 'null')\n" |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| ParsedDocument pd = new ParsedDocument(new NumericDocValuesField("version", 1), | ||
| SeqNoFieldMapper.SequenceIDFields.emptySeqID(), "id", | ||
| "test", null, null, source, XContentType.JSON, null); | ||
| "test", null, null, source, XContentType.JSON, null); |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
| public void testSlowLogParsedDocumentPrinterSourceToLog() throws IOException { | ||
| BytesReference source = BytesReference.bytes(JsonXContent.contentBuilder() | ||
| .startObject().field("foo", "bar").endObject()); | ||
| .startObject().field("foo", "bar").endObject()); |
There was a problem hiding this comment.
That looks unrelated to this change? Can you please restore the old formatting?
danielmitterdorfer
left a comment
There was a problem hiding this comment.
Thanks for iterating. LGTM
The '{' as a first character in log line is causing problems for beats when parsing plaintext logs. This can happen if the submitted document has an additional '\n' at the beginning and we are not reformatting.
Trimming the source part of a SlogLog solves that and keeps the logs readable.
closes elastic#38080
The '{' as a first character in log line is causing problems for beats when parsing plaintext logs. This can happen if the submitted document has an additional '\n' at the beginning and we are not reformatting.
Trimming the source part of a SlogLog solves that and keeps the logs readable.
relates #38080
backport #38081
The '{' as a first character in log line is causing problems for beats when parsing plaintext logs.
This is used to distinguish if the logs are in JSON or plaintext (they want to keep the logic consistent for docker and standalone deployments, so file extension can not be used).
When the JSON in source element is not reformatted (
index.indexing.slowlog.reformat: false) and the user send a document with an additional\nlike here:Then the
{will endup in a new lineTrimming the source part of a SlowLog will solve the problem and will keep the logs readable.
This won't happen for Search slow logs, as we repackage the user's request into our own
SearchRequestobject.closes #38080