Add query param include_source_on_error for ingest requests #120725
Conversation
…rning the source in case of parsing errors (JSON).
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
|
Hi @mosche, I've created a changelog YAML for you. |
|
Where and how should this be documented? |
prdoyle
left a comment
There was a problem hiding this comment.
I'm a little concerned that the default here is to leak potentially sensitive information, but perhaps that is unavoidable due to legacy compatibility?
Fair point 👍 Maintaining the current default, which is including the source, was an explicit requirement of the ticket based on discussions in our team sync. |
|
Wondering, should |
| BytesReference source, | ||
| XContentType xContentType, | ||
| String routing, | ||
| Map<String, String> dynamicTemplates |
There was a problem hiding this comment.
Similar to the one above, mostly to enforce the correct default flag.
Slowly wondering, includeSourceOnError should probably better be passed along as nullable Boolean. That way the default can be handled in a single place only. Currently the default is all over the place.
As implemented in this PR, yes, we need to add this to the YAML spec for every endpoint whose request may have a body. We could change the implementation slightly and make this parameter truly common to all endpoints, and then we could call it a "common" parameter and document it alongside
I don't think that's possible, the YAML tests should be translated to all sorts of input formats (CBOR, SMILE, ...) not just JSON. But that's not really the point of the YAML tests: these tests are more so that other client implementations can run without needing to run any Java code. So we should have some YAML tests that set this parameter if only to verify that it's accepted (and understood by other clients). |
|
Thanks @DaveCTurner, I've updated the specs. I'll follow up with a separate PR to also use the param in YAML tests. Regarding making this "common", I've decided against that for now. Considering that this is meant to be used in sensitive use cases, I'd rather be sure the parameter is also correctly consumed when provided. That's not always obvious and should better be checked for every endpoint when the behavior is required. |
I don't think this PR achieves that. There's loads of REST handlers which call |
Yes, exactly, that's what I meant to say. Goal of this PR isn't to support the param everywhere, this is why it should not be documented / handled as common query parameter. The usage of But granted, other actions might accept the parameter if using |
Right yeah I think if we're not going to support it everywhere we should be much more selective about where we do support it. Currently this PR adds the extra parameter (but no testing or docs) to over 100 endpoints. |
include_source_on_error include_source_on_error for ingest requests
| final FilterPath[] includes; | ||
| final FilterPath[] excludes; | ||
| final boolean filtersMatchFieldNamesWithDots; | ||
| final Boolean includeSourceOnError; |
There was a problem hiding this comment.
IMO a nullable Boolean is pretty confusing. We should always know whether we want to include source on errors, but null begs the question "what do we do in the null case". Since that case is only for bwc, it's more clear to set that default in the bwc case when constructing from StreamInput.
There was a problem hiding this comment.
Since that case is only for bwc, it's more clear to set that default in the bwc case when constructing from StreamInput.
I don't think bwc is the primary concern. If it was just for reading from StreamInput I'd absolutely agree.
Both BulkRequest and IndexRequest have various ctors (and their usage is not always immediately obvious).
In any case, due to these additional ctors, even if moving the bwc fallback into the StreamInput ctor (using boolean), the field would still require inline initialisation so we don't ever default to false.
To me, the motivation for a nullable Boolean (true / false / undefined, use default) was mostly to have a single consistent definition of that default. But I agree, that can easily be more confusing that helpful. I'll remove that commit to revert back to boolean
|
@rjernst I'll merge with previously discussed change reverted, let me know if there's anything to follow up on. |
💔 Backport failed
You can use sqren/backport to manually backport by running |
By default ES will include the document source in the error message in case of parsing errors.
While this is useful for investigating the issue, it might not be acceptable in certain cases, e.g. to prevent sensitive data being captured in logs.
include_source_on_errorallows to disable the default behavior of including the source in error messages.This is supported for everything using the default RestRequest content parser and specifically also for:
Relates to ES-9186