fix(mapper): handle VOID entries in _ignored_source with FLS #141506

Merged
salvatore-campagna merged 12 commits into elastic:main from
salvatore-campagna:fix/fls-void-entries-exception-synthetic-source
Feb 3, 2026

@salvatore-campagna
Contributor

Problem

When Field Level Security (FLS) is active on an index using synthetic source, any field with a copy_to target causes search requests to fail with:

[-1:12] Unexpected Break (0xFF) token in definite length (-1) Object at [Source: (byte[])[12 bytes]; byte offset: #12]

The failure path is:

  1. During indexing, copy_to target fields are recorded in _ignored_source as VOID entries (XContentDataHelper.voidValue()): markers indicating the field exists but its data is stored elsewhere (in the copy_to source field).
  2. When a query runs with FLS enabled, FieldSubsetReader iterates over _ignored_source stored fields and calls IgnoredSourceFieldMapper.decodeAsMap for each entry to decide which fields to keep or strip based on FLS access permissions.
  3. decodeAsMap builds a CBOR object containing the field name but writes no value for VOID entries (XContentDataHelper.decodeAndWrite is a no-op). When XContentHelper.convertToMap tries to parse this malformed CBOR, the Jackson CBOR parser encounters an unexpected Break token (0xFF) and throws.
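
To make step 3 concrete, here is a minimal sketch of why a key with no value is malformed CBOR. It does not use Jackson or any Elasticsearch class. The TinyCborMapReader class, its helper methods, and the byte layout (an indefinite-length map, 0xBF ... 0xFF, holding short text strings) are assumptions chosen for illustration; the bytes that decodeAsMap actually produced may look different.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, illustration-only reader; not the Jackson CBOR parser.
public class TinyCborMapReader {
    private static final int MAP_START_INDEFINITE = 0xBF; // major type 5, indefinite length
    private static final int BREAK = 0xFF;                 // "break" stop code

    /** Reads an indefinite-length CBOR map whose keys and values are short text strings. */
    public static Map<String, String> readMap(byte[] bytes) {
        int[] pos = { 0 };
        if (bytes.length == 0 || (bytes[pos[0]++] & 0xFF) != MAP_START_INDEFINITE) {
            throw new IllegalStateException("expected indefinite-length map start (0xBF)");
        }
        Map<String, String> result = new LinkedHashMap<>();
        while (true) {
            if (pos[0] >= bytes.length) {
                throw new IllegalStateException("map is not terminated by a break");
            }
            // A break is legal only where the next key would start.
            if ((bytes[pos[0]] & 0xFF) == BREAK) {
                return result;
            }
            String key = readShortText(bytes, pos);
            // If a break shows up where the value should be, the map is malformed.
            if (pos[0] < bytes.length && (bytes[pos[0]] & 0xFF) == BREAK) {
                throw new IllegalStateException("Unexpected Break (0xFF) token: key '" + key + "' has no value");
            }
            result.put(key, readShortText(bytes, pos));
        }
    }

    // Text string (major type 3) with length 0..23 stored in the initial byte: 0x60 + length.
    private static String readShortText(byte[] bytes, int[] pos) {
        if (pos[0] >= bytes.length) {
            throw new IllegalStateException("unexpected end of input");
        }
        int head = bytes[pos[0]++] & 0xFF;
        if (head < 0x60 || head > 0x77) {
            throw new IllegalStateException("expected short text string, got 0x" + Integer.toHexString(head));
        }
        int length = head - 0x60;
        if (pos[0] + length > bytes.length) {
            throw new IllegalStateException("text string runs past end of input");
        }
        String text = new String(bytes, pos[0], length, StandardCharsets.UTF_8);
        pos[0] += length;
        return text;
    }

    /** {"a": "b"} as an indefinite-length map. */
    public static byte[] wellFormed() {
        return new byte[] { (byte) 0xBF, 0x61, 'a', 0x61, 'b', (byte) 0xFF };
    }

    /** Field name written, value skipped: the shape a VOID entry would produce. */
    public static byte[] keyWithoutValue() {
        return new byte[] { (byte) 0xBF, 0x61, 'a', (byte) 0xFF };
    }
}
```

Reading wellFormed() yields a one-entry map, while keyWithoutValue() fails because the break arrives in the value position. That is the same class of error as the one reported above.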

Fix

The core fix is in nameValueToMapped(), which both encoding paths share. It now checks XContentDataHelper.isDataPresent(nameValue.value()) before attempting to parse, returning null for VOID entries instead of building malformed CBOR.

Both encoding paths call nameValueToMapped() through their respective decodeAsMap methods, so both are protected by the same check. The only difference is how the null propagates:

  1. Coalesced path: CoalescedIgnoredSourceEncoding.decodeAsMap iterates over entries and skips null results, filtering VOID entries out of the returned list.

  2. Legacy path: LegacyIgnoredSourceEncoding.decodeAsMap returns null directly, and filterValue propagates the null to FieldSubsetReader, which drops the entry.

In both cases, VOID entries carry no field data, so omitting them is correct.
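
The shape of the guard and the two propagation paths can be sketched roughly as follows. This is not the code in IgnoredSourceFieldMapper: the VoidEntrySketch class, its record types, the VOID sentinel object, and the simplified signatures are hypothetical stand-ins, and the real VOID encoding lives in XContentDataHelper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the fix; names mirror the PR but the types are stand-ins.
public class VoidEntrySketch {
    /** Assumed sentinel for a VOID entry (the real marker is produced by XContentDataHelper.voidValue()). */
    public static final Object VOID = new Object();

    public record NameValue(String name, Object value) {}

    public record MappedNameValue(String name, Object parsedValue) {}

    static boolean isDataPresent(Object value) {
        return value != VOID;
    }

    /** Shared by both paths: returns null for VOID entries instead of trying to parse them. */
    public static MappedNameValue nameValueToMapped(NameValue nameValue) {
        if (isDataPresent(nameValue.value()) == false) {
            return null;
        }
        return new MappedNameValue(nameValue.name(), nameValue.value());
    }

    /** Coalesced-style path: null results are skipped, so VOID entries vanish from the list. */
    public static List<MappedNameValue> decodeCoalesced(List<NameValue> entries) {
        List<MappedNameValue> mapped = new ArrayList<>();
        for (NameValue entry : entries) {
            MappedNameValue result = nameValueToMapped(entry);
            if (result != null) {
                mapped.add(result);
            }
        }
        return mapped;
    }

    /** Legacy-style path: the null is returned as-is and the caller drops the entry. */
    public static MappedNameValue decodeLegacy(NameValue entry) {
        return nameValueToMapped(entry);
    }
}
```

With one VOID entry and one regular entry, decodeCoalesced returns a single-element list and decodeLegacy returns null for the VOID entry, which matches the behaviour described for the two encodings.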

Tests

  • testCoalescedDecodeAsMapReturnsNullForVoidEntry: unit test verifying CoalescedIgnoredSourceEncoding.decodeAsMap filters out VOID entries, returning an empty list.
  • testLegacyDecodeAsMapReturnsNullForVoidEntry: unit test verifying LegacyIgnoredSourceEncoding.decodeAsMap returns null for a VOID entry.
  • testSyntheticSourceWithCopyToAndFLSCoalesced: integration test exercising FieldSubsetReader with copy_to, synthetic source, and FLS using the coalesced encoding format.
  • testSyntheticSourceWithCopyToAndFLSLegacy: same integration test but using the legacy encoding format (pre-IGNORED_SOURCE_COALESCED_ENTRIES_WITH_FF).
  • YAML REST tests: two end-to-end scenarios, one with API key FLS + synthetic source + copy_to and one covering the skip_ignored_source_read workaround.
./gradlew :server:test --tests "org.elasticsearch.index.mapper.IgnoredSourceFieldMapperTests.testCoalescedDecodeAsMapReturnsNullForVoidEntry"
./gradlew :server:test --tests "org.elasticsearch.index.mapper.IgnoredSourceFieldMapperTests.testLegacyDecodeAsMapReturnsNullForVoidEntry"
./gradlew :x-pack:plugin:core:test --tests "org.elasticsearch.xpack.core.security.authz.accesscontrol.FieldSubsetReaderTests.testSyntheticSourceWithCopyToAndFLS*"
./gradlew :x-pack:plugin:yamlRestTest -Dtests.method="test {p0=security/authz_api_keys/30_field_level_security_synthetic_source/*copy_to*}"

When Field Level Security (FLS) processes _ignored_source entries from
documents with synthetic source enabled, it calls decodeAsMap() to parse
and potentially filter field values. However, copy_to targets are
recorded in _ignored_source as VOID entries marking the field as
existing but stored elsewhere (the copy_to source field).

Previously, decodeAsMap() would attempt to parse these VOID entries,
building a CBOR object with a field name but no value. When
convertToMap tried to parse this malformed CBOR, the Jackson parser
encountered an unexpected Break token (0xFF) and threw.

The core fix is in nameValueToMapped(), which both encoding paths
share. It now checks isDataPresent() before attempting to parse,
returning null for VOID entries instead of building malformed CBOR.

Both encoding paths call nameValueToMapped() through their respective
decodeAsMap methods, so both are protected by the same check. The only
difference is how the null propagates:

1. Coalesced path: decodeAsMap iterates over entries and skips null
   results, filtering VOID entries out of the returned list.

2. Legacy path: decodeAsMap returns null directly, and filterValue
   propagates the null to FieldSubsetReader, which drops the entry.

In both cases, VOID entries carry no field data, so omitting them is
correct.
@salvatore-campagna salvatore-campagna added backport pending backport auto-backport Automatically create backport pull requests when merged v8.19.11 v9.2.5 v9.3.1 labels Jan 29, 2026
@salvatore-campagna salvatore-campagna marked this pull request as ready for review January 29, 2026 11:17
@salvatore-campagna salvatore-campagna added the :StorageEngine/Mapping The storage related side of mappings label Jan 29, 2026
Contributor

@romseygeek romseygeek left a comment

One small suggestion, LGTM otherwise.

}

private static MappedNameValue nameValueToMapped(NameValue nameValue) throws IOException {
if (XContentDataHelper.isDataPresent(nameValue.value()) == false) {

Can this just be nameValue.hasValue() == false?

@salvatore-campagna salvatore-campagna merged commit fea892d into elastic:main Feb 3, 2026
34 of 35 checks passed
salvatore-campagna added a commit to salvatore-campagna/elasticsearch that referenced this pull request Feb 3, 2026
…#141506)

@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
9.3
8.19 Commit could not be cherrypicked due to conflicts
9.2

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 141506

@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 141506

salvatore-campagna added a commit to salvatore-campagna/elasticsearch that referenced this pull request Feb 3, 2026
…#141506)

elasticsearchmachine pushed a commit that referenced this pull request Feb 3, 2026
#141703)

elasticsearchmachine pushed a commit that referenced this pull request Feb 3, 2026
#141705)

lukewhiting pushed a commit to lukewhiting/elasticsearch that referenced this pull request Feb 3, 2026
…#141506)

jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Feb 4, 2026
…#141506)

salvatore-campagna added a commit that referenced this pull request Feb 6, 2026
…141506) (#141716)

Backport note: On 8.19, FieldSubsetReader.java requires an explicit
null check for the return value of decodeAsMap(). On main, this class
was refactored with an IgnoredSourceFormat parameter that handles nulls
through a different code path, so the null check is not needed there.
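
As a rough sketch of what an explicit null check on the caller side can look like: the NullSafeFilterSketch class, the keepVisible method, and the use of plain strings for stored entries are all hypothetical and do not reflect the actual FieldSubsetReader code on 8.19. The decoder is passed in as a function so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical caller-side filter; not the real FieldSubsetReader.
public class NullSafeFilterSketch {
    /**
     * Keeps the stored entries whose decoded field name passes the FLS check.
     * A null from the decoder means "VOID entry, nothing to filter", so the entry is dropped.
     */
    public static List<String> keepVisible(
        List<String> storedEntries,
        Function<String, String> decodeFieldName,
        Predicate<String> fieldAllowed
    ) {
        List<String> kept = new ArrayList<>();
        for (String entry : storedEntries) {
            String fieldName = decodeFieldName.apply(entry);
            if (fieldName == null) {
                continue; // explicit null check: skip VOID entries instead of dereferencing null
            }
            if (fieldAllowed.test(fieldName)) {
                kept.add(entry);
            }
        }
        return kept;
    }
}
```

Without the null check, the call to fieldAllowed.test would receive null for a VOID entry; with it, VOID entries are skipped before any permission check runs.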
salvatore-campagna added a commit to salvatore-campagna/elasticsearch that referenced this pull request Mar 23, 2026
…#141506)

Backport of elastic#141506 to 9.1. When Field Level Security is active on an
index using synthetic source, fields with copy_to targets are recorded
in _ignored_source as VOID entries. FieldSubsetReader attempted to parse
these entries as CBOR, causing search failures with an unexpected Break
token (0xFF).

The fix checks hasValue() in decodeAsMap before parsing and returns null
for VOID entries. FieldSubsetReader now skips null entries gracefully.

Also unmutes the test from elastic#142341.
Labels

auto-backport Automatically create backport pull requests when merged backport pending backport :StorageEngine/Mapping The storage related side of mappings v8.19.11 v9.2.5 v9.3.1 v9.4.0

3 participants