Fix NPE when querying pattern_text field in segment with no field values by parkertimmins · Pull Request #142767 · elastic/elasticsearch

parkertimmins · 2026-02-20T15:30:26Z

This PR fixes 3 separate but related bugs:

Empty segment NPE - This occurs when pattern_text is enabled (disableTemplating=false). When a segment has no documents containing a pattern_text field, PatternTextFallbackDocValues.from() returns null. Both valueFetcher and PatternTextIndexFieldData called methods on this null reference without checking.
Disabled templating NPE - When disableTemplating is true, valueFetcher() and PatternTextIndexFieldData still called PatternTextFallbackDocValues.from() which always returned null (no template_id values), causing NPEs. The blockLoader and getValueFetcherProvider paths already handled it correctly by falling back to binary doc values or stored fields.
BytesRef hex output - When disableTemplating is true, the getValueFetcherProvider() method (used by SourceIntervalsSource for intervals queries) returns a raw BytesRef. SourceIntervalsSource calls .toString() on the returned value, which on a BytesRef produces hex-encoded output (e.g., [66 6f 6f]) instead of the actual text. This caused intervals queries to never match.
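The third bug comes down to how a byte container renders itself as text. The sketch below uses a hypothetical stand-in class, `RawBytes`, which mimics the hex-style toString() output described above and offers a utf8ToString() that decodes the bytes. It is not Lucene's BytesRef, only an illustration of why matching against the toString() output cannot find the original terms.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a byte container such as BytesRef; not the real Lucene class.
final class RawBytes {
    private final byte[] bytes;

    RawBytes(String text) {
        this.bytes = text.getBytes(StandardCharsets.UTF_8);
    }

    // Mimics the hex rendering described in the PR, e.g. "[66 6f 6f]" for "foo".
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(bytes[i] & 0xff));
        }
        return sb.append(']').toString();
    }

    // Decodes the bytes as UTF-8, which is what a text-matching caller needs.
    String utf8ToString() {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

A caller that searches for the term "foo" in `new RawBytes("foo").toString()` sees only hex digits, whereas `utf8ToString()` gives back the text it expects.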

Fixes:

Check if docValues is null in valueFetcher.fetchValues() and PatternTextIndexFieldData
Push the disableTemplating logic into PatternTextFallbackDocValues.from, so that it is applied in all use cases. After this change PatternTextFallbackDocValues.from() handles the two ways pattern_text can fall back to flat binary doc values or stored fields: 1) as a whole column because disableTemplating=true, or 2) per-document due to values being larger than 32kb.
Convert to utf8ToString() in getValueFetcherProvider since all callers can handle strings
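The first two fixes can be pictured as a single factory that picks a backing source once per segment and never hands callers a null. What follows is a minimal sketch under stated assumptions: `Segment`, `ValueSource`, and `ValueSources.open` are hypothetical names chosen for illustration and do not reflect the actual signatures in PatternTextFallbackDocValues.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical per-segment view: each map holds docId -> value for one storage kind.
final class Segment {
    final Map<Integer, String> templated;      // values written through templating
    final Map<Integer, String> binaryFallback; // flat values (whole column or oversized docs)

    Segment(Map<Integer, String> templated, Map<Integer, String> binaryFallback) {
        this.templated = templated;
        this.binaryFallback = binaryFallback;
    }
}

// Hypothetical read interface; an empty Optional means "no value for this doc".
interface ValueSource {
    Optional<String> valueFor(int docId);
}

final class ValueSources {
    private ValueSources() {}

    // Illustrative factory: always returns a usable source, even for an empty segment.
    static ValueSource open(Segment segment, boolean disableTemplating) {
        if (disableTemplating) {
            // Whole-column fallback: templated values were never written.
            return docId -> Optional.ofNullable(segment.binaryFallback.get(docId));
        }
        // Per-document fallback: prefer the templated value, else the flat one.
        return docId -> {
            String templated = segment.templated.get(docId);
            return Optional.ofNullable(templated != null ? templated : segment.binaryFallback.get(docId));
        };
    }
}
```

Because the choice lives in one place, callers in this sketch only ever handle "value present" or "value absent" and never a null source.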

elasticsearchmachine · 2026-02-20T15:30:51Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Add tests that verify valueFetcher and fieldData return correct values when disable_templating is true. These tests would have caught the NPE on main where PatternTextFallbackDocValues.from() returns null because template_id doc values are never written when templating is disabled. Also strengthen the existing testFieldDataWithMissingFieldSegment to verify the segment with data returns correct values. Co-authored-by: Cursor <cursoragent@cursor.com>

When disableTemplating is true (basic license), PatternTextFallbackDocValues.from() always returns null because template_id doc values are never written. This caused an NPE via valueFetcher and returned empty values via fieldData for all pattern_text fields on basic license. Add loadDocValues() to PatternTextFieldType that selects the correct BinaryDocValues source based on the templating and storage flags. Both valueFetcher and PatternTextIndexFieldData now share this single dispatch point, eliminating duplicated logic. Co-authored-by: Cursor <cursoragent@cursor.com>

The value fetcher provider was returning raw BytesRef objects from doc values. SourceIntervalsSource calls value.toString() which on BytesRef produces hex output instead of text, causing intervals queries to never match. Co-authored-by: Cursor <cursoragent@cursor.com>

Kubik42 · 2026-02-20T23:09:14Z

...in/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/patterntext/PatternTextFieldType.java

+        return PatternTextFallbackDocValues.from(context.reader(), this);
+    }
+
+    private static BinaryDocValues storedFieldAsBinaryDocValues(LeafReaderContext context, String fieldName) throws IOException {


This feels kind of wrong to do - converting stored fields into doc values. Are you doing this to return BinaryDocValues from loadDocValues()? Can we use PatternTextFallbackDocValues instead?

Wrapping in doc values is just to simplify the valueFetcher logic. So PatternTextFallbackDocValues wraps the proper pattern_text, the binary fallback, and the stored fallback in a single binary doc values. loadDocValues does the same thing, but it makes the decision between the three options at the whole column level rather than on a per-doc basis. So it will have fewer branches since it doesn't require checking the main pattern_text iterator before falling back on each doc.

I think we'll want to wrap the stored field in a doc value iterator, but we might be able to push this down into PatternTextFallbackDocValues in a cleaner way. I'll give it some more thought next week.
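The idea of presenting stored fields through a doc-values-style interface can be sketched as a small adapter. This is an illustration only: `DocCursor` and `StoredFieldCursor` are hypothetical names, the lookup function stands in for a stored-field read, and none of this mirrors the real BinaryDocValues API beyond the "position, then read" shape.

```java
import java.util.function.IntFunction;

// Hypothetical, simplified doc-values-style cursor: position on a doc, then read its value.
interface DocCursor {
    boolean advanceExact(int docId);

    String value();
}

// Adapter that serves values from a random-access lookup (standing in for stored fields).
final class StoredFieldCursor implements DocCursor {
    private final IntFunction<String> storedFieldLookup; // returns null when the doc has no value
    private String current;

    StoredFieldCursor(IntFunction<String> storedFieldLookup) {
        this.storedFieldLookup = storedFieldLookup;
    }

    @Override
    public boolean advanceExact(int docId) {
        // Load the value eagerly so value() can return it without a second lookup.
        current = storedFieldLookup.apply(docId);
        return current != null;
    }

    @Override
    public String value() {
        if (current == null) {
            throw new IllegalStateException("not positioned on a doc with a value");
        }
        return current;
    }
}
```

With an adapter like this, code written against the cursor interface does not need to know which storage backs a given column.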

martijnvg

This can occur after a license downgrade from enterprise/trial to basic as existing indices retain disable_templating=false, and any subsequent indexing of documents without the pattern_text field will create segments without the pattern_text field which will then trigger the NPE on search.

I don't think I fully understand. IIRC the idea was that the license change would only affect new indices. So after a license downgrade everything should continue to work in the same way in current indices using pattern_text, right? Meaning that new documents being indexed would use pattern text doc values. So does that not happen then?

I do see we don't fully test the downgrade scenario in PatternTextLicenseDowngradeIT. After license downgrade, we immediately rollover. Maybe we should add a test that, after downgrade, keeps indexing into and searching the current backing index?

martijnvg · 2026-02-23T08:50:21Z

...in/logsdb/src/main/java/org/elasticsearch/xpack/logsdb/patterntext/PatternTextFieldType.java

+        return PatternTextFallbackDocValues.from(context.reader(), this);
+    }
+
+    private static BinaryDocValues storedFieldAsBinaryDocValues(LeafReaderContext context, String fieldName) throws IOException {


If possible I think we should make this decision at the column level. And I think this is possible.

Move loadDocValues and storedFieldAsBinaryDocValues from PatternTextFieldType into PatternTextFallbackDocValues.from(), which now handles dispatch for both templating-enabled and templating-disabled paths. Update BytesRefsFromBinaryBlockLoader to accept LeafReaderContext so blockLoader() can use the unified entry point directly.

parkertimmins · 2026-02-24T23:18:47Z

@Kubik42 and @martijnvg
I went ahead and updated the description with some more details. The main idea behind this refactor is to move all the logic which selects between different backing storage into PatternTextFallbackDocValues.from. Though this perhaps adds a bit of overhead from wrapped iterators, and forces stored fields into a doc values interface, I think it is a bit safer.

Before this change there were 5 locations that needed to choose between using PatternTextFallbackDocValues.from, BinaryDocValues directly, or stored fields. These were: PatternTextType.valueFetcher, PatternTextType.getValueFetcherProvider, PatternTextIndexFieldData.loadDirect, PatternTextType.blockLoader, and PatternTextFieldMapper.getSyntheticFieldLoader. With this change, the first four can just use PatternTextFallbackDocValues.from directly and only the synthetic field loader needs to handle it separately.

martijnvg

I left one minor comment. I also would like to see PatternTextLicenseDowngradeIT test the downgrade case better. In particular, indexing a few docs before and after downgrade, also executing a search, and then rolling over. I'm ok with doing that in a followup PR.

Otherwise LGTM 👍

martijnvg · 2026-02-25T08:34:05Z

...b/src/main/java/org/elasticsearch/xpack/logsdb/patterntext/PatternTextFallbackDocValues.java

        }
    }
+
+    private static BinaryDocValues storedFieldAsBinaryDocValues(LeafReaderContext context, String fieldName) throws IOException {


I don't like hiding stored fields behind the BinaryDocValues abstraction. I think in this case it is acceptable, because it is only for bwc and pulling this out here would increase code complexity. Maybe explain this in a comment here?

elasticsearchmachine · 2026-02-25T17:20:31Z

Hi @parkertimmins, I've created a changelog YAML for you.

elasticsearchmachine · 2026-02-25T20:58:14Z

💔 Backport failed

Status	Branch	Result
❌	9.3	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 142767

Targeted backport of elastic#142767 to 9.3. Fixes three bugs: 1. Empty segment NPE when PatternTextCompositeValues.from() returns null in valueFetcher and PatternTextIndexFieldData. 2. Disabled templating NPE where valueFetcher and PatternTextIndexFieldData called CompositeValues.from() which always returns null without template_id values. 3. BytesRef hex output in getValueFetcherProvider where SourceIntervalsSource received raw BytesRef instead of String, causing intervals queries to never match.

parkertimmins added 2 commits February 20, 2026 09:23

Add tests which NPE segment has no pattern_text values

7e43341

Return empty doc values if segment has not pattern_text values

23fddb3

parkertimmins requested a review from martijnvg February 20, 2026 15:30

parkertimmins added the >bug, auto-backport, :StorageEngine/Codec, v9.4.0, and v9.3.2 labels Feb 20, 2026

parkertimmins requested a review from jordan-powers February 20, 2026 15:30

elasticsearchmachine added the Team:StorageEngine label Feb 20, 2026

parkertimmins and others added 5 commits February 20, 2026 13:31

Use numbers in test values so will produce pattern_text arg tokens

e1cbc1b

add removed comment

3a0d24b

parkertimmins requested a review from Kubik42 February 20, 2026 22:07

Merge branch 'main' into parker/pattern-text-empty-segment-npe

4e34637

Kubik42 reviewed Feb 20, 2026

martijnvg reviewed Feb 23, 2026

parkertimmins self-assigned this Feb 24, 2026

parkertimmins and others added 8 commits February 24, 2026 13:17

Add intervals query test for disabled templating

8a70bdf

Verifies that intervals queries match correctly on pattern_text fields with disable_templating=true.

move from method to PatternTextDocValues

9aace57

add back some comments

90c5199

[CI] Auto commit changes from spotless

d31fb5a

Fix Source-confirmed queries bug in separate PR

9019b19

[CI] Auto commit changes from spotless

90af739

Merge branch 'main' into parker/pattern-text-empty-segment-npe

32471d9

parkertimmins requested review from Kubik42 and martijnvg February 24, 2026 23:18

parkertimmins mentioned this pull request Feb 24, 2026

Bug in pattern text where Source-confirmed queries returning raw BytesRef #143006

Closed

parkertimmins added 2 commits February 24, 2026 19:33

Revert "Fix Source-confirmed queries bug in separate PR"

4198a0d

This reverts commit 9019b19.

missing import

ffe707a

martijnvg approved these changes Feb 25, 2026

Kubik42 approved these changes Feb 25, 2026

Add comment

2c25ef9

parkertimmins added 2 commits February 25, 2026 11:20

Update docs/changelog/142767.yaml

63a9e9b

Merge branch 'main' into parker/pattern-text-empty-segment-npe

68f599b

parkertimmins enabled auto-merge (squash) February 25, 2026 17:34

parkertimmins disabled auto-merge February 25, 2026 17:57

parkertimmins enabled auto-merge (squash) February 25, 2026 17:58

Fix flaky multi-segment PatternText tests

f3de3c6

Use NoMergePolicy with a direct IndexWriter instead of withLuceneIndex (which uses RandomIndexWriter that can randomly merge segments), ensuring tests that require separate segments per document are deterministic.

parkertimmins mentioned this pull request Feb 25, 2026

Improve pattern text downgrade license test #143102

Merged

parkertimmins merged commit 64c1723 into elastic:main Feb 25, 2026
35 checks passed

elasticsearchmachine added the backport pending label Feb 25, 2026

parkertimmins deleted the parker/pattern-text-empty-segment-npe branch February 25, 2026 21:14

parkertimmins mentioned this pull request Feb 25, 2026

[9.3] Fix NPE when querying pattern_text field #143113

Merged

parkertimmins merged 22 commits into elastic:main from parkertimmins:parker/pattern-text-empty-segment-npe

4 participants
