Skip to content

Conversation

@joegallo
Copy link
Contributor

@joegallo joegallo commented Dec 11, 2025

Follow up to #139075

More recent versions of jruby/joni have the ability to specify a timeout for a Matcher's search (see jruby/joni#78). This is quite a bit easier for us to use than the current scheme of thread registration and monitoring in MatcherWatchdog.Default, and it's faster, too (since there's no synchronization overhead of registering/unregistering).

7c81f85 is just my humble opinion, it's surprising to me that it wasn't added back in #31024, but it wasn't. 🤷

4eabe3e is bug which was introduced by me in #95426. 💀


Personally, I am of the opinion that the MatcherWatchdog interface is entirely obsolete and should be removed. However, there's a fairly intricate third implementation of the interface over in x-pack/plugin/text-structure that I'm not qualified to touch, so I resisted the urge to refactor all the way up through removing the interface completely. As it stands, though, the two first-class default implementations really aren't using the methods of the interface in very meaningful ways anymore, but they still do follow the right shape.

joegallo@simulacron:~/Code/elastic/elasticsearch $ git grep 'implements MatcherWatchdog'
libs/grok/src/main/java/org/elasticsearch/grok/MatcherWatchdog.java:    final class Noop implements MatcherWatchdog {
libs/grok/src/main/java/org/elasticsearch/grok/MatcherWatchdog.java:    final class Default implements MatcherWatchdog {
x-pack/plugin/text-structure/src/main/java/org/elasticsearch/xpack/textstructure/structurefinder/TimeoutChecker.java:    static class TimeoutCheckerWatchdog implements MatcherWatchdog {

@joegallo joegallo requested a review from dakrone December 11, 2025 20:52
@joegallo joegallo added >enhancement :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team v9.3.0 labels Dec 11, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @joegallo, I've created a changelog YAML for you.

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for simplifying this, I left some comments but nothing major.

}

if (result == Matcher.INTERRUPTED) {
throw new RuntimeException(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use an ElasticsearchException here instead, since it's the ES version of RuntimeException?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grok doesn't have visibility to org.elasticsearch.server, so it can't see that class. We could change that, but I don't want to do that without chatting up a wider audience. I DRYed it in e821c7e to make it just one place and not three, though, so at least the badness is a little more contained.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's too bad, makes sense why we don't use it then.

matcherWatchdog.unregister(matcher);
}
if (result == Matcher.INTERRUPTED) {
throw new RuntimeException(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here for exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DRYed in e821c7e.

BiConsumer<Long, Runnable> scheduler
) {
return new Default(interval, maxExecutionTime, relativeTimeSupplier, scheduler);
static MatcherWatchdog newInstance(long maxExecutionTime) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're touching this anyway, can we changed maxExecutionTime in include the units (millis) in its name? Perhaps maxExecutionMillis?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍, I went with maxExecutionTimeMillis in 19cc207.

Comment on lines 92 to 95
if (maxExecutionTime > 0) {
matcher.setTimeout(maxExecutionTime * NSEC_PER_MSEC);
} else {
matcher.setTimeout(-1); // disable timeouts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pass in 0 as the time, what is expected by the caller? Is it expected that we would always time out, or that we'd never time out?

Copy link
Contributor Author

@joegallo joegallo Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm, I don't really have a strong opinion about it. I'll poke around for some prior art so that I can claim we're doing something similar to something else common in Elasticsearch.

Do you prefer that 0 means no timeout? I'm fine with just making it do that if that's what you prefer. (edit: wait, that is the current logic)

Comment on lines -178 to -179
long intervalMillis = IngestSettings.GROK_WATCHDOG_INTERVAL.get(settings).getMillis();
long maxExecutionTimeMillis = IngestSettings.GROK_WATCHDOG_INTERVAL.get(settings).getMillis();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of this PR, we should mark this setting as deprecated because it no longer does anything, right? Can you update the PR to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's fair. I'll handle that in the morning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I annotated the setting itself as being deprecated in the code, and added a callout in the grok processor documentation as well (I think). That's all in 88c8f93.

private Default(long interval, long maxExecutionTime, LongSupplier relativeTimeSupplier, BiConsumer<Long, Runnable> scheduler) {
this.interval = interval;

private Default(long maxExecutionTime) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here about renaming to include units.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also covered in 19cc207.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 12, 2025

🔍 Preview links for changed docs

@github-actions
Copy link
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants