[9.1](backport #44932) Adding the option to disable the DNS processor failure or success cache #45078
Merged
andrewkroh merged 2 commits into 9.1 on Jun 27, 2025
This enables use cases that require resolving the current DNS record, regardless of the record's TTL or any previously cached values. It is useful, for example, when monitoring a DNS server or when recorded events must capture the environment's state at a specific moment. When a cache is used, the TTL determines the time frame in which an agent might observe a stale record instead of the current one. This unpredictability can be undesirable when optimizing for rapid time-to-intervention.

Disabling the cache has significant throughput implications. The processing time for a single event will be at least the DNS round-trip time. For example, if a DNS request takes 1 ms, the maximum serial throughput is limited to 1000 events/sec. Known use cases for this feature have low throughput requirements. Throughput can be increased by deploying multiple, parallel agents.

NOTE: Setting the failure cache TTL to a very low value (e.g., 1ns) achieves a similar, but imperfect, effect.

NOTE: While the config allows setting a TTL on the success cache, this option is currently ignored. A future enhancement could honor this setting (e.g., by using min(configured_ttl, record_ttl)), which would align with the behavior of other DNS clients.

(cherry picked from commit eee15e7)
vishaangelova
approved these changes
Jun 27, 2025
Proposed commit message
Adds the option to disable the success and failure cache.
Motivation
This enables use cases that require capturing the DNS record as it is at the current point in time, regardless of any cache or the record's TTL. Examples include monitoring the DNS server itself, or recorded events that need to capture the current state of the environment. The TTL defines the time frame over which an old value might be used instead of the current DNS record; in other words, the window in which the agent might observe either the old or the new record, depending on when the previous request was made. This unpredictability can be undesirable when optimizing time-to-intervention.
Disabling the cache has throughput implications: the serial processing time for an event will be at least the DNS round-trip time. For example, if the round-trip time of a DNS request is 1 ms, the maximum throughput is limited to 1000 events/sec. Known use cases have low throughput requirements. Parallelization, for example by deploying multiple agents, can be used to raise this limit. If that becomes necessary, we urge reevaluating the use case and whether the cache should be used.
NOTE: Setting the TTL on the failure cache to 1ns achieves a similar, but imperfect, effect.
NOTE: Setting a TTL on the success cache is a valid option according to the code; however, it is ignored, as is also documented in the code, and it is omitted as an option in the documentation. Honoring the setting together with the record's TTL (min(ttl, dns_record_ttl)) would be a different route, similar to the behaviour of other DNS clients.
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
Disruptive User Impact
None known. The default values leave the old behavior intact, and the setting that triggers the new behavior is added in this PR.
How to test this PR locally
Define the DNS processor, observe cache stats / resolver requests.
Related issues
This is an automatic backport of pull request #44932 done by [Mergify](https://mergify.com).