
Remove query.low-memory-killer.delay#22936

Merged
sopel39 merged 1 commit into trinodb:master from sopel39:ks/remove_config
Aug 13, 2024

Conversation

@sopel39
Member

@sopel39 sopel39 commented Aug 5, 2024

query.low-memory-killer.delay is not needed:

  • It does not prevent cascades of query kills. That is already achieved by the isLastKillTargetGone check in ClusterMemoryManager.

  • An OOM-blocked worker cannot be unblocked by any means other than killing a query. Revocable memory (and spill to disk) also won't cause a node to be considered out-of-memory, so the low-memory killer does not interfere with spill-to-disk.

Having query.low-memory-killer.delay reduces concurrency on busy clusters in low-memory situations.
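For illustration, here is a minimal, self-contained sketch (not Trino's actual classes; all names and the victim-selection policy are made up for this example) of why the isLastKillTargetGone guard alone is enough to prevent kill cascades: a new kill is only issued once the previously targeted query has actually gone away, regardless of how often the OOM check fires.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class KillCascadeSketch {
    private String lastKillTarget;                       // query id of the last kill target, if any
    private final Deque<String> running = new ArrayDeque<>();
    int killsIssued;

    boolean isLastKillTargetGone() {
        return lastKillTarget == null || !running.contains(lastKillTarget);
    }

    // One iteration of the memory-manager loop while the cluster is OOM.
    void onOutOfMemory() {
        if (isLastKillTargetGone() && !running.isEmpty()) {
            lastKillTarget = running.peekFirst();        // pick a victim (policy elided)
            killsIssued++;
        }
        // otherwise: wait for the previous victim to finish dying; no extra delay needed
    }

    public static void main(String[] args) {
        KillCascadeSketch mgr = new KillCascadeSketch();
        mgr.running.add("q1");
        mgr.running.add("q2");
        // Several OOM ticks before the victim is actually gone: only one kill is issued.
        mgr.onOutOfMemory();
        mgr.onOutOfMemory();
        mgr.onOutOfMemory();
        System.out.println("kills after 3 ticks = " + mgr.killsIssued);      // 1, no cascade
        mgr.running.remove("q1");                        // victim finally terminates
        mgr.onOutOfMemory();                             // now the next kill may proceed
        System.out.println("kills after victim gone = " + mgr.killsIssued);  // 2
    }
}
```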

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# General
* Remove the `query.low-memory-killer.delay` configuration property, which should
  improve query concurrency in low-memory situations. ({issue}`issuenumber`)

@electrum
Member

electrum commented Aug 5, 2024

The delay is there so that memory can be freed on worker nodes. Is there another way this is achieved?

@sopel39
Member Author

sopel39 commented Aug 5, 2024

> The delay is there so that memory can be freed on worker nodes. Is there another way this is achieved?

Memory won't release on its own on the workers.

After the changes, the code is:

        if (!lowMemoryKillers.isEmpty() && outOfMemory && !queryKilled) {
            if (isLastKillTargetGone()) {
                callOomKiller(runningQueries);
            }
            else {
                log.debug("Last killed target is still not gone: %s", lastKillTarget);
            }
        }

so we still wait until the previous target query goes away. The previous code was:

        if (!lowMemoryKillers.isEmpty() &&
                outOfMemory &&
                !queryKilled &&
                nanosSince(lastTimeNotOutOfMemory).compareTo(killOnOutOfMemoryDelay) > 0) {
            if (isLastKillTargetGone()) {
                callOomKiller(runningQueries);
            }
            else {
                log.debug("Last killed target is still not gone: %s", lastKillTarget);
            }
        }

so it would loop over queries (when isLastKillTargetGone is true and the node is still OOM) without any extra delay. Hence, the delay was redundant anyway.
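For reference, the delay gate that was removed can be sketched with java.time (the real code uses Airlift's Duration and nanosSince; this standalone equivalent is only illustrative):

```java
import java.time.Duration;

public class DelayGateSketch {
    public static void main(String[] args) {
        // Hypothetical configured delay, standing in for query.low-memory-killer.delay.
        Duration killOnOutOfMemoryDelay = Duration.ofMinutes(5);

        // Pretend the cluster last looked healthy 6 minutes ago.
        long lastTimeNotOutOfMemoryNanos = System.nanoTime() - Duration.ofMinutes(6).toNanos();
        Duration sinceNotOom = Duration.ofNanos(System.nanoTime() - lastTimeNotOutOfMemoryNanos);

        // The removed condition: only kill once the cluster has been OOM longer than the delay.
        boolean delayElapsed = sinceNotOom.compareTo(killOnOutOfMemoryDelay) > 0;
        System.out.println("delayElapsed = " + delayElapsed); // true: 6 min > 5 min
    }
}
```

The point of the PR is that this time-based gate adds nothing on top of the isLastKillTargetGone gate, which already serializes kills.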

Contributor

@wendigo wendigo left a comment


Soft approval from me (code looks ok, logic seems sound but I'm not an expert in that area). @electrum ptal

Member

@losipiuk losipiuk left a comment


Seems fine to me. It looks like the delay may only be there so that node-local memory limits trigger before cluster-level memory limits do. But it does not seem important.

@sopel39
Member Author

sopel39 commented Aug 12, 2024

> Seems fine to me. It looks like the delay may only be there so that node-local memory limits trigger before cluster-level memory limits do. But it does not seem important.

Per-node max query memory limits are enforced in io.trino.memory.QueryContext#updateUserMemory even before the MemoryPool gets updated on the worker.
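A minimal sketch of that ordering (not Trino's real QueryContext; the limit value and field names are made up for this example): the per-query check happens on the reservation path, before the shared pool's accounting changes, so the local limit trips ahead of anything the cluster-level view could react to.

```java
public class PerNodeLimitSketch {
    static final long MAX_USER_MEMORY_PER_NODE = 1024; // hypothetical per-node limit, in bytes

    long queryUserMemory;   // this query's reserved user memory on the node
    long poolReserved;      // the node's shared memory pool accounting

    void updateUserMemory(long delta) {
        if (queryUserMemory + delta > MAX_USER_MEMORY_PER_NODE) {
            // The local limit trips first; the pool is never updated for this reservation.
            throw new IllegalStateException("Query exceeded per-node user memory limit");
        }
        queryUserMemory += delta;
        poolReserved += delta;  // only now does the pool (and thus the cluster view) change
    }

    public static void main(String[] args) {
        PerNodeLimitSketch ctx = new PerNodeLimitSketch();
        ctx.updateUserMemory(512);      // fits under the limit
        try {
            ctx.updateUserMemory(1024); // 512 + 1024 exceeds the 1024-byte limit
        }
        catch (IllegalStateException e) {
            System.out.println("local limit tripped; poolReserved = " + ctx.poolReserved);
        }
    }
}
```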

@sopel39 sopel39 merged commit 8ec18d2 into trinodb:master Aug 13, 2024
@sopel39 sopel39 deleted the ks/remove_config branch August 13, 2024 09:26
@github-actions github-actions bot added this to the 454 milestone Aug 13, 2024
