fix: use persistent container for execution environment commands #2721

Open
lalten wants to merge 6 commits into ansible:main from lalten:persistent-ee-container

Conversation

@lalten lalten commented Apr 7, 2026

Description

The execution environment (EE) implementation previously spawned a new ephemeral docker run --rm container for every single background diagnostic and linting command. This caused significant overhead as each invocation paid the full cost of container creation (namespace, overlayFS, veth pair setup) even though all commands ran against the same image and volume mounts.

This PR replaces ephemeral containers with a single persistent background container managed via docker exec / podman exec:

  • A long-lived container (sleep infinity) is started once during initialize() and reused for all subsequent commands via docker exec.
  • Container name is derived from a SHA-256 hash of the workspace URI (als_persistent_<hash12>), preventing collisions and enabling reliable cleanup.
  • ensurePersistentContainerHealthy() performs debounced health checks (5s interval) and auto-restarts dead containers transparently.
  • dispose() cleans up the persistent container on workspace removal, config change, or server shutdown. Wired into WorkspaceManager and AnsibleLanguageService shutdown handler.
  • getCachedCommand / setCachedCommand / clearCommandCache API for callers to avoid redundant docker exec invocations (public, no callers yet).
  • Bug fix: copyPluginDocFiles had a latent race condition — docker cp calls via asyncExec were fire-and-forget inside forEach. Converted to for...of with await.
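
The naming scheme in the list above can be sketched as follows. This is a minimal illustration, not the PR's code: the helper name `persistentContainerName`, the hex encoding, and taking the first 12 characters of the digest are assumptions; only the `als_persistent_` prefix and the SHA-256 hash of the workspace URI come from the description.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive a stable container name from a workspace URI.
// Assumption: the 12-character suffix is the start of the hex-encoded digest.
function persistentContainerName(workspaceUri: string): string {
  const digest = createHash("sha256").update(workspaceUri).digest("hex");
  return `als_persistent_${digest.slice(0, 12)}`;
}

// The same URI always maps to the same name; different URIs map to different names.
const nameA = persistentContainerName("file:///home/user/project-a");
const nameB = persistentContainerName("file:///home/user/project-b");
console.log(nameA, nameB);
```

Because the name depends only on the workspace URI, a restarted language server can find and remove a leftover container from an earlier session without storing any state.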

Replaces ephemeral per-command containers with a single persistent background container reused via docker/podman exec to reduce startup overhead, improve reliability via debounced health checks and auto-restart, add command-result caching, and tie container lifecycle to workspace with deterministic naming and safe cleanup.
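
A debounced health check of the kind summarized above might look roughly like this. It is a sketch under stated assumptions: `HealthCheckDebouncer`, `ensureHealthy`, and the injected `isRunning` / `restart` callbacks are hypothetical names, and only the 5-second interval and the restart-on-failure behavior come from the description.

```typescript
// Hypothetical debouncer: decides whether a health probe is due.
class HealthCheckDebouncer {
  private lastCheckedAt = Number.NEGATIVE_INFINITY;
  private readonly intervalMs: number;

  constructor(intervalMs: number) {
    this.intervalMs = intervalMs;
  }

  // Returns true (and records the time) only when the interval has elapsed.
  shouldCheck(nowMs: number): boolean {
    if (nowMs - this.lastCheckedAt < this.intervalMs) {
      return false;
    }
    this.lastCheckedAt = nowMs;
    return true;
  }
}

// The probe and the restart action are injected, so the sketch does not
// depend on a real container engine.
async function ensureHealthy(
  debouncer: HealthCheckDebouncer,
  isRunning: () => Promise<boolean>,
  restart: () => Promise<void>,
  nowMs: number = Date.now(),
): Promise<void> {
  if (!debouncer.shouldCheck(nowMs)) {
    return; // probed recently; skip the extra engine call
  }
  if (!(await isRunning())) {
    await restart();
  }
}
```

The trade-off is that a container that dies within the debounce window is only noticed on the next probe, so the first command after the failure can still error.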

  • Replace per-command docker/podman run with persistent container + docker/podman exec
  • Deterministic container name: als_persistent_<hash12>, derived from a SHA-256 hash of the workspace URI
  • Debounced health checks (5s) with automatic restart
  • Public command cache API: getCachedCommand / setCachedCommand / clearCommandCache
  • dispose() removes persistent container on workspace removal, config change, or shutdown
  • Wire shutdown cleanup into AnsibleLanguageService and WorkspaceManager
  • Fix race in copyPluginDocFiles: sequential await for...of for docker cp
  • Remove unused uuid dependency
  • Add tests for initialize(), execInContainer(), dispose(), and command cache
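
The `copyPluginDocFiles` race fix listed above comes down to the difference between the two loops below. This is an illustrative sketch: `CopyOne` is a hypothetical stand-in for the async `docker cp` call, and the function names are not from the PR.

```typescript
// Hypothetical stand-in for an async `docker cp` invocation.
type CopyOne = (srcPath: string) => Promise<void>;

// Fire-and-forget: forEach ignores the promises returned by the callback,
// so this function returns before any copy has finished.
function copyAllFireAndForget(paths: string[], copyOne: CopyOne): void {
  paths.forEach((p) => {
    void copyOne(p);
  });
}

// Sequential: each copy completes before the next one starts, and the
// returned promise settles only after the last copy is done.
async function copyAllSequential(
  paths: string[],
  copyOne: CopyOne,
): Promise<void> {
  for (const p of paths) {
    await copyOne(p);
  }
}
```

With the sequential form, a caller that awaits the function can rely on every file being present on the host once it resolves.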

related: #2720

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the Ansible Language Server’s execution environment (EE) command execution model to reuse a single long-lived container per workspace (via docker exec / podman exec) instead of spawning an ephemeral container per command, and wires container cleanup into workspace removal and language server shutdown.
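
As a rough illustration of the change in execution model, the two engine invocations could be assembled like this. The helper names and the one-to-one workspace mount are assumptions; the `run -d --name ... -v ...` and `exec` forms are standard for both docker and podman, and `sleep infinity` is the long-lived command named in the PR description.

```typescript
// Arguments to start the long-lived container once.
// Assumption: the workspace is mounted at the same path inside the container.
function startContainerArgs(
  containerName: string,
  image: string,
  workspacePath: string,
): string[] {
  return [
    "run", "-d",
    "--name", containerName,
    "-v", `${workspacePath}:${workspacePath}`,
    image,
    "sleep", "infinity",
  ];
}

// Arguments to run one command inside the already-running container.
function execInContainerArgs(containerName: string, command: string[]): string[] {
  return ["exec", containerName, ...command];
}

// Example: the engine binary ("docker" or "podman") would be prepended by the caller.
console.log(execInContainerArgs("als_persistent_0123456789ab", ["ansible-lint", "--version"]).join(" "));
```

Only the first call pays the container creation cost; every later command is a cheaper `exec` against the same container.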

Changes:

  • Add persistent container lifecycle management to ExecutionEnvironment (deterministic naming, health checks, disposal) and introduce a simple command-result cache API.
  • Switch EE command execution in CommandRunner from wrapContainerArgs(...) to execInContainer(...).
  • Add workspace-removal and shutdown cleanup hooks, plus new unit tests for the new EE behavior.
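
The command-result cache mentioned in the first bullet could be as small as a map keyed by the command string. The sketch below assumes that key shape and a string result; the three method names are taken from the PR description, while the `CommandCache` class itself is hypothetical.

```typescript
// Hypothetical container for the cache; in the PR the methods live on
// ExecutionEnvironment itself.
class CommandCache {
  private readonly entries = new Map<string, string>();

  // Returns the cached output, or undefined on a cache miss.
  getCachedCommand(command: string): string | undefined {
    return this.entries.get(command);
  }

  setCachedCommand(command: string, output: string): void {
    this.entries.set(command, output);
  }

  // Intended for container restarts or configuration changes,
  // when earlier results may no longer be valid.
  clearCommandCache(): void {
    this.entries.clear();
  }
}

const cache = new CommandCache();
cache.setCachedCommand("ansible --version", "ansible [core 2.x]");
console.log(cache.getCachedCommand("ansible --version"));
```

A cache like this is only safe for commands whose output does not change while the container lives, so callers would need to clear it whenever the container is replaced.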

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file

| File | Description |
| --- | --- |
| packages/ansible-language-server/src/services/executionEnvironment.ts | Implements persistent container execution, health checks, dispose, and command cache; updates plugin-doc copying loop. |
| packages/ansible-language-server/src/utils/commandRunner.ts | Updates EE execution path to use execInContainer and removes the mountPaths parameter. |
| packages/ansible-language-server/src/services/workspaceManager.ts | Disposes EE containers when workspace folders are removed; adds async EE disposal method. |
| packages/ansible-language-server/src/ansibleLanguageService.ts | Disposes EE containers on language server shutdown. |
| packages/ansible-language-server/test/services/executionEnvironment.test.ts | Adds tests for persistent-container initialization, exec command generation, dispose, and cache. |

Comments suppressed due to low confidence (1)

packages/ansible-language-server/src/services/executionEnvironment.ts:786

  • copyPluginDocFiles() creates directories with fs.mkdirSync(destPath, { recursive: true }), but destPathFolder is computed and used as the target folder for docker cp. Creating destPath (rather than destPathFolder) is inconsistent and can create the wrong directory structure (or fail if destPath is intended as a file path). Use fs.mkdirSync(destPathFolder, { recursive: true }) to ensure the copy destination exists.
        const destPathFolder = destPath
          .split(path.sep)
          .slice(0, -1)
          .join(path.sep);
        fs.mkdirSync(destPath, { recursive: true });
        const copyCommand = `${this._container_engine} cp ${containerName}:${srcPath} ${destPathFolder}`;
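
One way to act on this suggestion is to create only the parent folder of the destination. The sketch below assumes `destPath` names a file rather than a directory, which the source does not confirm; `ensureParentDir` and the example paths are hypothetical.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Create the folder that will receive the copied file, not the file path itself.
function ensureParentDir(destPath: string): string {
  const destPathFolder = path.dirname(destPath);
  fs.mkdirSync(destPathFolder, { recursive: true });
  return destPathFolder;
}

// Example with a throwaway directory so nothing outside the temp folder is touched.
const tmpRoot = fs.mkdtempSync(path.join(os.tmpdir(), "plugin-docs-"));
const exampleDest = path.join(tmpRoot, "collections", "ns", "plugin.py");
const createdFolder = ensureParentDir(exampleDest);
console.log(createdFolder);
```

`path.dirname` replaces the manual split, slice, and join on `path.sep` and yields the same parent for ordinary paths.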

Comment thread packages/ansible-language-server/src/utils/commandRunner.ts
Comment thread packages/ansible-language-server/src/services/executionEnvironment.ts Outdated
Comment thread packages/ansible-language-server/src/services/workspaceManager.ts Outdated
@lalten lalten changed the title from "use persistent container for execution environment commands" to "fix: use persistent container for execution environment commands" Apr 7, 2026
@github-actions github-actions Bot added the fix label Apr 7, 2026
codecov Bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 68.42105% with 6 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ible-language-server/src/ansibleLanguageService.ts | 0.00% | 2 Missing ⚠️ |
| ...ansible-language-server/src/utils/commandRunner.ts | 66.66% | 1 Missing and 1 partial ⚠️ |
| ...nguage-server/src/services/executionEnvironment.ts | 88.88% | 1 Missing ⚠️ |
| ...e-language-server/src/services/workspaceManager.ts | 50.00% | 1 Missing ⚠️ |

Author

lalten commented Apr 7, 2026

The Linux test failure seems to be #2720

@lalten lalten marked this pull request as ready for review April 7, 2026 14:22
@lalten lalten requested a review from a team as a code owner April 7, 2026 14:22
@lalten lalten requested a review from a team as a code owner April 7, 2026 14:22
@github-actions github-actions Bot added fix and removed fix labels Apr 7, 2026
coderabbitai Bot commented Apr 7, 2026

Walkthrough

Refactors execution from ephemeral per-command containers to a deterministic persistent container per workspace, adds lifecycle and health management, command caching, a shutdown handler to dispose execution environments, and removes the uuid dependency.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Package manifest<br>packages/ansible-language-server/package.json | Removed dependency: uuid was deleted from dependencies. |
| Execution environment implementation<br>packages/ansible-language-server/src/services/executionEnvironment.ts | Replaced ephemeral run-with-`--rm` model with a single persistent container per workspace (deterministic SHA-256 name), added start/reuse logic, health checks and restarts, dispose() cleanup, command-result cache with getters/setters/clear, execInContainer() public entry, safer host cache path handling, and tightened container name matching. |
| Workspace & lifecycle changes<br>packages/ansible-language-server/src/services/workspaceManager.ts, packages/ansible-language-server/src/ansibleLanguageService.ts | Added WorkspaceFolderContext.disposeExecutionEnvironment() and updated workspace removal to asynchronously dispose persistent containers; registered an on-shutdown handler in language server initialization to dispose all contexts. |
| Command runner integration<br>packages/ansible-language-server/src/utils/commandRunner.ts | Switched EE branch to call executionEnvironment.execInContainer() instead of wrapContainerArgs(); warns when provided mount paths may be inaccessible inside persistent container. |
| Tests & helpers<br>packages/ansible-language-server/test/services/executionEnvironment.test.ts, packages/ansible-language-server/test/helper.ts | Updated tests to cover initialize(), execInContainer(), dispose(), and command cache behavior; disableExecutionEnvironmentSettings() now disposes the execution environment when provided a context before clearing caches. |

Sequence Diagram

sequenceDiagram
    participant LS as Language Server
    participant WM as Workspace Manager
    participant EE as Execution Environment
    participant Container as Docker/Podman Container

    Note over LS,Container: Initialization
    LS->>WM: create workspace context
    WM->>EE: initialize / startPersistentContainer
    EE->>Container: start or reuse deterministic container
    Container-->>EE: container running

    Note over LS,Container: Command Execution
    LS->>WM: request command execution
    WM->>EE: execInContainer(command)
    EE->>EE: check command cache
    alt cache hit
        EE-->>WM: return cached result
    else cache miss
        EE->>Container: docker/podman exec <container> <command>
        Container-->>EE: command output
        EE->>EE: store result in cache
        EE-->>WM: return result
    end

    Note over LS,Container: Shutdown
    LS->>LS: onShutdown event
    LS->>WM: disposeExecutionEnvironment (all contexts)
    WM->>EE: dispose()
    EE->>Container: stop/remove container
    Container-->>EE: stopped/removed

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped from ephemeral to steady ground,
A single crate where commands abound,
Names are hashed and health checks sing,
Cached replies make engines spring,
I tuck each workspace in at shutdown sound. 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: replacing per-command ephemeral containers with a single persistent container for execution environment commands. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
packages/ansible-language-server/src/services/workspaceManager.ts (1)

106-114: Consider awaiting disposal before deleting context.

void context.disposeExecutionEnvironment() is fire-and-forget, meaning the context is deleted from folderContexts before container cleanup completes. If disposal encounters errors that need the context (e.g., for logging), this could cause issues.

♻️ Suggested fix
-    for (const removedUri of removedUris) {
+    for (const removedUri of removedUris) {
       const context = this.folderContexts.get(removedUri);
       /* v8 ignore next 3 */
       if (context) {
-        void context.disposeExecutionEnvironment();
+        await context.disposeExecutionEnvironment();
       }
       this.folderContexts.delete(removedUri);
     }

Note: This would require making handleWorkspaceChanged async.
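
A self-contained version of the awaited variant might look like this. It is only a sketch: `DisposableContext` and `removeContexts` are hypothetical names, and catching disposal errors so the remaining folders are still processed is an added assumption, not part of the suggestion.

```typescript
// Minimal shape of a workspace context for this illustration.
interface DisposableContext {
  disposeExecutionEnvironment(): Promise<void>;
}

// Dispose each removed workspace's container before dropping its context.
async function removeContexts(
  folderContexts: Map<string, DisposableContext>,
  removedUris: string[],
): Promise<void> {
  for (const removedUri of removedUris) {
    const context = folderContexts.get(removedUri);
    if (context) {
      try {
        await context.disposeExecutionEnvironment();
      } catch (err) {
        // Assumption: a failed cleanup should not block removal of other folders.
        console.error(`Failed to dispose ${removedUri}:`, err);
      }
    }
    folderContexts.delete(removedUri);
  }
}
```

Awaiting inside the loop disposes containers one at a time; collecting the promises and awaiting `Promise.all` would run the cleanups concurrently instead.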

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ansible-language-server/src/services/workspaceManager.ts` around
lines 106 - 114, The loop that disposes persistent containers currently calls
void context.disposeExecutionEnvironment() which fire-and-forgets and then
deletes the context from folderContexts; change handleWorkspaceChanged to be
async and await context.disposeExecutionEnvironment() for each removedUri (or
collect Promises and await Promise.all) before calling
this.folderContexts.delete(removedUri) so disposal completes (and any errors or
logging that need the context can use it); update any callers of
handleWorkspaceChanged to await it if necessary.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/ansible-language-server/src/services/workspaceManager.ts`:
- Around line 227-236: The test helpers enableExecutionEnvironmentSettings() and
disableExecutionEnvironmentSettings() must explicitly dispose the execution
environment since clearCachedServices() no longer does that; update
disableExecutionEnvironmentSettings() to call disposeExecutionEnvironment() (and
await it if it returns a Promise) before/after calling clearCachedServices() so
any running EE containers are properly cleaned up during teardown, ensuring you
reference the existing clearCachedServices() and disposeExecutionEnvironment()
methods when making the change.

In `@packages/ansible-language-server/src/utils/commandRunner.ts`:
- Around line 27-28: runCommand currently ignores the _mountPaths parameter so
callers like ansibleLint.ts that pass config/document paths end up with those
paths inaccessible; update runCommand to honor the _mountPaths argument by
either (A) passing its entries into startPersistentContainer so the container is
started with additional volume mounts, or (B) validating each path against the
set of already-mounted workspace paths and emitting a clear warning/error when a
path is not covered; specifically modify runCommand (and
startPersistentContainer if needed) to accept and handle mount paths and ensure
callers (e.g., ansibleLint.ts) get correct mount behavior or a visible warning.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 43aec0fe-d6a9-44a2-bf43-bb304ef5e70f

📥 Commits

Reviewing files that changed from the base of the PR and between fdebd9b and 0761ce4.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (6)
  • packages/ansible-language-server/package.json
  • packages/ansible-language-server/src/ansibleLanguageService.ts
  • packages/ansible-language-server/src/services/executionEnvironment.ts
  • packages/ansible-language-server/src/services/workspaceManager.ts
  • packages/ansible-language-server/src/utils/commandRunner.ts
  • packages/ansible-language-server/test/services/executionEnvironment.test.ts
💤 Files with no reviewable changes (1)
  • packages/ansible-language-server/package.json

Comment thread packages/ansible-language-server/src/services/workspaceManager.ts
Comment thread packages/ansible-language-server/src/utils/commandRunner.ts Outdated
@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
packages/ansible-language-server/src/utils/commandRunner.ts (1)

67-77: ⚠️ Potential issue | 🟠 Major

Don't downgrade missing EE mounts to a warning.

mountPaths is still not honored in EE mode: callers like packages/ansible-language-server/src/services/ansibleLint.ts:50-85 and packages/ansible-language-server/src/services/ansiblePlaybook.ts:59-81 can pass config/document directories outside the workspace, but packages/ansible-language-server/src/services/executionEnvironment.ts:265-301 only mounts the workspace before execInContainer() runs. Logging and continuing just turns that into a later command failure. The current startsWith check also misses sibling paths like /repo2 for a workspace /repo.

Minimal fail-fast alternative if mounting those paths is deferred
-      if (mountPaths && this.connection) {
+      if (mountPaths) {
         const workspacePath = URI.parse(this.context.workspaceFolder.uri).path;
-        for (const mp of mountPaths) {
-          if (!mp.startsWith(workspacePath)) {
-            this.connection.console.warn(
-              `[EE] Mount path '${mp}' is outside the workspace folder and may not be accessible inside the persistent container. ` +
-                `Configure additional volume mounts in the Execution Environment settings.`,
-            );
-          }
+        const workspacePrefix = workspacePath.endsWith("/")
+          ? workspacePath
+          : `${workspacePath}/`;
+        const uncoveredPaths = [...mountPaths].filter(
+          (mp) => mp !== workspacePath && !mp.startsWith(workspacePrefix),
+        );
+        if (uncoveredPaths.length > 0) {
+          const message =
+            `[EE] Persistent container cannot access: ${uncoveredPaths.join(", ")}. ` +
+            `Configure additional volume mounts in the Execution Environment settings.`;
+          this.connection?.console.error(message);
+          throw new Error(message);
         }
       }

Also applies to: 80-81
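
The sibling-path problem with `startsWith` can also be avoided with a `path.relative` check, sketched below. `isInsideWorkspace` is a hypothetical helper, and `path.posix` is used on the assumption that paths inside the container are POSIX-style.

```typescript
import * as path from "node:path";

// True when `candidate` is the workspace folder itself or a descendant of it.
function isInsideWorkspace(workspacePath: string, candidate: string): boolean {
  const rel = path.posix.relative(workspacePath, candidate);
  if (rel === "") {
    return true; // same directory
  }
  // A leading ".." segment means the candidate is outside the workspace.
  // Comparing whole segments keeps names such as "..hidden" valid.
  return rel !== ".." && !rel.startsWith("../") && !path.posix.isAbsolute(rel);
}

console.log(isInsideWorkspace("/repo", "/repo/roles/web")); // true
console.log(isInsideWorkspace("/repo", "/repo2")); // false
```

Unlike a plain prefix comparison, this treats `/repo2` as outside `/repo`, which is the case the review comment calls out.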

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ansible-language-server/src/utils/commandRunner.ts` around lines 67
- 77, The mountPaths check silently warns when a requested mount is outside the
workspace (and incorrectly uses startsWith, which treats siblings as inside);
change it to fail fast by rejecting/throwing when any mp is not inside the
workspace and this.connection exists. Replace the startsWith logic with a proper
path containment check (use Node's path.relative or compare workspacePath +
path.sep) to ensure mp is truly a descendant (i.e. path.relative(workspacePath,
mp) does not start with '..' and is not equal to '' wrongly), and call the same
failure behavior where the current code logs a warning (replace
this.connection.console.warn with throwing an Error or returning a rejected
Promise so execInContainer callers cannot proceed). Apply the same fix for the
duplicate case at the other occurrence (lines around the second warn).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c1383065-831a-4729-892a-24dcb553dd56

📥 Commits

Reviewing files that changed from the base of the PR and between 0761ce4 and e2024e7.

📒 Files selected for processing (2)
  • packages/ansible-language-server/src/utils/commandRunner.ts
  • packages/ansible-language-server/test/helper.ts

Projects

Status: In Progress
