13 changes: 8 additions & 5 deletions IMPLEMENTATION_PLAN.md
@@ -340,11 +340,14 @@ Done when:
**Previously:** Task 1.7

Done when:
-- [ ] Web search tool with Brave Search API and SearXNG backends, configurable via `netclaw.json`.
-- [ ] Web fetch tool with HTML-to-text extraction and output truncation.
-- [ ] Shell execution tool with timeout, output truncation, stdin closure, working directory.
-- [ ] GitHub CLI tool via `gh` shell-out with structured output parsing and missing dependency handling.
-- [ ] Tests for each tool with mocked HTTP/process dependencies.
+- [x] Shell execution tool with timeout, output truncation, stdin closure, working directory.
+- [x] File read and file write tools with path validation and output truncation.
+- [x] Source-generated tool schemas via Roslyn incremental generator (ADR-001).
+- [ ] ~~Web search tool~~ — deferred, not needed for minimal viable concept.
+- [ ] ~~Web fetch tool~~ — deferred, not needed for minimal viable concept.
+
+> **Note:** GitHub CLI access is handled via `shell_execute` + `gh` — no dedicated tool needed.
+> Web search and web fetch deferred — shell + file tools are sufficient to prove the concept.

### Task 1.10: Full provider abstraction with MEAI and fallback

1 change: 1 addition & 0 deletions Netclaw.slnx
@@ -9,6 +9,7 @@
<Folder Name="/src/">
<Project Path="src/Netclaw.Actors/Netclaw.Actors.csproj" />
<Project Path="src/Netclaw.App/Netclaw.App.csproj" />
<Project Path="src/Netclaw.Channels/Netclaw.Channels.csproj" />
<Project Path="src/Netclaw.Actors.Tests/Netclaw.Actors.Tests.csproj" />
<Project Path="src/Netclaw.Tools.Abstractions/Netclaw.Tools.Abstractions.csproj" />
<Project Path="src/Netclaw.Tools.Generators/Netclaw.Tools.Generators.csproj" />
126 changes: 126 additions & 0 deletions docs/research/agent-gateway-architecture.md
@@ -0,0 +1,126 @@
# Agent Gateway Architecture Research

**Date:** 2026-02-22
**Context:** Architectural research before designing Netclaw's channel abstraction

## Projects Studied

| Project | Language | Stars | Architecture |
|---------|----------|-------|-------------|
| OpenClaw | TypeScript | 218k+ | Gateway + Agent in one Node.js process |
| IronClaw | Rust | 3k | Hub-and-spoke, Axum gateway + Agent loop as Tokio tasks |
| ZeroClaw | Rust | 17k | `daemon` command runs gateway, channels, agent, and scheduler as Tokio tasks |

## Consensus: Single Process, Logical Separation

All three are single-process architectures. None split gateway and agent into
separate processes for normal operation.

### Gateway vs Agent Boundary

All three have a clean logical boundary but no process boundary:

- **OpenClaw**: Gateway is a WebSocket control plane (`ws://127.0.0.1:18789`),
Agent is an embedded "peer" in the same process. External agents can connect
via ACP (stdin/stdout NDJSON protocol), but the default agent is in-process.
- **IronClaw**: Gateway is an Axum HTTP server implementing the `Channel` trait.
Agent loop processes messages from all channels via a merged async stream.
Docker containers are the only separate processes (for sandboxed tool
execution).
- **ZeroClaw**: Same pattern — gateway, channels, agent, scheduler all
supervised as Tokio tasks in one process. Optimized for $10 hardware
(Raspberry Pi Zero).

### Security Perimeter

All three put the security boundary at the gateway/channel layer, not between
gateway and agent:

- **OpenClaw**: DM pairing (default-deny, approve unknown senders), role-based
access (operator/node), tool policy (glob allow/deny), exec approval gates,
optional Docker sandbox.
- **IronClaw**: Device pairing (one-time code → bearer token), WASM sandbox for
tools, Docker sandbox for shell, safety layer (prompt injection defense, secret
leak scanning), tool domain separation (Orchestrator vs Container).
- **ZeroClaw**: Device pairing, three autonomy levels (ReadOnly/Supervised/Full),
command allowlists, forbidden path blocklists, rate limits, approval workflows,
env sanitization on every shell exec.
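
The default-deny pairing idea that recurs in the list above can be sketched in C#. This is a minimal illustration under stated assumptions, not any of the three projects' implementations: `PairingGate`, its method names, and the 8-character hex code are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical default-deny gate: unknown senders are rejected until an
// operator approves the one-time code that was issued for them.
public sealed class PairingGate
{
    private readonly HashSet<string> _approvedSenders = new();
    private readonly Dictionary<string, string> _pendingCodes = new(); // code -> senderId

    // Default deny: only previously approved senders pass.
    public bool IsAllowed(string senderId) => _approvedSenders.Contains(senderId);

    // Issues a random one-time code (8 hex characters) for an unknown sender.
    public string RequestPairing(string senderId)
    {
        var code = Convert.ToHexString(RandomNumberGenerator.GetBytes(4));
        _pendingCodes[code] = senderId;
        return code;
    }

    // Consumes the code; a code can be redeemed only once.
    public bool Approve(string code)
    {
        if (!_pendingCodes.Remove(code, out var senderId))
        {
            return false;
        }

        _approvedSenders.Add(senderId);
        return true;
    }
}
```

A channel would call `IsAllowed` before forwarding a message into the pipeline, which keeps the check at the gateway/channel layer rather than inside the agent.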

### CLI/TUI as a Channel

In all three, the CLI/TUI is just another channel: it implements the same
interface as Slack, Discord, etc., and feeds messages into the same pipeline.

- OpenClaw: `openclaw tui` connects to the Gateway WebSocket
- IronClaw: `CliChannel` implements the `Channel` trait, feeds `ChannelMessage`
into same `mpsc::Sender` as all channels
- ZeroClaw: `CliChannel` feeds into shared `mpsc::Sender<ChannelMessage>`

### Tool Execution: Layered Security

All three use layered tool security:

1. **Policy/allowlist filtering** — which tools the agent can see
2. **Runtime approval gates** — user confirms before dangerous operations
3. **Sandbox isolation** — Docker/WASM for shell execution
4. **Output sanitization** — secret scrubbing before feeding back to LLM
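
The four layers above can be sketched as a single execution path. This is an illustrative C# sketch only: `ToolCall`, `LayeredToolRunner`, the delegate shapes, and the `sk-` secret pattern are assumptions made for the example, and the "sandbox" is just a caller-supplied delegate standing in for Docker/WASM isolation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical tool invocation shape.
public sealed record ToolCall(string Tool, string Argument);

public sealed class LayeredToolRunner
{
    // Hypothetical secret shape: "sk-" followed by 8 or more alphanumerics.
    private static readonly Regex SecretPattern = new(@"sk-[A-Za-z0-9]{8,}");

    private readonly HashSet<string> _allowedTools;
    private readonly Func<ToolCall, bool> _approve;
    private readonly Func<ToolCall, string> _runSandboxed;

    public LayeredToolRunner(
        IEnumerable<string> allowedTools,
        Func<ToolCall, bool> approve,
        Func<ToolCall, string> runSandboxed)
    {
        _allowedTools = new HashSet<string>(allowedTools, StringComparer.Ordinal);
        _approve = approve;
        _runSandboxed = runSandboxed;
    }

    public string Execute(ToolCall call)
    {
        // 1. Policy/allowlist filtering
        if (!_allowedTools.Contains(call.Tool))
        {
            return "denied: tool not allowed by policy";
        }

        // 2. Runtime approval gate
        if (!_approve(call))
        {
            return "denied: call was not approved";
        }

        // 3. Sandbox isolation (delegated to the supplied runner)
        var rawOutput = _runSandboxed(call);

        // 4. Output sanitization before the result goes back to the LLM
        return SecretPattern.Replace(rawOutput, "[REDACTED]");
    }
}
```

Each layer short-circuits, so a call rejected by policy never reaches the approval prompt or the sandbox.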

### Channel Trait Patterns

**IronClaw** (`Channel` trait in Rust):
```
- name() -> &str
- send(message) -> Result
- listen(tx: mpsc::Sender<ChannelMessage>) -> Result
- health_check() -> bool
- start_typing(recipient) / stop_typing(recipient)
- supports_draft_updates() -> bool
- send_draft() / update_draft() / finalize_draft()
- add_reaction()
```

**ZeroClaw** (`Channel` trait in Rust):
```
- Same push-based pattern: listen() receives an mpsc::Sender
- All channels feed ChannelMessage structs into a shared sender
- daemon supervises all configured channels as concurrent Tokio tasks
```

**OpenClaw** (Node.js channel plugins):
```
- Each channel is a module that normalizes to common message format
- Session identity: {channel}:{kind}:{id}
- Multi-agent routing: different channels can route to different workspaces
```
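
For comparison, the push-based pattern shared by the two Rust traits could look roughly like this in C#, with `System.Threading.Channels` standing in for `mpsc::Sender`. This is a sketch under assumptions, not Netclaw's actual contract: `IChannelAdapter`, `InboundMessage`, and `ConsoleAdapter` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical normalized message; all channels produce the same shape.
public sealed record InboundMessage(string ChannelName, string SenderId, string Text);

// Hypothetical analogue of the push-based Channel traits: the adapter is
// handed a shared writer and pushes every inbound message into it.
public interface IChannelAdapter
{
    string Name { get; }

    Task ListenAsync(ChannelWriter<InboundMessage> sink, CancellationToken cancellationToken);

    Task SendAsync(string recipientId, string text, CancellationToken cancellationToken);
}

// A console adapter implements the same contract a Slack adapter would.
public sealed class ConsoleAdapter : IChannelAdapter
{
    private readonly TextReader _reader;
    private readonly TextWriter _writer;

    public ConsoleAdapter(TextReader reader, TextWriter writer)
    {
        _reader = reader;
        _writer = writer;
    }

    public string Name => "console";

    public async Task ListenAsync(ChannelWriter<InboundMessage> sink, CancellationToken cancellationToken)
    {
        string? line;
        while ((line = await _reader.ReadLineAsync()) is not null)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await sink.WriteAsync(new InboundMessage(Name, "local-user", line), cancellationToken);
        }
    }

    public Task SendAsync(string recipientId, string text, CancellationToken cancellationToken)
        => _writer.WriteLineAsync(text);
}
```

A host would create one `Channel<InboundMessage>`, pass its `Writer` to every adapter's `ListenAsync`, and let a single agent loop drain the `Reader`.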

### Message Identity Patterns

| Project | Session Key Format |
|---------|-------------------|
| OpenClaw | `{channel}:{kind}:{id}` (e.g., `slack:dm:U12345`) |
| IronClaw | `thread_ts: Option<String>` on IncomingMessage |
| ZeroClaw | `thread_ts: Option<String>` on ChannelMessage |
| Netclaw (current) | `{channelId}/{threadTs}` |
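
As an illustration of the OpenClaw-style key in the table, here is a small C# sketch that formats and parses `{channel}:{kind}:{id}`. The `SessionKey` type is hypothetical and is not taken from OpenClaw or Netclaw; it only shows why a structured key is easy to route on.

```csharp
using System;

// Hypothetical value type for a "{channel}:{kind}:{id}" session key.
public readonly record struct SessionKey(string Channel, string Kind, string Id)
{
    public override string ToString() => $"{Channel}:{Kind}:{Id}";

    public static bool TryParse(string value, out SessionKey key)
    {
        // Split into at most 3 parts so an id containing ':' stays intact.
        var parts = value.Split(':', 3);
        if (parts.Length != 3 || Array.Exists(parts, string.IsNullOrEmpty))
        {
            key = default;
            return false;
        }

        key = new SessionKey(parts[0], parts[1], parts[2]);
        return true;
    }
}
```

Parsing `slack:dm:U12345` yields channel `slack`, kind `dm`, and id `U12345`, and `ToString()` round-trips to the same string.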

### Multi-Process Extensions (Not Default)

While all default to single-process, each has escape hatches:

- **OpenClaw**: ACP bridge (stdin/stdout NDJSON) for external agent processes;
Nodes (macOS/iOS/Android companion apps) connect as WebSocket clients
- **IronClaw**: Docker containers for sandboxed tool execution; worker mode
(`ironclaw worker`) for container instances that connect back to orchestrator
via HTTP; Claude Code bridge mode for delegated coding
- **ZeroClaw**: Docker runtime for tool execution isolation; WASM runtime module
for serverless/edge deployment

## Key Decisions for Netclaw

1. **Stay single-process** — all three validate this for homelab/personal use
2. **TUI goes through the same pipeline as Slack** — just another channel
3. **Security boundary at the channel layer** — pairing/auth before messages
enter, tool policy inside
4. **Channel abstraction is the key interface** — Slack, TUI, HTTP, timers all
implement the same contract
5. **Layered tool security** — policy filtering → approval gates → sandbox
6. **Session identity from the channel** — channel provides the entity key
32 changes: 32 additions & 0 deletions src/Netclaw.Actors/Channels/ChannelInput.cs
@@ -0,0 +1,32 @@
using Microsoft.Extensions.AI;

namespace Netclaw.Actors.Channels;

/// <summary>
/// Strongly-typed inbound message for the session stream API.
/// Supports multi-modal content via <see cref="AIContent"/> from
/// Microsoft.Extensions.AI.Abstractions.
/// </summary>
public sealed record ChannelInput
{
    /// <summary>
    /// Identity of the user who sent this message.
    /// </summary>
    public required string SenderId { get; init; }

    /// <summary>
    /// Optional message ID for correlation and deduplication.
    /// </summary>
    public string? MessageId { get; init; }

    /// <summary>
    /// Message content. Supports text (<see cref="TextContent"/>),
    /// images, and other modalities via the <see cref="AIContent"/> hierarchy.
    /// </summary>
    public required IReadOnlyList<AIContent> Contents { get; init; }

    /// <summary>
    /// When the message was received by the channel.
    /// </summary>
    public DateTimeOffset ReceivedAt { get; init; }
}
167 changes: 167 additions & 0 deletions src/Netclaw.Actors/Channels/ChannelPipeline.cs
@@ -0,0 +1,167 @@
using Akka;
using Akka.Actor;
using Akka.Hosting;
using Akka.Streams;
using Akka.Streams.Dsl;
using Microsoft.Extensions.AI;
using Netclaw.Actors.Hosting;
using Netclaw.Actors.Protocol;

namespace Netclaw.Actors.Channels;

/// <summary>
/// Options for creating a session pipeline.
/// </summary>
public sealed record SessionPipelineOptions
{
    /// <summary>
    /// Channel type identifier (e.g. "console", "headless", "slack").
    /// Used to populate <see cref="MessageSource.ChannelType"/> on inbound messages.
    /// </summary>
    public required string ChannelType { get; init; }

    /// <summary>
    /// Which output categories the channel wants to receive.
    /// </summary>
    public OutputFilter Filter { get; init; } = OutputFilter.Full;
}

/// <summary>
/// Handle to a materialized session. Exposes typed Akka.Streams for
/// bidirectional communication with an LLM session actor — all actor
/// internals (JoinSession, subscriber refs, message routing) are hidden.
/// </summary>
public sealed class MaterializedSession : IAsyncDisposable
{
    private readonly SharedKillSwitch _killSwitch;

    internal MaterializedSession(
        Sink<ChannelInput, NotUsed> input,
        Source<SessionOutput, NotUsed> output,
        SharedKillSwitch killSwitch)
    {
        Input = input;
        Output = output;
        _killSwitch = killSwitch;
    }

    /// <summary>
    /// Input sink. Encapsulates <see cref="ChannelInput"/> →
    /// <see cref="SendUserMessage"/> transformation and delivery to the
    /// session manager. Channel connects its own Source:
    /// <code>
    /// Source.Queue&lt;ChannelInput&gt;(16, OverflowStrategy.Backpressure)
    ///     .ToMat(session.Input, Keep.Left)
    ///     .Run(system);
    /// </code>
    /// </summary>
    public Sink<ChannelInput, NotUsed> Input { get; }

    /// <summary>
    /// Output stream backed by a pre-materialized subscriber actor.
    /// Channel connects its own Sink:
    /// <code>
    /// session.Output
    ///     .To(Sink.ForEach&lt;SessionOutput&gt;(Render))
    ///     .Run(system);
    /// </code>
    /// </summary>
    public Source<SessionOutput, NotUsed> Output { get; }

    /// <summary>
    /// Gracefully shuts down both inbound and outbound streams
    /// via the shared kill switch.
    /// </summary>
    public ValueTask DisposeAsync()
    {
        _killSwitch.Shutdown();
        return ValueTask.CompletedTask;
    }
}

/// <summary>
/// Factory for creating per-session Akka.Streams pipelines. Injected via DI.
/// Channels call <see cref="CreateAsync"/> to get a <see cref="MaterializedSession"/>
/// without touching actor system internals.
///
/// <para>
/// Internally wires a subscriber actor (via <c>Source.PreMaterialize</c>)
/// and a command sink (via <c>Sink.ActorRef</c>) to the session manager,
/// with a shared <see cref="SharedKillSwitch"/> for coordinated teardown.
/// </para>
/// </summary>
public sealed class SessionPipeline
{
    private readonly ActorSystem _system;
    private readonly IRequiredActor<SessionManagerActorKey> _sessionManagerProvider;

    public SessionPipeline(
        ActorSystem system,
        IRequiredActor<SessionManagerActorKey> sessionManagerProvider)
    {
        _system = system;
        _sessionManagerProvider = sessionManagerProvider;
    }

    /// <summary>
    /// Creates a materialized session with typed input/output streams.
    /// </summary>
    /// <param name="sessionId">Session identity (channel owns the naming scheme).</param>
    /// <param name="options">Pipeline configuration (channel type, output filter).</param>
    /// <param name="cancellationToken">Cancellation token for session manager resolution.</param>
    /// <returns>A session handle with <see cref="MaterializedSession.Input"/> and
    /// <see cref="MaterializedSession.Output"/> streams.</returns>
    public async Task<MaterializedSession> CreateAsync(
        SessionId sessionId,
        SessionPipelineOptions options,
        CancellationToken cancellationToken = default)
    {
        var sessionManager = await _sessionManagerProvider.GetAsync(cancellationToken);
        var killSwitch = KillSwitches.Shared($"session-{sessionId.Value}");

        // Pre-materialize subscriber to capture IActorRef before building streams
        var (subscriber, responseSource) = Source.ActorRef<SessionOutput>(256, OverflowStrategy.DropHead)
            .PreMaterialize(_system);

        // Inbound: ChannelInput → SendUserMessage → session manager
        var inputSink = Flow.Create<ChannelInput>()
            .Select(input => MapToCommand(input, sessionId, options))
            .Via(killSwitch.Flow<SendUserMessage>())
            .To(Sink.ActorRef<SendUserMessage>(sessionManager, Done.Instance,
                ex => new Status.Failure(ex)));

        // Outbound: pre-materialized subscriber → kill switch → exposed Source
        var outputSource = responseSource
            .Via(killSwitch.Flow<SessionOutput>());

        // Join the session — subscriber starts receiving output
        sessionManager.Tell(new JoinSession
        {
            SessionId = sessionId,
            Subscriber = subscriber,
            Filter = options.Filter
        });

        return new MaterializedSession(inputSink, outputSource, killSwitch);
    }

    private static SendUserMessage MapToCommand(
        ChannelInput input, SessionId sessionId, SessionPipelineOptions options)
    {
        // Only TextContent parts are forwarded for now, joined with newlines.
        // Non-text AIContent (images, etc.) is dropped; multi-modal support is
        // a future enhancement. input.MessageId is not forwarded here.
        var textParts = input.Contents.OfType<TextContent>().Select(t => t.Text);
        var content = string.Join("\n", textParts);

        return new SendUserMessage
        {
            SessionId = sessionId,
            Content = content,
            Source = new MessageSource
            {
                ChannelType = options.ChannelType,
                SenderId = input.SenderId,
                ReceivedAt = input.ReceivedAt
            }
        };
    }
}
28 changes: 28 additions & 0 deletions src/Netclaw.Actors/Channels/MessageSource.cs
@@ -0,0 +1,28 @@
namespace Netclaw.Actors.Channels;

/// <summary>
/// Ephemeral metadata describing where a user message originated.
/// Used for ACL checks and audit logging — NOT persisted with the session.
/// </summary>
public sealed record MessageSource
{
    /// <summary>
    /// Channel type identifier (e.g. "console", "headless", "slack").
    /// </summary>
    public required string ChannelType { get; init; }

    /// <summary>
    /// Identity of the sender within the channel (e.g. Slack user ID, "local-user").
    /// </summary>
    public required string SenderId { get; init; }

    /// <summary>
    /// Optional channel-specific identifier (e.g. Slack channel ID).
    /// </summary>
    public string? ChannelId { get; init; }

    /// <summary>
    /// When the message was received by the channel.
    /// </summary>
    public DateTimeOffset ReceivedAt { get; init; }
}
4 changes: 1 addition & 3 deletions src/Netclaw.Actors/Netclaw.Actors.csproj
@@ -20,9 +20,7 @@

<ItemGroup>
<ProjectReference Include="..\Netclaw.Tools.Abstractions\Netclaw.Tools.Abstractions.csproj" />
-<ProjectReference Include="..\Netclaw.Tools.Generators\Netclaw.Tools.Generators.csproj"
-OutputItemType="Analyzer"
-ReferenceOutputAssembly="false" />
+<ProjectReference Include="..\Netclaw.Tools.Generators\Netclaw.Tools.Generators.csproj" OutputItemType="Analyzer" ReferenceOutputAssembly="false" />
</ItemGroup>

</Project>