Changes from 2 commits
55 commits
1cdd211
feat: Add distributed workers mode with artifact sharing
thomhurst Feb 22, 2026
7e7b51c
fix: Add missing Artifacts files excluded by .gitignore
thomhurst Feb 22, 2026
1a8c9dd
feat: Move solution builds into BuildSolutionsModule with artifact sh…
thomhurst Feb 22, 2026
fa383f4
fix: Address code review feedback on distributed workers
thomhurst Feb 22, 2026
52bed84
fix: Remove analyzers build step and make BuildSolutionsModule deps o…
thomhurst Feb 22, 2026
6d9f0f5
fix: Address round 3 code review feedback
thomhurst Feb 22, 2026
4d942de
fix: Activate distributed mode by fixing plugin service ordering
thomhurst Feb 22, 2026
d5bc677
refactor: Merge distributed execution into core ModularPipelines package
thomhurst Feb 22, 2026
0e22b7d
fix: Update Redis dequeue tests to match scan-based implementation
thomhurst Feb 22, 2026
dfcb11c
feat: Replace polling with pub/sub, strip heartbeats, add completion …
thomhurst Feb 22, 2026
bb43f55
fix: Remove ModularPipelines.Distributed from build project list
thomhurst Feb 22, 2026
6c9c878
fix: Address round 7 code review feedback
thomhurst Feb 22, 2026
8f17df6
fix: Revert Upstash Redis port to 6379 (TLS via ssl=True flag)
thomhurst Feb 22, 2026
6ec3ef3
fix: Address round 9 code review items #2, #3, #10, #11, #12
thomhurst Feb 22, 2026
7774b4a
fix: Activate distributed mode by using standard Options pattern
thomhurst Feb 23, 2026
fada714
fix: Disable payload signing for R2/S3-compatible artifact uploads
thomhurst Feb 23, 2026
b8d35d1
fix: Remove S3 object tags (R2 unsupported) and fail fast on artifact…
thomhurst Feb 23, 2026
576ea4a
fix: Signal workers to stop when master crashes
thomhurst Feb 23, 2026
973639a
fix: Cascade worker failures to master for fast pipeline termination
thomhurst Feb 23, 2026
eabd21f
fix: Prevent artifact race condition and harden worker failure handling
thomhurst Feb 23, 2026
02ce7c8
fix: Register distributed module results so pipeline status reflects …
thomhurst Feb 23, 2026
3567ec4
fix: Add worker readiness barrier and fix flaky hook tests
thomhurst Feb 23, 2026
351dda0
fix: Require linux capability for RunUnitTestsModule
thomhurst Feb 23, 2026
6042c35
fix: Allow empty BranchName in detached HEAD (CI merge ref checkout)
thomhurst Feb 23, 2026
945d176
fix: Pin modules that use GetModule<T>() to master to prevent cross-p…
thomhurst Feb 23, 2026
799d23c
Revert "fix: Pin modules that use GetModule<T>() to master to prevent…
thomhurst Feb 23, 2026
1915ec9
feat: Add dependency result propagation and SignalR coordinator for d…
thomhurst Feb 23, 2026
0345f4c
fix: Cap dependency result size to prevent Redis payload overflow
thomhurst Feb 23, 2026
642c959
fix: Use GZip compression instead of stripping for large dependency r…
thomhurst Feb 23, 2026
7c87ce1
fix: Clear distributed env vars from test subprocesses to prevent hang
thomhurst Feb 23, 2026
017e5e9
fix: Add JSON converters for File and Folder to enable safe cross-pro…
thomhurst Feb 23, 2026
8693b28
perf: Split RunUnitTestsModule into per-project modules for distribut…
thomhurst Feb 23, 2026
f552ec7
fix: Pin FindProjects modules to master to prevent cross-instance pat…
thomhurst Feb 23, 2026
7a89d8d
Revert "fix: Pin FindProjects modules to master to prevent cross-inst…
thomhurst Feb 23, 2026
edfa797
fix: Portable path serialization for cross-platform distributed mode
thomhurst Feb 23, 2026
6efd795
refactor: Remove PinToMaster — master participates as worker via work…
thomhurst Feb 23, 2026
c6b9285
fix: Address 7 code review items for distributed mode robustness
thomhurst Feb 23, 2026
bed6e88
fix: Use SignalR for coordination, Redis only for discovery
thomhurst Feb 23, 2026
a050bca
fix: Defer coordinator/artifact factory resolution to first use
thomhurst Feb 24, 2026
c783271
fix: Wire real SignalR hub context, consistent role detection, centra…
thomhurst Feb 24, 2026
fc5e25f
feat: Add cloudflared tunnel for cross-VM SignalR connectivity
thomhurst Feb 24, 2026
73fc231
fix: Retry worker connection for tunnel DNS propagation
thomhurst Feb 24, 2026
3025ff7
fix: Create fresh HubConnectionBuilder on each retry attempt
thomhurst Feb 24, 2026
b20b10c
fix: Match JSON serialization options between SignalR server and client
thomhurst Feb 24, 2026
82d797f
fix: Use CreateBuilder instead of CreateSlimBuilder for SignalR JSON …
thomhurst Feb 24, 2026
8fa3abe
fix: Scope Redis discovery by GITHUB_RUN_ID and match client JSON pro…
thomhurst Feb 24, 2026
5796572
fix: Use HashSet<string> instead of IReadOnlySet<string> in records f…
thomhurst Feb 24, 2026
2ed044e
test: Add SignalR integration tests for full serialization round-trip
thomhurst Feb 24, 2026
d188c36
test: Add multi-worker routing and end-to-end integration tests
thomhurst Feb 24, 2026
881f81b
fix: Increase default connection timeout to 120s for DNS propagation
thomhurst Feb 24, 2026
db00be1
fix: Update default timeout test assertion and use null-object patter…
thomhurst Feb 24, 2026
541ffb2
fix: Validate artifact size on chunked download to detect expired chunks
thomhurst Feb 24, 2026
976e647
fix: Address code review feedback — MatrixTarget, busy-poll, tunnel r…
thomhurst Feb 24, 2026
c7eef5f
fix: Revert cloudflared regex to trycloudflare.com — broad regex matc…
thomhurst Feb 24, 2026
bc84860
refactor: Address code review — volatile fields, O(1) lookup, dedupli…
thomhurst Feb 24, 2026
31 changes: 23 additions & 8 deletions .github/workflows/dotnet.yml
@@ -6,12 +6,12 @@ on:
workflow_dispatch:
inputs:
publish-packages:
description: Publish packages?
description: Publish packages?
type: boolean
required: true
default: false
is-alpha:
description: Alpha version?
description: Alpha version?
type: boolean
required: true
default: true
@@ -22,21 +22,29 @@ env:
jobs:
pipeline:
environment: ${{ github.ref == 'refs/heads/main' && 'Production' || 'Pull Requests' }}
strategy:
strategy:
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
include:
- os: ubuntu-latest
instance: 0
- os: windows-latest
instance: 1
- os: macos-latest
instance: 2
fail-fast: false
runs-on: ${{ matrix.os }}
env:
env:
NUGET_PACKAGES: ${{ matrix.os == 'windows-latest' && 'E:\nuget' || null }}

steps:
- name: Add mask
run: |
echo "::add-mask::${{ secrets.DOTNET_FORMAT_PUSH_TOKEN }}"
echo "::add-mask::${{ secrets.DOTNET_FORMAT_PUSH_TOKEN }}"
echo "::add-mask::${{ secrets.NuGet__ApiKey }}"
echo "::add-mask::${{ secrets.ADMIN_TOKEN }}"
echo "::add-mask::${{ secrets.CODACY_APIKEY }}"
echo "::add-mask::${{ secrets.R2_ACCESS_KEY }}"
echo "::add-mask::${{ secrets.R2_SECRET_KEY }}"
- uses: actions/checkout@v6
with:
fetch-depth: 0
@@ -66,7 +74,7 @@ jobs:
do
dotnet build $SOLUTION -c Release
done

- name: Run Pipeline
run: dotnet run -c Release --framework net10.0
working-directory: "src/ModularPipelines.Build"
@@ -83,6 +91,13 @@ jobs:
Codacy__ApiKey: ${{ secrets.CODACY_APIKEY }}
CodeCov__Token: ${{ secrets.CODECOV_TOKEN }}
EMAIL_PASSWORD: ${{ secrets.EMAIL_PASSWORD }}
INSTANCE_INDEX: ${{ matrix.instance }}
TOTAL_INSTANCES: 3
UPSTASH_REDIS_REST_URL: ${{ secrets.UPSTASH_REDIS_REST_URL }}
UPSTASH_REDIS_REST_TOKEN: ${{ secrets.UPSTASH_REDIS_REST_TOKEN }}
R2_ENDPOINT_URL: ${{ secrets.R2_ENDPOINT_URL }}
R2_ACCESS_KEY: ${{ secrets.R2_ACCESS_KEY }}
R2_SECRET_KEY: ${{ secrets.R2_SECRET_KEY }}

- name: Upload Hang Dumps
if: always()
2 changes: 2 additions & 0 deletions .gitignore
@@ -61,6 +61,8 @@ BenchmarkDotNet.Artifacts/
project.lock.json
project.fragment.lock.json
artifacts/
!src/**/Artifacts/
!test/**/Artifacts/

# ASP.NET Scaffolding
ScaffoldingReadMe.txt
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -119,4 +119,4 @@ The build pipeline (`src/ModularPipelines.Build/`) demonstrates best practices:
- Tests run with code coverage collection enabled
- Coverage reports uploaded to Codacy and Codecov
- Test projects identified by "*UnitTests.csproj" pattern
- remember the correct filter syntax
- remember the correct filter syntax
1 change: 1 addition & 0 deletions Directory.Packages.props
@@ -77,6 +77,7 @@
<PackageVersion Include="Slack.Webhooks" Version="1.1.5" />
<PackageVersion Include="Sourcy.DotNet" Version="1.1.1" />
<PackageVersion Include="Spectre.Console" Version="0.54.0" />
<PackageVersion Include="StackExchange.Redis" Version="2.11.3" />
<PackageVersion Include="StyleCop.Analyzers" Version="1.2.0-beta.556" />
<PackageVersion Include="System.Text.Json" Version="9.0.6" />
<PackageVersion Include="TestableIO.System.IO.Abstractions" Version="22.1.0" />
90 changes: 90 additions & 0 deletions ModularPipelines.sln
@@ -125,6 +125,18 @@ Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Syft", "sr
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Grype", "src\ModularPipelines.Grype\ModularPipelines.Grype.csproj", "{60E4E82D-7BBF-4513-80ED-36A2273BB97D}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Distributed", "src\ModularPipelines.Distributed\ModularPipelines.Distributed.csproj", "{FEE8EBAE-189B-4988-9C0D-12483838673A}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Distributed.UnitTests", "test\ModularPipelines.Distributed.UnitTests\ModularPipelines.Distributed.UnitTests.csproj", "{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Distributed.Redis", "src\ModularPipelines.Distributed.Redis\ModularPipelines.Distributed.Redis.csproj", "{C6028374-2C99-4770-A8DE-58DEAD1DE854}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Distributed.Redis.UnitTests", "test\ModularPipelines.Distributed.Redis.UnitTests\ModularPipelines.Distributed.Redis.UnitTests.csproj", "{97523284-1AB2-4BB1-84A2-818E103C6DE7}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Distributed.Artifacts.S3", "src\ModularPipelines.Distributed.Artifacts.S3\ModularPipelines.Distributed.Artifacts.S3.csproj", "{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ModularPipelines.Distributed.Artifacts.S3.UnitTests", "test\ModularPipelines.Distributed.Artifacts.S3.UnitTests\ModularPipelines.Distributed.Artifacts.S3.UnitTests.csproj", "{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
@@ -807,6 +819,78 @@ Global
{60E4E82D-7BBF-4513-80ED-36A2273BB97D}.Release|x64.Build.0 = Release|Any CPU
{60E4E82D-7BBF-4513-80ED-36A2273BB97D}.Release|x86.ActiveCfg = Release|Any CPU
{60E4E82D-7BBF-4513-80ED-36A2273BB97D}.Release|x86.Build.0 = Release|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Debug|Any CPU.Build.0 = Debug|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Debug|x64.ActiveCfg = Debug|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Debug|x64.Build.0 = Debug|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Debug|x86.ActiveCfg = Debug|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Debug|x86.Build.0 = Debug|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Release|Any CPU.ActiveCfg = Release|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Release|Any CPU.Build.0 = Release|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Release|x64.ActiveCfg = Release|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Release|x64.Build.0 = Release|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Release|x86.ActiveCfg = Release|Any CPU
{FEE8EBAE-189B-4988-9C0D-12483838673A}.Release|x86.Build.0 = Release|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Debug|Any CPU.Build.0 = Debug|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Debug|x64.ActiveCfg = Debug|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Debug|x64.Build.0 = Debug|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Debug|x86.ActiveCfg = Debug|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Debug|x86.Build.0 = Debug|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Release|Any CPU.ActiveCfg = Release|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Release|Any CPU.Build.0 = Release|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Release|x64.ActiveCfg = Release|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Release|x64.Build.0 = Release|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Release|x86.ActiveCfg = Release|Any CPU
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749}.Release|x86.Build.0 = Release|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Debug|Any CPU.Build.0 = Debug|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Debug|x64.ActiveCfg = Debug|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Debug|x64.Build.0 = Debug|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Debug|x86.ActiveCfg = Debug|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Debug|x86.Build.0 = Debug|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Release|Any CPU.ActiveCfg = Release|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Release|Any CPU.Build.0 = Release|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Release|x64.ActiveCfg = Release|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Release|x64.Build.0 = Release|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Release|x86.ActiveCfg = Release|Any CPU
{C6028374-2C99-4770-A8DE-58DEAD1DE854}.Release|x86.Build.0 = Release|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Debug|Any CPU.Build.0 = Debug|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Debug|x64.ActiveCfg = Debug|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Debug|x64.Build.0 = Debug|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Debug|x86.ActiveCfg = Debug|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Debug|x86.Build.0 = Debug|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Release|Any CPU.ActiveCfg = Release|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Release|Any CPU.Build.0 = Release|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Release|x64.ActiveCfg = Release|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Release|x64.Build.0 = Release|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Release|x86.ActiveCfg = Release|Any CPU
{97523284-1AB2-4BB1-84A2-818E103C6DE7}.Release|x86.Build.0 = Release|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Debug|Any CPU.Build.0 = Debug|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Debug|x64.ActiveCfg = Debug|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Debug|x64.Build.0 = Debug|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Debug|x86.ActiveCfg = Debug|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Debug|x86.Build.0 = Debug|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Release|Any CPU.ActiveCfg = Release|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Release|Any CPU.Build.0 = Release|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Release|x64.ActiveCfg = Release|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Release|x64.Build.0 = Release|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Release|x86.ActiveCfg = Release|Any CPU
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9}.Release|x86.Build.0 = Release|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Debug|Any CPU.Build.0 = Debug|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Debug|x64.ActiveCfg = Debug|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Debug|x64.Build.0 = Debug|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Debug|x86.ActiveCfg = Debug|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Debug|x86.Build.0 = Debug|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Release|Any CPU.ActiveCfg = Release|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Release|Any CPU.Build.0 = Release|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Release|x64.ActiveCfg = Release|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Release|x64.Build.0 = Release|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Release|x86.ActiveCfg = Release|Any CPU
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62}.Release|x86.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
@@ -868,6 +952,12 @@ Global
{0FB125FE-5AB3-4667-8D1B-85A6284474ED} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{2E70AA19-0309-4C6F-83D2-8E3DD2A7EC89} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{60E4E82D-7BBF-4513-80ED-36A2273BB97D} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{FEE8EBAE-189B-4988-9C0D-12483838673A} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{A5D634E1-4AE9-4EA6-AD4B-E7FE81F52749} = {F213898F-1E32-48F1-AB8C-83D2BD01A93B}
{C6028374-2C99-4770-A8DE-58DEAD1DE854} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{97523284-1AB2-4BB1-84A2-818E103C6DE7} = {F213898F-1E32-48F1-AB8C-83D2BD01A93B}
{24F721D8-72A6-480B-AE86-CBBF6D5E7CA9} = {827E0CD3-B72D-47B6-A68D-7590B98EB39B}
{4A8FA12D-23AE-4CA8-A79F-7EC963958A62} = {F213898F-1E32-48F1-AB8C-83D2BD01A93B}
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {A5905A5D-B4E1-4A7A-9279-0283D86A9F7F}
184 changes: 184 additions & 0 deletions docs/distributed-runners-proposal.md
@@ -0,0 +1,184 @@
# Distributed GitHub Actions Runners — Proposal

## Problem

Run a C# application across multiple GitHub Actions runners concurrently using a matrix strategy, where one instance becomes the master and the others become workers. The master delegates work and orchestrates everything, parallelising tasks and speeding up execution.

## Core Challenge

GitHub-hosted runners are isolated, ephemeral VMs with no shared network. There is no built-in discovery mechanism or runner-to-runner communication.

## Recommended Approach: Upstash Redis as Coordination Layer

Use a free Upstash Redis instance as a message broker between runners. This removes the need for direct network connectivity, tunnels, or VPNs.

### Why Upstash Redis

- **Free tier**: 10,000 commands/day, 256MB storage — more than enough for CI coordination
- **Serverless**: no VM to manage, always on, no idle cost
- **REST API**: works without a Redis client library (just HTTP calls)
- **Standard Redis protocol**: also works with `StackExchange.Redis` if preferred
- Setup: sign up at upstash.com, create a database, store connection string as GitHub secret

### Architecture

```
┌────────────────────────────────────────────────┐
│             GitHub Actions Matrix              │
│                                                │
│  Runner 0 (master)                             │
│  ├── Pushes work to Redis queue                │
│  ├── Reads results from Redis                  │
│  └── Publishes completion signal               │
│                                                │
│  Runner 1..N (workers)                         │
│  ├── Pop work from Redis queue                 │
│  ├── Execute work                              │
│  └── Push results back to Redis                │
│                                                │
│                  ▼         ▲                   │
│            ┌─────────────────────┐             │
│            │    Upstash Redis    │             │
│            │     (free tier)     │             │
│            │                     │             │
│            │  work:queue         │             │
│            │  results:queue      │             │
│            │  master:status      │             │
│            └─────────────────────┘             │
└────────────────────────────────────────────────┘
```

### Benefits Over Tunnel-Based Approaches

- No tunnel setup or URL sharing needed
- Redis handles all coordination
- Workers and master don't need direct connectivity
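- If a worker dies before it pops an item, the item stays in the queue and another worker can pick it up. An item that was already popped with `LPOP`/`RPOP` is gone, so surviving a mid-task crash requires a reliable-queue pattern (for example `LMOVE` into a per-worker processing list that the master can re-queue)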
- Built-in pub/sub if real-time signaling is needed
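
Whether a work item survives a worker crash depends on how it is dequeued. The following is a minimal sketch of a reliable-queue pattern, not part of the proposal itself. It assumes a StackExchange.Redis version that exposes `ListMoveAsync` and a server that supports `LMOVE` (Redis 6.2+). The `work:processing:{workerId}` key and the `ReliableQueue` class are hypothetical names chosen for illustration.

```csharp
using System.Threading.Tasks;
using StackExchange.Redis;

// Sketch only: every key except "work:queue" is a hypothetical name.
public static class ReliableQueue
{
    // Atomically moves the next item from the shared queue into this worker's
    // in-flight list (Redis LMOVE), so a crash after dequeue does not lose it.
    // Returns RedisValue.Null when the queue is empty.
    public static Task<RedisValue> ClaimAsync(IDatabase db, string workerId) =>
        db.ListMoveAsync("work:queue", $"work:processing:{workerId}",
                         ListSide.Left, ListSide.Right);

    // Called once the result has been pushed: drops the item from the in-flight list.
    public static Task AcknowledgeAsync(IDatabase db, string workerId, RedisValue item) =>
        db.ListRemoveAsync($"work:processing:{workerId}", item, count: 1);

    // Called by the master when a worker is presumed dead:
    // moves that worker's in-flight items back to the tail of the shared queue.
    public static async Task<int> RequeueAsync(IDatabase db, string workerId)
    {
        var requeued = 0;
        while (!(await db.ListMoveAsync($"work:processing:{workerId}", "work:queue",
                                        ListSide.Left, ListSide.Right)).IsNull)
        {
            requeued++;
        }
        return requeued;
    }
}
```

Because a re-queued item may already have been partly processed, this pattern only works safely when work items are idempotent.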

### GitHub Actions Workflow

```yaml
jobs:
  pipeline:
    strategy:
      matrix:
        instance: [0, 1, 2, 3]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '10.0.x'

      - name: Run
        env:
          UPSTASH_REDIS_URL: ${{ secrets.UPSTASH_REDIS_URL }}
          UPSTASH_REDIS_TOKEN: ${{ secrets.UPSTASH_REDIS_TOKEN }}
        run: |
          dotnet run --project src/YourApp -- \
            --instance=${{ matrix.instance }} \
            --total=4
```

### C# Implementation — Role Selection

```csharp
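// Parse "--name=value" arguments as passed by the workflow above.
int GetIntArg(string name) =>
    int.Parse(args.Single(a => a.StartsWith(name + "=")).Split('=', 2)[1]);
var instance = GetIntArg("--instance");
var totalWorkers = GetIntArg("--total") - 1; // every instance except the master

if (instance == 0)
    await RunAsMaster(totalWorkers);
else
    await RunAsWorker();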
```

### C# Implementation — Using StackExchange.Redis

```csharp
using StackExchange.Redis;

// Using StackExchange.Redis (works with Upstash)
var redis = await ConnectionMultiplexer.ConnectAsync(connectionString);
var db = redis.GetDatabase();

// Master registers itself
await db.StringSetAsync("master:address", "ready");

// Master publishes work via a list (RPUSH appends to the tail)
await db.ListRightPushAsync("work:queue", serializedWorkItem);

// Workers pop work from the head, which gives FIFO order.
// The result is RedisValue.Null (work.IsNull) when the queue is empty.
var work = await db.ListLeftPopAsync("work:queue");

// Workers push results back
await db.ListRightPushAsync("results:queue", serializedResult);
```

### C# Implementation — Using Upstash REST API (No Redis Client Needed)

```csharp
using System.Net.Http.Headers;

var http = new HttpClient();
http.DefaultRequestHeaders.Authorization =
    new AuthenticationHeaderValue("Bearer", upstashToken);

// SET
await http.PostAsync($"{upstashUrl}/set/master:address/ready", null);

// GET (the body is a JSON envelope such as {"result":"ready"}, not the bare value)
var response = await http.GetStringAsync($"{upstashUrl}/get/master:address");

// LPUSH (add work); escape the payload so '/', '?' and similar characters survive in the URL path
await http.PostAsync(
    $"{upstashUrl}/lpush/work:queue/{Uri.EscapeDataString(serializedWorkItem)}", null);

// RPOP (take work)
var work = await http.GetStringAsync($"{upstashUrl}/rpop/work:queue");
```
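
The REST calls above return a JSON document rather than the raw Redis value. The sketch below unwraps it. It assumes, per my reading of the Upstash REST API, that a successful command yields `{"result": ...}` and a failed one yields `{"error": "..."}`; check this against the current Upstash documentation. The `UpstashResponse` helper is a hypothetical name, not part of any SDK.

```csharp
using System;
using System.Text.Json;

public static class UpstashResponse
{
    // Returns the payload of a REST reply, or null when "result" is JSON null
    // (for example RPOP on an empty list). Throws if the reply has an "error" field.
    public static string? ReadResult(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;

        if (root.TryGetProperty("error", out var error))
            throw new InvalidOperationException($"Upstash error: {error.GetString()}");

        var result = root.GetProperty("result");
        return result.ValueKind switch
        {
            JsonValueKind.Null => null,
            JsonValueKind.String => result.GetString(),
            _ => result.GetRawText(), // numbers (such as a list length) and arrays
        };
    }
}
```

A worker could then write `var item = UpstashResponse.ReadResult(work);` and treat `null` as "queue empty, try again later".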

## Alternative Free Approaches

### 1. Tailscale Mesh VPN (Free for personal use)

- Free plan: up to 100 devices
- [tailscale/github-action](https://github.com/tailscale/github-action) sets it up in CI
- All runners join the same tailnet and get full IP connectivity
- Communicate via gRPC, HTTP, raw TCP, etc.
- Requires a Tailscale account + OAuth client

### 2. Cloudflare Quick Tunnel (Free, zero accounts needed)

- `cloudflared tunnel --url http://localhost:5000` gives a `*.trycloudflare.com` URL
- No Cloudflare account needed for quick tunnels
- Master creates tunnel, shares URL via GitHub Actions Cache or `gh variable`
- Workers connect to that URL
- Drawback: need to coordinate URL sharing between matrix jobs
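
Before the master can share the tunnel URL it has to capture it from the `cloudflared` log output. A minimal sketch of that step follows. It assumes the URL appears in a log line in the form `https://<name>.trycloudflare.com`; the `QuickTunnel` class is a hypothetical helper. The pattern is deliberately restricted to `trycloudflare.com` so that other URLs in the log are not picked up by mistake.

```csharp
using System.Text.RegularExpressions;

public static class QuickTunnel
{
    // Matches only trycloudflare.com hosts so unrelated URLs in the log are ignored.
    private static readonly Regex UrlPattern =
        new(@"https://[a-z0-9-]+\.trycloudflare\.com", RegexOptions.IgnoreCase);

    // Returns the tunnel URL found in a single log line, or null if there is none.
    public static string? TryExtractUrl(string logLine)
    {
        var match = UrlPattern.Match(logLine);
        return match.Success ? match.Value : null;
    }
}
```

The master would feed each line of the `cloudflared` process output through `TryExtractUrl` and publish the first non-null value through whichever sharing channel is chosen.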

### 3. GitHub Actions Cache as Coordination (Free, built-in)

- 10 GB free per repo
- Master writes address/work to cache keys, workers poll for them
- High latency (~2-5s per operation), but workable for coarse-grained coordination
- No external accounts needed at all

### 4. Other Free Redis Hosts

| Provider | Free Tier | Notes |
|----------|-----------|-------|
| Redis Cloud | 30MB, 1 database | Standard Redis protocol |
| Aiven | Small instance | Valkey (Redis-compatible) |
| Railway | $5 credit/month | One-click Redis deploy |
| Render | 25MB | Expires after 90 days |

## Important Caveats

1. **Matrix jobs don't start simultaneously** — GitHub may queue some runners if capacity is tight. Workers need to handle waiting for the master, and the master needs to handle workers arriving at different times.

2. **Runner reliability** — GitHub-hosted runners can be preempted. Build in health checks, timeouts, and retry logic. Design work items to be idempotent so they can be safely re-queued.

3. **6-hour job timeout** — GitHub Actions jobs have a maximum runtime of 6 hours. Plan workloads accordingly.

4. **Secrets management** — Store Upstash credentials (or Tailscale OAuth tokens) as GitHub repository secrets. Never hardcode them.
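
The first two caveats both come down to waiting for another runner with a bounded timeout. One way to sketch that is a generic polling helper with no Redis dependency; `Readiness.WaitUntilAsync` is a hypothetical name, and the probe delegate stands in for whatever check the pipeline uses (for example, reading the master's status key).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Readiness
{
    // Polls `probe` until it returns true or `timeout` elapses.
    // Returns false on timeout so the caller can fail the job with a clear message.
    public static async Task<bool> WaitUntilAsync(
        Func<CancellationToken, Task<bool>> probe,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        cts.CancelAfter(timeout);
        try
        {
            while (true)
            {
                if (await probe(cts.Token)) return true;
                await Task.Delay(pollInterval, cts.Token);
            }
        }
        catch (OperationCanceledException) when (!cancellationToken.IsCancellationRequested)
        {
            return false; // timed out, as opposed to being cancelled by the caller
        }
    }
}
```

A worker might call it as `await Readiness.WaitUntilAsync(async ct => (await db.StringGetAsync("master:address")) == "ready", TimeSpan.FromMinutes(5), TimeSpan.FromSeconds(2))` and exit with an error when it returns `false`. The five-minute and two-second values are placeholders, not recommendations.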

## GitHub Secrets Needed

The recommended Upstash approach needs only two secrets:
- `UPSTASH_REDIS_URL` — the REST API URL from Upstash dashboard
- `UPSTASH_REDIS_TOKEN` — the REST API token from Upstash dashboard