Add slim test suite for manus & Ralph features #3

Merged

OniReimu merged 11 commits into main from OniReimu/worcester

Jan 14, 2026

tests/claude-code/README.md

     - Claude Code CLI installed and in PATH (`claude --version` should work)
     - Local superpowers plugin installed (see main README for installation)
+    - Optional: `timeout` command for integration tests (install with `brew install coreutils` on macOS)
+      - Integration tests will skip gracefully if timeout is not available
     ## Running Tests
@@ ... @@
     ./run-skill-tests.sh
     ```
-    ### Run integration tests (slow, 10-30 minutes):
+    ### Run integration tests (slow, 10-15 minutes):
     ```bash
     ./run-skill-tests.sh --integration
     ```
@@ -47,6 +49,11 @@ Common functions for skills testing:
     - `assert_not_contains output pattern name` - Verify pattern absent
     - `assert_count output pattern count name` - Verify exact count
     - `assert_order output pattern_a pattern_b name` - Verify order
+    - `assert_file_exists file name` - Verify file exists
+    - `assert_file_contains file pattern name` - Verify file contains pattern
+    - `assert_valid_json json name` - Verify valid JSON string
+    - `extract_ralph_status output` - Extract Ralph status block
+    - `verify_ralph_status_block status name` - Verify Ralph status format
     - `create_test_project` - Create temp test directory
     - `create_test_plan project_dir` - Create sample plan file
@@ -92,6 +99,20 @@ Tests skill content and requirements (~2 minutes):
     - Review loops documented
     - Task context provision documented
+    #### test-manus-pretool-hook.sh
+    Unit test for manus pretool hook (~1 second):
+    - Verifies hook outputs valid JSON when inactive
+    - Verifies hook outputs empty JSON when no .active marker
+    - Verifies hook emits reminder when .active exists
+    - Verifies reminder includes plan preview
+    #### test-ralph-status-blocks.sh
+    Unit test for Ralph status block parsing (~1 second):
+    - Verifies status block extraction from output
+    - Verifies all required fields present
+    - Verifies enum values are valid
+    - Verifies field format correctness
     ### Integration Tests (use --integration flag)
     #### test-subagent-driven-development-integration.sh
@@ -115,6 +136,44 @@ Full workflow execution test (~10-30 minutes):
     - Subagents follow the skill correctly
     - Final code is functional and tested
+    #### test-manus-resume-integration.sh
+    Manus planning session resume test (~4-6 minutes):
+    - Session 1: Starts manus-planning task, creates files
+    - Session 2: Resumes task in new session
+    - Verifies:
+      - Manus files created (task_plan.md, findings.md, progress.md)
+      - .active marker controls behavior
+      - Session resume works across invocations
+      - .active removed on completion
+    #### test-ralph-status-emission-integration.sh
+    Ralph status block emission test (~2-3 minutes):
+    - Creates Ralph project with simple task
+    - Executes task with Ralph-style prompt
+    - Verifies:
+      - Status block emitted at end
+      - All required fields present
+      - Field values are valid
+    #### test-manus-ralph-combined-integration.sh
+    Combined manus + Ralph workflow test (~2-3 minutes):
+    - Creates Ralph project
+    - Starts manus-planning in Ralph loop
+    - Verifies:
+      - Manus files created
+      - Status block emitted
+      - EXIT_SIGNAL stays false while manus active
+      - Both systems work together
+    ### Slim Test Suite
+    The new manus/Ralph tests form a slim test suite targeting ~10-15 minutes total runtime:
+    - 2 fast unit tests (< 1 minute total)
+    - 3 focused integration tests (8-12 minutes total)
+    - Tests core superpowers-ng differentiators:
+      - Manus-styled planning with session persistence
+      - Ralph loop integration with status blocks
     ## Adding New Tests
     1. Create new test file: `test-<skill-name>.sh`

tests/claude-code/run-skill-tests.sh

            
@@ -53,14 +53,19 @@ while [[ $# -gt 0 ]]; do
                 echo "  --verbose, -v        Show verbose output"
                 echo "  --test, -t NAME      Run only the specified test"
                 echo "  --timeout SECONDS    Set timeout per test (default: 300)"
-                echo "  --integration, -i    Run integration tests (slow, 10-30 min)"
+                echo "  --integration, -i    Run integration tests (slow, 10-15 min)"
                 echo "  --help, -h           Show this help"
                 echo ""
                 echo "Tests:"
-                echo "  test-subagent-driven-development.sh  Test skill loading and requirements"
+                echo "  test-subagent-driven-development.sh         Test skill loading and requirements"
+                echo "  test-manus-pretool-hook.sh                  Test manus pretool hook unit"
+                echo "  test-ralph-status-blocks.sh                 Test ralph status block parsing"
                 echo ""
                 echo "Integration Tests (use --integration):"
                 echo "  test-subagent-driven-development-integration.sh  Full workflow execution"
+                echo "  test-manus-resume-integration.sh                 Manus resume across sessions"
+                echo "  test-ralph-status-emission-integration.sh        Ralph status block emission"
+                echo "  test-manus-ralph-combined-integration.sh         Manus + Ralph combined"
                 exit 0
                 ;;
             *)
@@ -74,11 +79,16 @@ done
     # List of skill tests to run (fast unit tests)
     tests=(
         "test-subagent-driven-development.sh"
+        "test-manus-pretool-hook.sh"
+        "test-ralph-status-blocks.sh"
     )
     # Integration tests (slow, full execution)
     integration_tests=(
         "test-subagent-driven-development-integration.sh"
+        "test-manus-resume-integration.sh"
+        "test-ralph-status-emission-integration.sh"
+        "test-manus-ralph-combined-integration.sh"
     )
     # Add integration tests if requested
@@ -117,8 +127,17 @@ for test in "${tests[@]}"; do
         start_time=$(date +%s)
+        # Check if timeout command is available
+        if command -v timeout >/dev/null 2>&1; then
+            TIMEOUT_CMD="timeout $TIMEOUT"
+        elif command -v gtimeout >/dev/null 2>&1; then
+            TIMEOUT_CMD="gtimeout $TIMEOUT"
+        else
+            TIMEOUT_CMD=""
+        fi
         if [ "$VERBOSE" = true ]; then
-            if timeout "$TIMEOUT" bash "$test_path"; then
+            if $TIMEOUT_CMD bash "$test_path"; then
                 end_time=$(date +%s)
                 duration=$((end_time - start_time))
                 echo ""
@@ -138,7 +157,7 @@ for test in "${tests[@]}"; do
             fi
         else
             # Capture output for non-verbose mode
-            if output=$(timeout "$TIMEOUT" bash "$test_path" 2>&1); then
+            if output=$($TIMEOUT_CMD bash "$test_path" 2>&1); then
                 end_time=$(date +%s)
                 duration=$((end_time - start_time))
                 echo "  [PASS] (${duration}s)"
@@ -173,7 +192,7 @@ echo "  Skipped: $skipped"
     echo ""
     if [ "$RUN_INTEGRATION" = false ] && [ ${#integration_tests[@]} -gt 0 ]; then
-        echo "Note: Integration tests were not run (they take 10-30 minutes)."
+        echo "Note: Integration tests were not run (they take 10-15 minutes)."
         echo "Use --integration flag to run full workflow execution tests."
         echo ""
     fi
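
The runner change above stores the optional wrapper in a string and relies on unquoted expansion (`$TIMEOUT_CMD bash "$test_path"`) so that an empty value disappears. A minimal sketch of the same fallback using a bash array, which keeps each word quoted, might look like this. The function name `resolve_timeout_cmd` and the 5-second value are assumptions made for illustration; they are not part of the PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): fill the global array TIMEOUT_CMD with
# "timeout N" or "gtimeout N", or leave it empty when neither tool is installed.
resolve_timeout_cmd() {
    local seconds="$1"
    TIMEOUT_CMD=()
    if command -v timeout >/dev/null 2>&1; then
        TIMEOUT_CMD=(timeout "$seconds")
    elif command -v gtimeout >/dev/null 2>&1; then
        TIMEOUT_CMD=(gtimeout "$seconds")
    fi
}

resolve_timeout_cmd 5

# The ${arr[@]+...} form expands to nothing for an empty array, which also
# avoids "unbound variable" errors under `set -u` on older bash releases.
if ${TIMEOUT_CMD[@]+"${TIMEOUT_CMD[@]}"} bash -c 'echo "test body ran"'; then
    echo "  [PASS] wrapper words used: ${#TIMEOUT_CMD[@]}"
fi
```

With the array form, the test command runs unwrapped when no timeout tool is found, matching the behaviour of the empty-string branch in the diff.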

tests/claude-code/test-helpers.sh

@@ -191,12 +191,92 @@ EOF
         echo "$plan_file"
     }
+    # Check if a file exists
+    # Usage: assert_file_exists "/path" "test name"
+    assert_file_exists() {
+        local file="$1"
+        local test_name="${2:-test}"
+        if [ -f "$file" ]; then
+            echo "  [PASS] $test_name"
+            return 0
+        else
+            echo "  [FAIL] $test_name"
+            echo "  Missing file: $file"
+            return 1
+        fi
+    }
+    # Check if file contains a pattern
+    # Usage: assert_file_contains "/path" "pattern" "test name"
+    assert_file_contains() {
+        local file="$1"
+        local pattern="$2"
+        local test_name="${3:-test}"
+        if grep -q "$pattern" "$file"; then
+            echo "  [PASS] $test_name"
+            return 0
+        else
+            echo "  [FAIL] $test_name"
+            echo "  Expected to find: $pattern"
+            echo "  In file: $file"
+            return 1
+        fi
+    }
+    # Validate JSON string
+    # Usage: assert_valid_json "{...}" "test name"
+    assert_valid_json() {
+        local json="$1"
+        local test_name="${2:-test}"
+        if echo "$json" | python3 -c 'import json, sys; json.load(sys.stdin)' 2>/dev/null; then
+            echo "  [PASS] $test_name"
+            return 0
+        else
+            echo "  [FAIL] $test_name"
+            echo "  Invalid JSON"
+            return 1
+        fi
+    }
+    # Extract Ralph status block from output
+    # Usage: extract_ralph_status "output"
+    extract_ralph_status() {
+        echo "$1" | sed -n '/---RALPH_STATUS---/,/---END_RALPH_STATUS---/p'
+    }
+    # Verify Ralph status block fields and enums
+    # Usage: verify_ralph_status_block "status_block" "test name"
+    verify_ralph_status_block() {
+        local status="$1"
+        local test_name="${2:-test}"
+        echo "$status" | grep -q "STATUS: " || { echo "  [FAIL] $test_name (missing STATUS)"; return 1; }
+        echo "$status" | grep -q "TASKS_COMPLETED_THIS_LOOP: " || { echo "  [FAIL] $test_name (missing TASKS_COMPLETED_THIS_LOOP)"; return 1; }
+        echo "$status" | grep -q "FILES_MODIFIED: " || { echo "  [FAIL] $test_name (missing FILES_MODIFIED)"; return 1; }
+        echo "$status" | grep -q "TESTS_STATUS: " || { echo "  [FAIL] $test_name (missing TESTS_STATUS)"; return 1; }
+        echo "$status" | grep -q "WORK_TYPE: " || { echo "  [FAIL] $test_name (missing WORK_TYPE)"; return 1; }
+        echo "$status" | grep -q "EXIT_SIGNAL: " || { echo "  [FAIL] $test_name (missing EXIT_SIGNAL)"; return 1; }
+        echo "$status" | grep -q "RECOMMENDATION: " || { echo "  [FAIL] $test_name (missing RECOMMENDATION)"; return 1; }
+        echo "$status" | grep -Eq "STATUS: (IN_PROGRESS|COMPLETE|BLOCKED)" || { echo "  [FAIL] $test_name (bad STATUS)"; return 1; }
+        echo "$status" | grep -Eq "TESTS_STATUS: (PASSING|FAILING|NOT_RUN)" || { echo "  [FAIL] $test_name (bad TESTS_STATUS)"; return 1; }
+        echo "$status" | grep -Eq "WORK_TYPE: (IMPLEMENTATION|TESTING|DOCUMENTATION|REFACTORING)" || { echo "  [FAIL] $test_name (bad WORK_TYPE)"; return 1; }
+        echo "$status" | grep -Eq "EXIT_SIGNAL: (true|false)" || { echo "  [FAIL] $test_name (bad EXIT_SIGNAL)"; return 1; }
+        echo "  [PASS] $test_name"
+    }
     # Export functions for use in tests
     export -f run_claude
     export -f assert_contains
     export -f assert_not_contains
     export -f assert_count
     export -f assert_order
+    export -f assert_file_exists
+    export -f assert_file_contains
+    export -f assert_valid_json
+    export -f extract_ralph_status
+    export -f verify_ralph_status_block
     export -f create_test_project
     export -f cleanup_test_project
     export -f create_test_plan
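
One detail of `verify_ralph_status_block` worth noting: its presence checks are unanchored, so the pattern `STATUS: ` also matches a `TESTS_STATUS: ` line. A block without a `STATUS` line is still rejected by the later enum check, but it is reported as "bad STATUS" rather than "missing STATUS". The sketch below shows one possible anchored variant. `status_field_matches` is a hypothetical name, not part of the PR, and the sample output is invented for the demonstration; `extract_ralph_status` is copied from the diff so the block runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Same sed range as extract_ralph_status in the diff above.
extract_ralph_status() {
    echo "$1" | sed -n '/---RALPH_STATUS---/,/---END_RALPH_STATUS---/p'
}

# Hypothetical helper: succeed only if a line is exactly "FIELD: VALUE",
# where VALUE matches the extended regex alternatives in value_re.
status_field_matches() {
    local block="$1" field="$2" value_re="$3"
    grep -Eq "^${field}: (${value_re})\$" <<<"$block"
}

# Invented sample: TESTS_STATUS is present, STATUS is deliberately missing.
sample='Some model output
---RALPH_STATUS---
TESTS_STATUS: NOT_RUN
EXIT_SIGNAL: false
---END_RALPH_STATUS---
trailing text'

block=$(extract_ralph_status "$sample")

if status_field_matches "$block" "STATUS" "IN_PROGRESS|COMPLETE|BLOCKED"; then
    echo "  [PASS] STATUS present and valid"
else
    echo "  [FAIL] STATUS missing or invalid"
fi

if status_field_matches "$block" "EXIT_SIGNAL" "true|false"; then
    echo "  [PASS] EXIT_SIGNAL present and valid"
fi
```

Anchoring with `^` and `$` also rejects trailing text after an enum value, which the unanchored patterns would accept.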

tests/claude-code/test-manus-pretool-hook.sh

@@ -0,0 +1,32 @@
+    #!/usr/bin/env bash
+    set -euo pipefail
+    SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+    source "$SCRIPT_DIR/test-helpers.sh"
+    echo "=== Test: manus pretool hook ==="
+    TEST_PROJECT=$(create_test_project)
+    trap "cleanup_test_project $TEST_PROJECT" EXIT
+    # Case 1: No .active -> empty JSON
+    output=$(cd "$TEST_PROJECT" && "$SCRIPT_DIR/../../hooks/manus-pretool.sh")
+    assert_valid_json "$output" "Hook outputs valid JSON when inactive"
+    assert_contains "$output" "{}" "Hook outputs empty JSON when inactive"
+    # Case 2: .active + task_plan.md -> reminder JSON
+    mkdir -p "$TEST_PROJECT/docs/manus"
+    cat > "$TEST_PROJECT/docs/manus/task_plan.md" <<'PLAN'
+    # Task Plan
+    ## Goal
+    Test hook output.
+    PLAN
+    touch "$TEST_PROJECT/docs/manus/.active"
+    output_active=$(cd "$TEST_PROJECT" && "$SCRIPT_DIR/../../hooks/manus-pretool.sh")
+    assert_valid_json "$output_active" "Hook outputs valid JSON when active"
+    assert_contains "$output_active" "Manus Planning Reminder" "Hook emits reminder content"
+    echo "=== All tests passed ==="
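
The test registers its cleanup with `trap "cleanup_test_project $TEST_PROJECT" EXIT`, which expands the path when the trap is defined and passes it unquoted, so a temp path containing spaces would be split into several arguments. A minimal sketch of a quoting-safe variant follows. `cleanup_dir` is a stand-in for `cleanup_test_project` (assumed here to delete the directory), and `run_in_temp_project` is a hypothetical wrapper used only to make the trap observable.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for cleanup_test_project (assumption: it removes the directory).
cleanup_dir() {
    rm -rf -- "$1"
}

# The body is a subshell (parentheses), so the EXIT trap fires when it ends.
run_in_temp_project() (
    # The template contains a space on purpose to exercise the quoting.
    TEST_PROJECT=$(mktemp -d "${TMPDIR:-/tmp}/hook test.XXXXXX")
    # Single quotes defer expansion until the trap runs; the inner double
    # quotes keep the path as a single argument.
    trap 'cleanup_dir "$TEST_PROJECT"' EXIT
    touch "$TEST_PROJECT/marker"
    printf '%s\n' "$TEST_PROJECT"
)

project_dir=$(run_in_temp_project)
if [ -e "$project_dir" ]; then
    echo "left behind: $project_dir"
else
    echo "cleaned up: $project_dir"
fi
```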

tests/claude-code/test-manus-ralph-combined-integration.sh

@@ -0,0 +1,91 @@
+    #!/usr/bin/env bash
+    set -euo pipefail
+    SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+    source "$SCRIPT_DIR/test-helpers.sh"
+    echo "=== Integration Test: manus + ralph combined ==="
+    TEST_PROJECT=$(create_test_project)
+    trap "cleanup_test_project $TEST_PROJECT" EXIT
+    mkdir -p "$TEST_PROJECT/docs"
+    cd "$TEST_PROJECT"
+    git init --quiet
+    git config user.email "[email protected]"
+    git config user.name "Test User"
+    git commit --allow-empty -m "init" --quiet
+    # Pre-create manus files to simulate active manus planning
+    mkdir -p "$TEST_PROJECT/docs/manus"
+    cat > "$TEST_PROJECT/docs/manus/task_plan.md" <<'EOF'
+    # Test Task Plan
+    ## Goal
+    Create docs/combined.txt with "ok"
+    ## Current Phase
+    Phase 1 (in progress)
+    ## Phases
+    ### Phase 1: Initial Setup
+    **Status**: in_progress
+    EOF
+    cat > "$TEST_PROJECT/docs/manus/findings.md" <<'EOF'
+    # Findings
+    Simple test task
+    EOF
+    cat > "$TEST_PROJECT/docs/manus/progress.md" <<'EOF'
+    # Progress
+    Planning started
+    EOF
+    touch "$TEST_PROJECT/docs/manus/.active"
+    cat > "$TEST_PROJECT/@fix_plan.md" <<'EOF'
+    - [ ] Create docs/combined.txt with "ok"
+    EOF
+    cat > "$TEST_PROJECT/PROMPT.md" <<'EOF'
+    You are running in a Ralph loop with Superpowers-NG.
+    docs/manus/.active exists, which means manus planning is active.
+    Complete the simple task from @fix_plan.md.
+    At the end of your response, emit this status block format:
+    ---RALPH_STATUS---
+    STATUS: IN_PROGRESS
+    TASKS_COMPLETED_THIS_LOOP: 1
+    FILES_MODIFIED: 1
+    TESTS_STATUS: NOT_RUN
+    WORK_TYPE: IMPLEMENTATION
+    EXIT_SIGNAL: false
+    RECOMMENDATION: Task completed, manus still active, continue in next loop
+    ---END_RALPH_STATUS---
+    IMPORTANT: Keep EXIT_SIGNAL: false because docs/manus/.active exists
+    EOF
+    PROMPT="Change to directory $TEST_PROJECT and follow PROMPT.md exactly."
+    # Run with timeout fallback
+    if command -v timeout >/dev/null 2>&1; then
+        cd "$SCRIPT_DIR/../.." && timeout 180 claude -p "$PROMPT" --allowed-tools=all --add-dir "$TEST_PROJECT" --permission-mode bypassPermissions > "$TEST_PROJECT/out.txt" 2>&1 || true
+    elif command -v gtimeout >/dev/null 2>&1; then
+        cd "$SCRIPT_DIR/../.." && gtimeout 180 claude -p "$PROMPT" --allowed-tools=all --add-dir "$TEST_PROJECT" --permission-mode bypassPermissions > "$TEST_PROJECT/out.txt" 2>&1 || true
+    else
+        echo "  [SKIP] timeout command not available - install coreutils (brew install coreutils)"
+        exit 0
+    fi
+    assert_file_exists "$TEST_PROJECT/docs/manus/.active" "manus .active still present"
+    status=$(extract_ralph_status "$(cat "$TEST_PROJECT/out.txt")")
+    verify_ralph_status_block "$status" "Status block emitted"
+    assert_contains "$status" "EXIT_SIGNAL: false" "EXIT_SIGNAL stays false while manus active"
+    echo "=== All tests passed ==="
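
The final assertion checks `EXIT_SIGNAL` through `assert_contains`. Assuming that helper does a substring or pattern match (its body is not shown in this diff), a value with extra trailing text would also pass. One way to compare the exact value is to pull the field out of the block first, as sketched below. `ralph_status_field` is a hypothetical helper, not part of the PR, and the sample block is invented for the demonstration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the value of the first line that starts with
# "FIELD: ". Everything after the prefix is kept, including any later colons.
ralph_status_field() {
    local block="$1" field="$2"
    awk -v prefix="${field}: " \
        'index($0, prefix) == 1 { print substr($0, length(prefix) + 1); exit }' \
        <<<"$block"
}

# Invented sample status block in the format used by PROMPT.md above.
status='---RALPH_STATUS---
STATUS: IN_PROGRESS
TESTS_STATUS: NOT_RUN
EXIT_SIGNAL: false
RECOMMENDATION: Task completed, manus still active: continue in next loop
---END_RALPH_STATUS---'

exit_signal=$(ralph_status_field "$status" "EXIT_SIGNAL")
if [ "$exit_signal" = "false" ]; then
    echo "  [PASS] EXIT_SIGNAL is exactly false"
else
    echo "  [FAIL] EXIT_SIGNAL was '$exit_signal'"
fi
```

Because the match must start at the beginning of the line, asking for `STATUS` returns the `STATUS` line and not `TESTS_STATUS`.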
