
Copilot AI commented Oct 30, 2025

Configure UFW Firewall - Architecture Refactoring

This PR implements UFW firewall configuration following the correct two-layer template architecture pattern, addressing all feedback from the initial code review.

✅ All Issues Fixed

Issue #1: Template Architecture Violation (CRITICAL) - FIXED

  • ✅ Created proper firewall_playbook wrapper with FirewallPlaybookContext and FirewallPlaybookTemplate
  • ✅ Created dedicated FirewallPlaybookTemplateRenderer following the inventory pattern
  • ✅ Removed generic render_tera_template() method
  • ✅ Type-safe SSH port validation at context construction
  • ✅ Consistent with inventory.yml.tera architecture

Issue #2: Incorrect Ansible Host Pattern (FUNCTIONAL BUG) - FIXED

  • ✅ Changed hosts: torrust_servers → hosts: all in template
  • ✅ Matches pattern used by all other playbooks

Issue #3: UFW Firewall Not Active - SHOULD BE FIXED

📐 Architecture Implementation

Two-Layer Template Pattern:

Layer 1: Template Wrapper (Type-Safe Context)

src/infrastructure/external_tools/ansible/template/wrappers/firewall_playbook/
├── mod.rs           # FirewallPlaybookTemplate wrapper
└── context.rs       # FirewallPlaybookContext with type-safe SSH port
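
As a rough illustration of what "type-safe SSH port" can mean in context.rs, here is a minimal sketch. Only the names FirewallPlaybookContext and SshPort come from this PR; the error type, the method names and the rule that port 0 is rejected are assumptions made for the example, not the merged code.

```rust
use std::fmt;

/// Hypothetical error type for this sketch; the real crate may use another one.
#[derive(Debug, PartialEq, Eq)]
pub enum SshPortError {
    /// Port 0 is reserved and cannot be used as a listening port.
    Zero,
}

impl fmt::Display for SshPortError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SshPortError::Zero => write!(f, "SSH port must be between 1 and 65535"),
        }
    }
}

/// Newtype that can only hold a usable TCP port (1-65535).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SshPort(u16);

impl SshPort {
    /// Validates the raw value once, at construction.
    pub fn new(port: u16) -> Result<Self, SshPortError> {
        if port == 0 {
            Err(SshPortError::Zero)
        } else {
            Ok(Self(port))
        }
    }

    pub fn as_u16(self) -> u16 {
        self.0
    }
}

/// Context for the firewall playbook template: it carries only what that
/// template needs, instead of borrowing fields from the inventory context.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FirewallPlaybookContext {
    ssh_port: SshPort,
}

impl FirewallPlaybookContext {
    /// Fails early if the port is invalid, so rendering never sees bad input.
    pub fn new(ssh_port: u16) -> Result<Self, SshPortError> {
        Ok(Self {
            ssh_port: SshPort::new(ssh_port)?,
        })
    }

    pub fn ssh_port(&self) -> u16 {
        self.ssh_port.as_u16()
    }
}
```

With this shape, code that receives a FirewallPlaybookContext does not need to re-check the port, because an invalid one cannot be constructed.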

Layer 2: Template Renderer (Orchestration)

src/infrastructure/external_tools/ansible/template/renderer/
├── mod.rs                    # Main renderer (updated)
└── firewall_playbook.rs      # FirewallPlaybookTemplateRenderer

🔧 Key Changes

  1. New Files Created:

    • src/infrastructure/external_tools/ansible/template/wrappers/firewall_playbook/mod.rs
    • src/infrastructure/external_tools/ansible/template/wrappers/firewall_playbook/context.rs
    • src/infrastructure/external_tools/ansible/template/renderer/firewall_playbook.rs
  2. Modified Files:

    • src/infrastructure/external_tools/ansible/template/wrappers/mod.rs - Export new wrapper
    • src/infrastructure/external_tools/ansible/template/renderer/mod.rs - Use dedicated renderer
    • src/infrastructure/external_tools/ansible/template/wrappers/inventory/context/mod.rs - Remove ssh_port field
    • templates/ansible/configure-firewall.yml.tera - Fix hosts pattern
  3. Removed:

    • Generic render_tera_template() method (architecture violation)
    • ssh_port field from InventoryContext (wrong abstraction)

✅ Quality Checks

  • Build: ✅ Compiles without warnings
  • Tests: ✅ All 1062 tests passing
  • Linters: ✅ All linters passing (markdown, yaml, toml, cspell, clippy, rustfmt, shellcheck)

🧪 Testing

Unit Tests Added:

  • FirewallPlaybookContext construction and validation
  • FirewallPlaybookTemplate rendering with SSH port substitution
  • FirewallPlaybookTemplateRenderer end-to-end rendering
  • Error handling for invalid SSH ports and template syntax

E2E Testing Required:
The firewall will need to be tested in E2E tests to confirm:

  • Playbook runs without "no hosts matched" warning
  • UFW is active: ufw status shows "Status: active"
  • SSH port 22 is allowed
  • SSH connection works after firewall enable
Original prompt

This section details the original issue you should resolve

<issue_title>Configure UFW Firewall</issue_title>
<issue_description>Implement UFW firewall configuration in the ConfigureCommand with comprehensive SSH lockout prevention. This task adds firewall security while ensuring SSH access is preserved through careful configuration sequencing and port management.

This is the second phase of system security configuration, following the security updates implementation. It has higher risk due to potential SSH lockout scenarios.

Goals

  • UFW Firewall Active: Configure UFW with restrictive default policies
  • SSH Access Preserved: Maintain SSH connectivity throughout configuration
  • Configurable SSH Port: Support custom SSH ports from user configuration
  • New Domain Step: Add ConfigureFirewall to the ConfigureStep enum
  • Ansible Integration: Create Tera template for SSH port resolution
  • Safety First: Comprehensive SSH lockout prevention measures
  • Testing: Extensive E2E testing with SSH connectivity validation

Specifications

Safety-First Implementation

Critical Requirements:

  1. Allow SSH BEFORE enabling firewall - Never enable UFW before SSH rules are in place
  2. Use configured SSH port - Support user_inputs.ssh_port (defaults to 22)
  3. Dual SSH protection - Allow both port number and SSH service name
  4. Verify SSH access - Confirm SSH rules are active before completing

Domain Integration

Update ConfigureStep enum in src/domain/environment/state/configure_failed.rs:

/// Steps in the configure workflow  
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum ConfigureStep {
    /// Installing Docker
    InstallDocker,
    /// Installing Docker Compose  
    InstallDockerCompose,
    /// Configuring automatic security updates
    ConfigureSecurityUpdates,
    /// Configuring UFW firewall
    ConfigureFirewall,  // <- NEW
}

New Application Step

Create src/application/steps/system/configure_firewall.rs:

  • Implements ConfigureFirewallStep with SSH port variable resolution
  • Uses AnsibleClient with Tera template for dynamic SSH port
  • Comprehensive error handling for SSH lockout scenarios
  • Follows established patterns while handling template variables

New Ansible Template

Create templates/ansible/configure-firewall.yml.tera (Tera template for SSH port):

  • Reset UFW to clean state
  • Set restrictive default policies (deny incoming, allow outgoing)
  • CRITICALLY: Allow SSH on {{ ssh_port }} BEFORE enabling firewall
  • Allow SSH service by name as additional safety measure
  • Enable UFW only after SSH rules are confirmed
  • Verify SSH access is preserved
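
The ordering requirement in the list above (allow SSH first, enable UFW afterwards) can be expressed as a small check over an ordered task plan. This is a sketch only: FirewallTask and is_lockout_safe are hypothetical names invented for the example, and the real playbook is YAML, not a Rust enum.

```rust
/// Hypothetical model of the playbook's tasks, in execution order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FirewallTask {
    Reset,
    SetDefaultPolicies,
    AllowSshPort(u16),
    Enable,
    VerifySshAccess,
}

/// Returns true when the plan can never enable the firewall without an
/// SSH allow rule for `ssh_port` already in place.
///
/// A plan with no `Enable` task is considered safe, because the firewall
/// is never switched on.
pub fn is_lockout_safe(tasks: &[FirewallTask], ssh_port: u16) -> bool {
    for task in tasks {
        match task {
            // The matching allow rule comes first: enabling later is safe.
            FirewallTask::AllowSshPort(port) if *port == ssh_port => return true,
            // The firewall is enabled before any matching allow rule.
            FirewallTask::Enable => return false,
            _ => {}
        }
    }
    true
}
```

A check like this could run in a unit test against whatever task order the template encodes, as a cheap guard before the slower E2E run.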

Implementation Approach

This is a higher risk implementation that:

  • Requires Tera template for SSH port variable resolution
  • Must prevent SSH lockout through careful sequencing
  • Needs comprehensive testing with SSH connectivity validation
  • Follows the template pattern similar to inventory.yml.tera

Acceptance Criteria

  • UFW Firewall Enabled: UFW is active with restrictive default policies
  • SSH Access Maintained: SSH connectivity preserved on configured port
  • Port Configuration: Uses user_inputs.ssh_port value correctly
  • Domain Integration: ConfigureFirewall step properly integrated
  • SSH Lockout Prevention: No scenario causes SSH access loss
  • Template Resolution: SSH port variable correctly resolved in Tera template
  • Error Handling: Clear, actionable error messages for firewall failures
  • Tests Pass: All existing tests continue to pass
  • E2E Validation: E2E tests confirm firewall is active AND SSH works
  • Safety Verification: SSH rules verified before firewall activation

Dependencies

Depends On: #17 (Configure Automatic Security Updates)
Parent Epic: #16 - Finish ConfigureCommand - System Security Configuration

Risk Assessment

Medium Risk due to:

  • Potential SSH lockout if improperly sequenced
  • Network-level changes affecting remote access
  • Tera template complexity for SSH port resolution

Mitigation Strategies:

  • Comprehensive E2E testing with SSH verification
  • Careful implementation sequencing (SSH rules before firewall activation)
  • Detailed error handling for SSH connectivity issues
  • Multiple safety checks and verification steps

Estimated Effort: 2-3 days

Full specification: UFW Firewall Documentation</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits October 30, 2025 17:02
- Add ConfigureFirewall variant to ConfigureStep enum
- Create configure-firewall.yml.tera template with SSH port variable
- Add ssh_port field to InventoryContext for template rendering
- Create ConfigureFirewallStep in application layer
- Add firewall template rendering to AnsibleTemplateRenderer
- Integrate firewall step into ConfigureCommandHandler
- Update module exports for new firewall step

Co-authored-by: josecelano <[email protected]>
- Fix doc markdown clippy warning in InventoryContext
- Apply cargo fmt formatting to all modified files

Co-authored-by: josecelano <[email protected]>
Copilot AI changed the title from "[WIP] Implement UFW firewall configuration with SSH safety" to "Add UFW firewall configuration to ConfigureCommand with SSH lockout prevention" on Oct 30, 2025
Copilot AI requested a review from josecelano October 30, 2025 17:14
Copilot finished work on behalf of josecelano October 30, 2025 17:14
@josecelano josecelano marked this pull request as ready for review October 30, 2025 17:47
@josecelano

PR Review: Configure UFW Firewall Implementation

@copilot Thank you for working on this implementation! The PR successfully adds UFW firewall configuration, and all GitHub CI workflows are passing. However, after testing locally with the full E2E test suite, I've identified three critical issues that need to be addressed before merging.


🔍 Issue #1: Template Architecture Violation (Critical)

Problem

The implementation doesn't follow the project's established template architecture pattern. The current implementation adds a generic render_tera_template() method directly in AnsibleTemplateRenderer, which bypasses the type-safe wrapper and renderer pattern used throughout the codebase.

Current (Incorrect) Implementation

// In: src/infrastructure/external_tools/ansible/template/renderer/mod.rs
async fn render_tera_template(
    &self,
    template_name: &str,
    output_name: &str,
    context: &InventoryContext,  // ❌ Reusing InventoryContext for firewall template
    destination_dir: &Path,
) -> Result<(), ConfigurationTemplateError> {
    // Ad-hoc inline template rendering
    // No type safety, no dedicated context
}

Why This is Wrong

  1. No Type Safety: The firewall playbook uses InventoryContext which was designed for inventory files, not firewall configuration
  2. Breaks Single Responsibility: The AnsibleTemplateRenderer shouldn't handle specific template rendering logic
  3. Inconsistent with Existing Patterns: Violates the architecture established by inventory.yml.tera
  4. No Dedicated Validation: Template validation should happen at construction time in a dedicated wrapper

Correct Architecture Pattern

The project uses a two-layer template architecture for all Tera templates:

Layer 1: Template Wrapper (Type-Safe Context)

Reference Example: src/infrastructure/external_tools/ansible/template/wrappers/inventory/

src/infrastructure/external_tools/ansible/template/wrappers/
└── inventory/
    ├── mod.rs                      # InventoryTemplate wrapper
    └── context/
        ├── mod.rs                   # InventoryContext with validation
        ├── ansible_host.rs          # Type-safe IP address
        ├── ansible_port.rs          # Type-safe port number
        └── ssh_private_key_file.rs  # Type-safe file path

What the Wrapper Does:

  • Defines InventoryTemplate - wrapper around template content
  • Defines InventoryContext - type-safe context with validated fields
  • Validates template syntax at construction time
  • Provides render() method to write final file

Layer 2: Template Renderer (Orchestration)

Reference Example: src/infrastructure/external_tools/ansible/template/renderer/inventory.rs

pub struct InventoryTemplateRenderer {
    template_manager: Arc<TemplateManager>,
}

impl InventoryTemplateRenderer {
    const INVENTORY_TEMPLATE_FILE: &'static str = "inventory.yml.tera";
    const INVENTORY_OUTPUT_FILE: &'static str = "inventory.yml";

    pub fn render(
        &self,
        inventory_context: &InventoryContext,
        output_dir: &Path,
    ) -> Result<(), InventoryTemplateError> {
        // 1. Load template file
        // 2. Create InventoryTemplate with context
        // 3. Render to output file
    }
}
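
To make the three numbered steps concrete for the firewall case, here is a simplified sketch of what a dedicated renderer could look like. It is not the project's implementation: the real renderer would load the file through TemplateManager and render with Tera, while this version takes the template text directly and uses a plain string replace as a stand-in. The constant names and the io::Error return type are assumptions.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Sketch of a dedicated renderer for the firewall playbook.
pub struct FirewallPlaybookTemplateRenderer {
    template_source: String,
}

impl FirewallPlaybookTemplateRenderer {
    const OUTPUT_FILE: &'static str = "configure-firewall.yml";
    // Stand-in for Tera: only this exact spelling of the placeholder is handled.
    const SSH_PORT_PLACEHOLDER: &'static str = "{{ ssh_port }}";

    /// Validates the template at construction, so a broken template is
    /// reported before anything is written to disk.
    pub fn new(template_source: impl Into<String>) -> io::Result<Self> {
        let template_source = template_source.into();
        if !template_source.contains(Self::SSH_PORT_PLACEHOLDER) {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "template does not reference ssh_port",
            ));
        }
        Ok(Self { template_source })
    }

    /// Substitutes the port and writes the playbook into `output_dir`.
    pub fn render(&self, ssh_port: u16, output_dir: &Path) -> io::Result<PathBuf> {
        let rendered = self
            .template_source
            .replace(Self::SSH_PORT_PLACEHOLDER, &ssh_port.to_string());
        fs::create_dir_all(output_dir)?;
        let output_path = output_dir.join(Self::OUTPUT_FILE);
        fs::write(&output_path, rendered)?;
        Ok(output_path)
    }
}
```

The point of the shape is that the renderer owns one template and one output file, mirroring the inventory renderer, and that callers cannot pass it an unrelated context.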

Required Changes for Firewall Template

You need to create the same architecture for the firewall playbook. Due to comment length limits, I'll provide the key structure here. Full implementation details are in the PR review file.

Required New Files:

  1. src/infrastructure/external_tools/ansible/template/wrappers/firewall_playbook/mod.rs - Template wrapper
  2. src/infrastructure/external_tools/ansible/template/wrappers/firewall_playbook/context.rs - Type-safe context with SshPort
  3. src/infrastructure/external_tools/ansible/template/renderer/firewall_playbook.rs - Dedicated renderer

Update: src/infrastructure/external_tools/ansible/template/renderer/mod.rs to use the new renderer

Why This Architecture Matters

  1. Type Safety: a dedicated firewall context validates the SSH port when it is constructed, so an invalid port never reaches the template
  2. Single Responsibility: Each component has one clear purpose
  3. Testability: Wrappers and renderers can be tested independently
  4. Consistency: All Tera templates follow the same pattern
  5. Validation: Template validation happens at construction, catching errors early
  6. Maintainability: Clear separation of concerns

Documentation References

  • Architecture Overview: docs/technical/template-system-architecture.md
  • Template Guidelines: docs/contributing/templates.md
  • Existing Implementation: src/infrastructure/external_tools/ansible/template/wrappers/inventory/

🐛 Issue #2: Incorrect Ansible Host Pattern (Functional Bug)

Problem

The firewall playbook uses hosts: torrust_servers but should use hosts: all like other playbooks.

Current (Incorrect)

- name: Configure UFW firewall safely
  hosts: torrust_servers  # ❌ WRONG

Evidence

[WARNING]: Could not match supplied host pattern, ignoring: torrust_servers
skipping: no hosts matched

Correct Implementation

- name: Configure UFW firewall safely
  hosts: all  # ✅ CORRECT

Reference

All other playbooks use hosts: all:

  • configure-security-updates.yml → hosts: all
  • install-docker.yml → hosts: all
  • install-docker-compose.yml → hosts: all
  • configure-firewall.yml.tera → hosts: torrust_servers (wrong)
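
A mismatch like this could be caught without a VM by a small unit-level check over the playbook text. The sketch below is line-based and deliberately naive (it is not a YAML parser), and both function names are hypothetical.

```rust
/// Collects the value of every `hosts:` key found in a playbook.
/// Trailing `# comments` are stripped from each value.
pub fn host_patterns(playbook: &str) -> Vec<&str> {
    playbook
        .lines()
        .filter_map(|line| line.trim_start().strip_prefix("hosts:"))
        .map(|value| value.split('#').next().unwrap_or("").trim())
        .collect()
}

/// True when the playbook declares at least one play and every play
/// targets `all`, the pattern the other playbooks use.
pub fn targets_only_all(playbook: &str) -> bool {
    let patterns = host_patterns(playbook);
    !patterns.is_empty() && patterns.iter().all(|pattern| *pattern == "all")
}
```

Running such a check over every file in templates/ansible would flag a stray group name before Ansible reports "no hosts matched".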

⚠️ Issue #3: UFW Firewall Not Active After Configuration

Problem

After E2E tests, the UFW firewall shows as inactive:

$ lxc exec torrust-tracker-vm-e2e-full -- sudo ufw status
Status: inactive

Expected

Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere

Root Cause

  1. Issue #2: Playbook doesn't match hosts, so it never runs
  2. Potential Task Failure: UFW enable task might fail silently

Debugging Steps

  1. Run E2E tests: cargo run --bin e2e-tests-full -- --keep
  2. Connect to VM: lxc exec torrust-tracker-vm-e2e-full -- /bin/bash
  3. Check UFW: sudo ufw status verbose
  4. Test manually: cd build/e2e-full/ansible && ansible-playbook configure-firewall.yml -vvv

Potential Solutions

  1. Fix Issue #2 first (change hosts pattern)
  2. Add explicit error handling
  3. Add verification that UFW is actually active
  4. Consider retry logic for robustness

Verification Checklist

  • Playbook runs without "no hosts matched" warning
  • UFW is active: ufw status shows "Status: active"
  • SSH port 22 is allowed
  • SSH connection works after firewall enable
  • E2E tests pass
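
The first three checklist items can be automated by parsing the text that ufw status prints. This is a minimal sketch that assumes the output layout shown in the "Expected" block above; the function names are hypothetical and running the command itself (for example over SSH) is left out.

```rust
/// True when `ufw status` output reports an active firewall.
pub fn is_ufw_active(status_output: &str) -> bool {
    status_output
        .lines()
        .any(|line| line.trim() == "Status: active")
}

/// True when some rule line starts with `<port>/tcp` and carries an
/// ALLOW action, e.g. `22/tcp    ALLOW    Anywhere`.
pub fn allows_tcp_port(status_output: &str, port: u16) -> bool {
    let rule_target = format!("{}/tcp", port);
    status_output.lines().any(|line| {
        let mut fields = line.split_whitespace();
        fields.next() == Some(rule_target.as_str()) && fields.any(|field| field == "ALLOW")
    })
}
```

An E2E assertion could then require both is_ufw_active and allows_tcp_port for the configured port, so an inactive firewall fails the test instead of passing silently.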

📋 Summary of Required Changes

  1. Architecture Refactoring (Issue #1):

    • Create firewall template wrapper and context
    • Create dedicated firewall renderer
    • Update AnsibleTemplateRenderer to use new renderer
    • Remove generic render_tera_template() method
  2. Playbook Host Pattern Fix (Issue #2):

    • Change hosts: torrust_servers to hosts: all
  3. Firewall Activation Fix (Issue #3):

    • Add error handling and verification
    • Ensure UFW is actually enabled

🧪 Testing Approach

Since GitHub Actions cannot run LXD/virtualization:

  1. You (@copilot): Implement architectural refactoring and fixes
  2. Me (@josecelano): Run full E2E tests locally and verify UFW
  3. Iterate: Repeat until all issues resolved

📚 Resources

  • docs/technical/template-system-architecture.md
  • docs/contributing/templates.md
  • docs/contributing/error-handling.md

Please let me know if you need clarification!

@josecelano

Note on Template Architecture Complexity

I recognize that the template architecture described in Issue #1 above may seem overly complex for what is essentially just rendering a simple Ansible playbook with a single variable (ssh_port). The two-layer pattern with dedicated wrappers, contexts, and renderers does introduce significant boilerplate.

We're aware of this complexity, and it's exactly why we've created issue #19: Refactor Ansible Templates to Variables Pattern.

Planned Simplification

Issue #19 proposes refactoring the Ansible template system to use a centralized variables pattern (similar to OpenTofu's variables.tfvars.tera approach):

  • Single Tera template: variables.yml.tera containing all dynamic variables
  • Static playbooks: All playbooks (including configure-firewall.yml) become static files that reference the variables
  • Reduced complexity: No more per-template wrappers, contexts, and renderers

Why This Matters for This PR

For now, please follow the existing architecture pattern as described in Issue #1. This ensures:

  1. Consistency with the current codebase
  2. Type safety and validation
  3. No breaking changes to existing functionality

Once this PR is merged and issue #18 is closed, we'll tackle issue #19 to simplify the entire template system. The refactoring will eliminate much of the boilerplate while preserving functionality.

Timeline

Thank you for your patience with the current architecture! The simplification is coming soon.

…ttern

BREAKING CHANGE: Restructured firewall playbook template rendering

- Create proper two-layer template architecture for firewall playbook
- Add FirewallPlaybookContext with type-safe SSH port validation
- Add FirewallPlaybookTemplate wrapper for template validation
- Add FirewallPlaybookTemplateRenderer following inventory pattern
- Remove generic render_tera_template method from AnsibleTemplateRenderer
- Remove ssh_port field from InventoryContext (use dedicated context)
- Fix hosts pattern in configure-firewall.yml.tera (torrust_servers → all)

This change ensures consistency with the established template architecture
pattern used for inventory.yml.tera, providing better type safety, testability,
and maintainability.

Addresses all 3 issues from PR review:
- Issue #1: Template architecture violation (fixed)
- Issue #2: Incorrect Ansible host pattern (fixed)
- Issue #3: UFW not active (should be fixed by issue #2)

Co-authored-by: josecelano <[email protected]>

Copilot AI commented Oct 30, 2025

PR Review: Configure UFW Firewall Implementation

@copilot Thank you for working on this implementation! The PR successfully adds UFW firewall configuration, and all GitHub CI workflows are passing....

All 3 issues have been addressed in commit 91224b5:

Issue #1 (Template Architecture): Implemented proper two-layer architecture with FirewallPlaybookContext, FirewallPlaybookTemplate, and FirewallPlaybookTemplateRenderer following the established pattern. Removed the generic render_tera_template() method.

Issue #2 (Ansible Host Pattern): Fixed hosts: torrust_servers → hosts: all in configure-firewall.yml.tera.

Issue #3 (UFW Not Active): Should be resolved by fixing Issue #2. E2E testing will confirm.

The refactoring creates proper separation of concerns with:

  • Type-safe SSH port validation in FirewallPlaybookContext
  • Template validation at construction time in FirewallPlaybookTemplate
  • Dedicated renderer following the inventory pattern
  • Comprehensive unit tests for all components

All 1062 unit tests passing and all linters pass.

Copilot finished work on behalf of josecelano October 30, 2025 19:01
… VMs

The 'Allow SSH service by name' task was using a non-existent UFW
application profile 'ssh' which caused the playbook to fail before
enabling the firewall.

This task was redundant because SSH is already explicitly allowed
by port number in the previous task.

Fixes Issue #4 identified in PR review - UFW fails with 'Could not
find a profile matching ssh'
@josecelano

josecelano commented Oct 31, 2025

✅ All Issues Resolved - E2E Tests Pass Successfully

@copilot Great work on implementing the three original issues! I've tested your implementation locally with full E2E tests and can confirm:

  • Issue 1 (Architecture): Perfectly implemented with FirewallPlaybookTemplate, FirewallPlaybookContext, and FirewallPlaybookTemplateRenderer
  • Issue 2 (Host Pattern): Correctly fixed to hosts: all
  • Issue 3 (UFW Active): Now resolved - firewall is active with proper SSH access

However, during E2E testing, I discovered a fourth issue that was blocking the firewall from activating.


🐛 Issue 4: UFW Application Profile Not Found (Discovered During E2E Testing)

Problem

The playbook was failing with this error:

TASK [Allow SSH service by name (additional safety measure)]
fatal: [torrust-tracker-vm]: FAILED! => {
    "msg": "ERROR: Could not find a profile matching 'ssh'"
}

Root Cause: The task at lines 60-68 in configure-firewall.yml.tera attempted to allow SSH using a UFW application profile:

- name: Allow SSH service by name (additional safety measure)
  community.general.ufw:
    rule: allow
    name: ssh  # ❌ This UFW app profile doesn't exist in Ubuntu VMs

UFW application profiles are defined in /etc/ufw/applications.d/, but the ssh profile doesn't exist by default in our Ubuntu 24.04 LXD VMs. This caused the playbook to fail at task 7 of 12, before the firewall could be enabled.

Impact

  • Playbook execution stopped before reaching the "Enable UFW firewall" task
  • UFW remained in Status: inactive state
  • Issue #3 (UFW not active) was actually caused by this blocking error

✅ Fix Applied

Commit: d51b156 - fix: [#18] remove redundant UFW app profile task that fails on Ubuntu VMs

Changes: Removed the redundant task (lines 60-68) that was using the non-existent UFW application profile.

Rationale: This task was redundant because SSH is already explicitly allowed by port number in the previous task:

- name: Allow SSH access on configured port (BEFORE enabling firewall)
  community.general.ufw:
    rule: allow
    port: "{{ssh_port}}"
    proto: tcp
    comment: "SSH access (configured port {{ssh_port}})"

This approach is more robust because:

  1. It works on all Ubuntu versions regardless of UFW app profile availability
  2. It supports custom SSH ports (not just 22)
  3. It's explicit about which port is being allowed

🧪 Verification Results

E2E Tests

✅ All tests passed (62 seconds)

INFO e2e_tests_full: All tests passed and cleanup completed successfully
  test_suite: "e2e_tests"
  status: "success"

UFW Firewall Status

$ lxc exec torrust-tracker-vm-e2e-full -- sudo ufw status verbose
Status: active  # ✅ Now active!
Logging: on (low)
Default: deny (incoming), allow (outgoing), deny (routed)

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW IN    Anywhere     # SSH access (configured port 22)
22/tcp (v6)                ALLOW IN    Anywhere (v6)

SSH Connectivity

$ lxc exec torrust-tracker-vm-e2e-full -- echo "SSH works!"
SSH works!  # ✅ Perfect!

📊 Final Status

Issue             | Status   | Description
1 Architecture    | RESOLVED | Two-layer template pattern correctly implemented
2 Host Pattern    | RESOLVED | Changed to hosts: all
3 UFW Inactive    | RESOLVED | Firewall now active with proper SSH access
4 UFW App Profile | RESOLVED | Removed redundant task causing failure

All issues are now resolved. The implementation is ready for merge! 🎉


🔍 Testing Notes

These issues demonstrate why local E2E testing with actual LXD VMs is critical for infrastructure code. The UFW app profile issue would not have been caught by:

  • ❌ GitHub Actions (no LXD/virtualization support)
  • ❌ Unit tests (mocked Ansible execution)
  • ❌ Linters (valid YAML syntax)

Only full E2E testing with real Ubuntu VMs revealed the runtime environment issue.

Copilot finished work on behalf of josecelano October 31, 2025 07:38
- Remove iptables permission detection fallback from ConfigureFirewallStep
- Add TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER env var to explicitly skip firewall step
- Accepts only "true" or "false" (case-sensitive, lowercase) for type safety
- E2E config tests automatically set the env var to skip firewall playbook
- Update documentation in templates/ansible/README.md

UFW/iptables requires kernel capabilities (CAP_NET_ADMIN, CAP_NET_RAW) not
available in unprivileged Docker containers. Container-based E2E tests now
explicitly skip the firewall configuration step while VM-based tests continue
to run it normally.
@josecelano

Testing Results and Fixes ✅

I've pulled your changes and completed comprehensive E2E testing. Found two issues that are now fixed and verified working.


🐛 Issues Found

Issue 1: Redundant UFW Task with Non-existent SSH App Profile

Problem: The playbook had a task attempting to allow SSH using UFW's app profile system:

- name: Allow SSH service by name
  community.general.ufw:
    rule: allow
    name: ssh

This failed because Ubuntu doesn't have a built-in ssh app profile in /etc/ufw/applications.d/.

Error:

fatal: [torrust_vm]: FAILED! => {"changed": false, "msg": "ERROR: Could not find a profile matching 'ssh'"}

Fix: Removed this redundant task. The playbook already has a task that correctly allows SSH by port number:

- name: Allow SSH access
  community.general.ufw:
    rule: allow
    port: '22'
    proto: tcp

This approach is more reliable and doesn't depend on app profiles.


Issue 2: Container-based E2E Tests Failing with iptables Permissions

Problem: After fixing the template, VM-based tests (e2e-full) passed successfully with UFW active. However, container-based tests (e2e-config) failed with:

fatal: [torrust_vm]: FAILED! => {"changed": false, "msg": "iptables v1.8.10 (legacy): can't initialize iptables table `filter': Permission denied (you must be root)\nPerhaps iptables or your kernel needs to be upgraded."}

Root Cause: UFW uses iptables internally, which requires kernel capabilities (CAP_NET_ADMIN, CAP_NET_RAW) that are unavailable in unprivileged Docker containers. This is a fundamental limitation of container environments.

Solution: Added environment variable guard to skip firewall configuration in container-based tests:

  1. Environment Variable: TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER

    • Accepts only "true" or "false" (strict lowercase, type-safe)
    • Defaults to false (firewall runs by default)
  2. Implementation in src/application/command_handlers/configure/handler.rs:

let skip_firewall = std::env::var("TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER")
    .map(|v| v == "true")
    .unwrap_or(false);

if skip_firewall {
    tracing::warn!("Skipping UFW firewall configuration due to TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER");
} else {
    let firewall_step = ConfigureFirewallStep::new(Arc::clone(&ansible_client));
    firewall_step.execute(&tracker).await?;
}
  3. E2E Config Tests automatically set this variable:
std::env::set_var("TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER", "true");
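
One observation on the handler snippet quoted in this comment: as written, any value other than "true" (for example "True" or "yes") is treated as false, so a typo silently runs the firewall step. If the strict "true"/"false" contract described in item 1 is the intent, parsing could look roughly like the sketch below. The InvalidSkipFlag type and both function names are hypothetical and are not part of the merged code.

```rust
use std::env::{self, VarError};

pub const SKIP_FIREWALL_ENV: &str = "TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER";

/// Hypothetical error carrying the rejected value.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidSkipFlag(pub String);

/// Pure parsing step, easy to unit test. `None` means the variable is unset,
/// which keeps the documented default of running the firewall step.
pub fn parse_skip_flag(raw: Option<&str>) -> Result<bool, InvalidSkipFlag> {
    match raw {
        None => Ok(false),
        Some("true") => Ok(true),
        Some("false") => Ok(false),
        Some(other) => Err(InvalidSkipFlag(other.to_string())),
    }
}

/// Reads the variable from the process environment and applies the strict rule.
pub fn skip_firewall_from_env() -> Result<bool, InvalidSkipFlag> {
    match env::var(SKIP_FIREWALL_ENV) {
        Ok(value) => parse_skip_flag(Some(value.as_str())),
        Err(VarError::NotPresent) => parse_skip_flag(None),
        Err(VarError::NotUnicode(raw)) => {
            Err(InvalidSkipFlag(raw.to_string_lossy().into_owned()))
        }
    }
}
```

Keeping the parsing separate from the environment lookup means the rule can be tested without mutating process-wide state.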

✅ Verification Results

VM-based Tests (e2e-full) - PASSED

  • Duration: ~85 seconds
  • Firewall configuration: Active (runs normally)
  • UFW enabled and SSH allowed successfully
  • All provisioning and configuration steps completed

Container-based Tests (e2e-config) - PASSED

  • Duration: ~49 seconds
  • Firewall configuration: Skipped (as expected)
  • Log output: "Skipping UFW firewall configuration due to TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER"
  • All other configuration steps (Docker, Docker Compose, security updates) validated successfully

All Linters - PASSED

  • markdown, YAML, TOML, cspell, clippy, rustfmt, shellcheck

📝 Documentation Added

Updated templates/ansible/README.md with comprehensive firewall playbook documentation:

  • Purpose: Configure UFW firewall with strict defaults and SSH lockout prevention
  • Container Limitations: Explains why firewall configuration is skipped in containers
  • Environment Variable: Documents TORRUST_TD_SKIP_FIREWALL_IN_CONTAINER usage (accepts "true" or "false" only)
  • Execution Context: VM-based tests run firewall normally; container tests skip automatically

🎯 Summary

✅ Fixed redundant UFW task using non-existent app profile
✅ Implemented type-safe environment variable guard for container tests
✅ VM tests pass with firewall active
✅ Container tests pass with firewall skipped
✅ All linters pass
✅ Comprehensive documentation added

The firewall configuration is now production-ready with proper handling for both VM and container environments. Great work on the initial implementation! 🚀


@josecelano josecelano left a comment


ACK b8679fe

@josecelano josecelano merged commit dd567d0 into main Oct 31, 2025
34 checks passed