Skip to content

Conversation

@rodrigok
Copy link
Member

@rodrigok rodrigok commented Nov 10, 2025

https://rocketchat.atlassian.net/browse/ARCH-1854

Proposed changes (including videos or screenshots)

Issue(s)

Steps to test or reproduce

Further comments

Summary by CodeRabbit

  • Bug Fixes

    • Applied OpenSSL security patch (CVE-2025-9230) across images
  • Chores

    • Switched many services to multi-stage Docker builds for smaller, safer runtime images
    • Upgraded base images and improved multi-arch support and CI caching/matrix
    • Refined CI/CD build, publish and E2E orchestration, including coverage toggles and more targeted failure logs
  • New

    • Docker Compose now includes a local Mongo replica set, health checks, and an HTTP test service
  • Tests

    • Storybook test task can run independently of the main build to speed CI runs

@dionisio-bot
Copy link
Contributor

dionisio-bot bot commented Nov 10, 2025

Looks like this PR is not ready to merge, because of the following issues:

  • This PR is missing the 'stat: QA assured' label

Please fix the issues and try again

If you have any trouble, please check the PR guidelines

@changeset-bot
Copy link

changeset-bot bot commented Nov 10, 2025

⚠️ No Changeset found

Latest commit: 10d1c08

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 10, 2025

Walkthrough

Refactors CI to a per-service, multi-arch build/promotion flow; adds docker-compose mongo/httpbin and healthchecks; converts many service Dockerfiles to multi-stage builds with OpenSSL upgrades; updates action inputs/caching and removes test-storybook build dependency.

Changes

Cohort / File(s) Summary
Build action & inputs
.github/actions/build-docker/action.yml
Renamed action, added setup-docker input (default true), removed node-version and NPM_TOKEN, gated Docker setup on setup-docker, changed daemon debug to false, added SERVICE_SUFFIX for digest/dir paths.
Meteor & Playwright action caches
.github/actions/meteor-build/action.yml, .github/actions/setup-playwright/action.yml
Added ${{ runner.arch }} to cache keys and matrix; Playwright cache key made dynamic with ${{ runner.os }}-${{ runner.arch }}-1.52.0.
Core CI workflow
.github/workflows/ci.yml
Restructured build matrix for per-service rows and archs, added per-service artifact restore/unpack, multi-service image builds, digest-based per-service promotion and multi-arch manifests, removed services job output.
E2E CI workflow
.github/workflows/ci-test-e2e.yml
Default MongoDB versions reduced to ["8.2"], added coverage input driving coverage flows, switched to docker-compose for Mongo/httpbin, removed node/NPM_TOKEN passthrough to build action, aligned startup/coverage logic with new inputs.
docker-compose CI
docker-compose-ci.yml
Added mongo (replica set) and httpbin services, standardized MONGO_URL/TRANSPORTER envs, added Rocke tchat healthcheck and HEAP_USAGE_PERCENT, removed extra_hosts, updated downstream services to new defaults.
Multi-stage Dockerfiles
apps/meteor/.docker/Dockerfile.alpine, apps/meteor/ee/server/services/Dockerfile, ee/apps/*/Dockerfile (e.g., ee/apps/account-service/Dockerfile, ee/apps/authorization-service/Dockerfile, ee/apps/ddp-streamer/Dockerfile, ee/apps/omnichannel-transcript/Dockerfile, ee/apps/presence-service/Dockerfile, ee/apps/queue-worker/Dockerfile, ee/apps/stream-hub-service/Dockerfile)
Converted many service Dockerfiles to builder+runtime multi-stage builds, moved build deps to builder, kept minimal runtime (shadow), upgraded OpenSSL (CVE fix), updated base images to node:22.16.0-alpine3.21, set ownership/user, added explicit EXPOSE/CMD where applicable.
Build tooling config
turbo.json
Removed dependsOn: ["build"] from test-storybook task; task now only declares outputs.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant GH as GitHub Actions
    participant Matrix as Matrix Runner
    participant Restore as Artifact Restore
    participant DockerBuild as Build Docker Action
    participant Digest as Digest Writer
    participant Promote as Promotion Job

    GH->>Matrix: Trigger workflow (matrix: arch × row.services)
    Matrix->>Restore: restore packages-build artifact (per-run)
    Restore-->>Matrix: artifact available

    loop per matrix row
        Matrix->>DockerBuild: build images for each service in row (if inputs.setup-docker == true)
        DockerBuild->>Digest: write per-service digest (uses SERVICE_SUFFIX)
        Digest-->>Matrix: upload per-service digest dir
    end

    Matrix->>Promote: download per-service digests
    Promote->>Promote: assemble multi-arch manifests and tags per-service
    Promote->>GH: push promoted images
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

  • Spot-check areas:
    • .github/workflows/ci.yml — matrix restructuring, per-service loops and digest handling.
    • docker-compose-ci.yml — mongo replica-set init, healthchecks, env defaults.
    • Multi-stage Dockerfiles — verify build/runtime parity, file ownership, OpenSSL upgrade placement.
    • .github/actions/build-docker/action.yml — SERVICE_SUFFIX logic, conditional setup-docker gating.

Possibly related PRs

Suggested labels

stat: ready to merge, stat: QA assured

Suggested reviewers

  • sampaiodiego
  • ggazzo
  • tassoevan

Poem

🐇 I hopped through manifests, trimmed each chore,
Builder and runtime now clean to the core,
Digests stacked neatly, multi-arch delight,
Mongo and httpbin hum through the night,
OpenSSL patched — hop, code safe in sight.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'chore(ci): reduce number of parallel jobs and improve container builds' directly aligns with the major changeset: restructured CI workflows with multi-service/multi-arch matrices, Docker optimizations (multi-stage builds, OpenSSL upgrades), and streamlined container orchestration.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chore/ci-improvements

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between dd64bc9 and 10d1c08.

📒 Files selected for processing (1)
  • .github/workflows/ci.yml (7 hunks)
🧰 Additional context used
🪛 actionlint (1.7.8)
.github/workflows/ci.yml

315-315: description is required in metadata of "Build Docker" action at "/home/jailuser/git/.github/actions/build-docker/action.yml"

(action)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: 📦 Build Packages
  • GitHub Check: CodeQL-Build
  • GitHub Check: CodeQL-Build
🔇 Additional comments (9)
.github/workflows/ci.yml (9)

71-72: Helpful guidance preserved for debugging cache behavior.

The commented-out line provides clear documentation for developers who need to disable caching between runs during debugging. This is a valuable addition.


246-250: Build matrix correctly restructured with dynamic type selection.

The type matrix now uses a dynamic value to select between 'coverage' (PR) and 'production' (release/develop), with an exclude rule that handles the filtering appropriately. The restructuring successfully consolidates matrix definitions.


281-297: Service grouping successfully reduces parallel jobs.

The service matrix now groups related services (3 services per slot, with 1 reserved) to build multiple images per job run. The include rules correctly add rocketchat coverage builds for both architectures. This directly addresses the PR objective to "reduce number of parallel jobs."


302-312: Artifact restore and unpack workflow is correct.

Packages artifact is downloaded and extracted to the workspace root as expected. The conditional extraction ensures consistency with cache logic.


314-327: First service image build step correctly initializes Docker environment.

Step condition and Docker setup flow are correct. setup-docker is implicitly true for first step (default).


328-356: Conditional image build steps for services 2–3 correctly skip redundant Docker setup.

Each step properly checks for service existence and skips setup-docker after the first invocation. Service slot conditionals are correct.


358-371: Service slot 4 (reserved for future use) correctly conditioned.

Step is properly guarded to skip if service[3] doesn't exist. The setup-docker: false optimization is maintained.


525-526: MongoDB version matrix expanded for improved compatibility testing.

The EE API tests now run against both MongoDB 5.0 and 8.2, with coverage collection designated for 8.2. This improves test coverage across supported MongoDB versions.


574-575: Coverage parameter correctly added to EE UI watcher test.

The coverage: '8.2' parameter aligns with the parallel API test configuration, ensuring consistent coverage tracking across EE test suites.

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • CVE-2025: Request failed with status code 404

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rodrigok rodrigok force-pushed the chore/ci-improvements branch from c2a13fc to 919c824 Compare November 10, 2025 18:29
@rodrigok rodrigok marked this pull request as ready for review November 10, 2025 18:32
@rodrigok rodrigok requested a review from a team as a code owner November 10, 2025 18:32
@codecov
Copy link

codecov bot commented Nov 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.09%. Comparing base (7b176ff) to head (10d1c08).
⚠️ Report is 4 commits behind head on develop.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop   #37466      +/-   ##
===========================================
- Coverage    68.13%   68.09%   -0.04%     
===========================================
  Files         3364     3364              
  Lines       115753   115753              
  Branches     20914    20907       -7     
===========================================
- Hits         78866    78823      -43     
- Misses       34192    34231      +39     
- Partials      2695     2699       +4     
Flag Coverage Δ
e2e 57.41% <ø> (-0.06%) ⬇️
unit 72.13% <ø> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docker-compose-ci.yml (1)

20-34: MONGO_URL hardcoded correctly; depends_on for mongo incomplete.

Hardcoded MONGO_URL=mongodb://mongo:27017/rocketchat?replicaSet=rs0 is appropriate for docker-compose CI environment, and default values for MONGO_OPLOG_URL and TRANSPORTER prevent undefined env var issues. However, only rocketchat declares depends_on: [mongo], while services like authorization-service, account-service, presence-service, ddp-streamer, stream-hub-service, queue-worker, and omnichannel-transcript-service all have MONGO_URL environment variables but no depends_on: mongo. If these services connect to MongoDB at startup, missing the dependency declaration could cause race conditions and startup failures.

Add depends_on: [mongo] to all services with MONGO_URL environment variables to ensure MongoDB is ready before services attempt to connect. Verify which services actually require MongoDB at startup vs. lazy-load.

.github/workflows/ci-test-e2e.yml (1)

206-214: Critical: Undefined shell variable will mask test failures.

Line 214 uses exit $s, but variable $s is never set in this script. This causes the script to exit with an empty or unset value, masking test failures and producing misleading CI results.

Apply this fix to propagate the actual test exit code:

       run: |
         set -o xtrace

         npm run testapi

         docker compose -f ../../docker-compose-ci.yml stop

         ls -l $COVERAGE_DIR
-        exit $s
+        exit $?
🧹 Nitpick comments (6)
.github/actions/meteor-build/action.yml (1)

33-33: Optional: Consider standardizing cache action versions.

The build cache uses cache@v4 (line 33) while secondary caches use cache@v3 (lines 62, 71, 80). While this doesn't introduce functional issues, standardizing to v4 across all cache operations would leverage consistent performance and features, unless there's a specific reason to keep the secondary caches on v3.

Can you clarify whether the cache version split is intentional or should all caches be updated to cache@v4?

Also applies to: 62-62, 71-71, 80-80

ee/apps/omnichannel-transcript/Dockerfile (1)

1-130: Multi‑stage build pattern implemented cleanly.

The Dockerfile correctly separates build-time dependencies (builder) from runtime environment (final), reducing image size and attack surface. The OpenSSL security patch (CVE-2025-9230) and non-root user setup are appropriate.

Minor observation: NODE_ENV=production is set redundantly in both the builder stage (line 99) and final stage (lines 109–110). You can safely remove it from the builder since the final stage redefines it.

ee/apps/presence-service/Dockerfile (1)

1-118: Consistent multi‑stage implementation across services.

The pattern mirrors omnichannel-transcript/Dockerfile, which is good for consistency. Same minor note applies: NODE_ENV=production is redundantly defined in both builder (line 87) and final stage (lines 97–98).

.github/workflows/ci-test-e2e.yml (1)

250-255: Improved failure diagnostics with per-service logging.

Splitting server logs (line 251) to show individual services and separating MongoDB logs (line 255) makes debugging easier. Ensure this does not introduce excessive log output for already-verbose services.

.github/actions/build-docker/action.yml (1)

61-66: Conditional gating and daemon config are appropriate.

The gating on line 61 correctly skips Docker setup for downstream services when setup-docker: false is passed. The explicit "debug": false in the daemon config ensures consistent behavior.

Consider documenting why debug is explicitly set to false (e.g., "to reduce log verbosity in production builds").

.github/workflows/ci.yml (1)

71-72: Consider documenting the caching strategy in inline comments.

The commented option to include github.run_id in the source hash is a useful reference for disabling caching between runs. Consider adding a brief comment explaining the trade-off (e.g., "Caching is enabled by default to speed up repeated builds; uncomment to disable per-run caching for testing").

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7b176ff and 919c824.

📒 Files selected for processing (16)
  • .github/actions/build-docker/action.yml (4 hunks)
  • .github/actions/meteor-build/action.yml (2 hunks)
  • .github/actions/setup-playwright/action.yml (1 hunks)
  • .github/workflows/ci-test-e2e.yml (7 hunks)
  • .github/workflows/ci.yml (7 hunks)
  • apps/meteor/.docker/Dockerfile.alpine (1 hunks)
  • apps/meteor/ee/server/services/Dockerfile (3 hunks)
  • docker-compose-ci.yml (10 hunks)
  • ee/apps/account-service/Dockerfile (2 hunks)
  • ee/apps/authorization-service/Dockerfile (2 hunks)
  • ee/apps/ddp-streamer/Dockerfile (2 hunks)
  • ee/apps/omnichannel-transcript/Dockerfile (2 hunks)
  • ee/apps/presence-service/Dockerfile (2 hunks)
  • ee/apps/queue-worker/Dockerfile (2 hunks)
  • ee/apps/stream-hub-service/Dockerfile (2 hunks)
  • turbo.json (0 hunks)
💤 Files with no reviewable changes (1)
  • turbo.json
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-16T22:08:51.490Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-09-16T22:08:51.490Z
Learning: Applies to playwright.config.ts : Reference and align with settings defined in playwright.config.ts for global Playwright configuration

Applied to files:

  • .github/actions/setup-playwright/action.yml
🪛 actionlint (1.7.8)
.github/workflows/ci.yml

315-315: description is required in metadata of "Build Docker" action at "/home/jailuser/git/.github/actions/build-docker/action.yml"

(action)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: 🔨 Test Unit / Unit Tests
  • GitHub Check: 🔎 Code Check / Code Lint
  • GitHub Check: 🔎 Code Check / TypeScript
  • GitHub Check: 🔨 Test Storybook / Test Storybook
  • GitHub Check: 📦 Meteor Build (coverage)
  • GitHub Check: CodeQL-Build
🔇 Additional comments (23)
.github/actions/setup-playwright/action.yml (1)

14-14: Good alignment with multi-architecture CI support.

The dynamic cache key now includes runner OS and architecture, which prevents incompatible Playwright binaries from being used. This is a solid improvement that aligns with the PR's goal of supporting multi-architecture builds (ARM64/AMD64) and ensures cache hits remain architecture-safe.

.github/actions/meteor-build/action.yml (3)

32-43: Cache key structure and conditional optimization look good.

The multi-arch and multi-type aware cache key enables proper isolation of build artifacts across different environments, and the conditional skip on cache hit (line 41) is a solid optimization that avoids unnecessary setup overhead.


61-68: Vite cache architecture-aware scoping is appropriate.

Including runner.arch prevents stale artifacts from other architectures, and the restore-key fallback level is correctly scoped to the OS/arch combination without cross-arch bleeding.


70-86: Meteor cache scoping by type and architecture is well-designed.

Separating caches by both inputs.type (production/coverage) and runner.arch prevents mixing incompatible artifacts. This is particularly important for Meteor, where coverage builds may install different packages. The restore-key scoping at the type+arch level is appropriate and avoids false cache hits across build modes.

apps/meteor/.docker/Dockerfile.alpine (1)

21-23: Security patch: OpenSSL CVE-2025-9230 properly integrated.

The OpenSSL upgrade is correctly chained into the existing RUN instruction, preserving layer optimization and providing clear CVE documentation. The patch aligns with the broader security hardening across this PR.

apps/meteor/ee/server/services/Dockerfile (1)

1-31: Base image updates and security patches properly applied.

The Node.js bump (22.14.0 → 22.16.0) and Alpine minor version update (3.20 → 3.21) are low-risk maintenance changes. OpenSSL patch is correctly placed in the runtime stage pre-dependency installation, and the CVE reference is documented.

Also applies to: 76-78

ee/apps/queue-worker/Dockerfile (2)

1-103: Multi-stage build correctly separates build and runtime concerns.

Builder stage compiles dependencies and runs yarn workspaces focus --production, resulting in a minimal artifact set for the final stage. The extensive COPY chain ensures all transitive dependencies are available at build time.


105-130: Runtime stage properly isolates production dependencies and applies security patches.

OpenSSL upgrade (line 116) is correctly placed before workspace focus to ensure clean state. User/group setup, chown, and cleanup flow is sound. However, verify that this service's runtime actually needs MongoDB—if it does, add depends_on: [mongo] to docker-compose-ci.yml (currently missing for queue-worker; only rocketchat has it).

ee/apps/stream-hub-service/Dockerfile (1)

1-115: Multi-stage Dockerfile follows established pattern correctly.

Consistent with queue-worker and other services: builder stage for compilation, lean final stage for runtime. OpenSSL patch (lines 99-101) and user/ownership setup are correct. Consider adding depends_on: [mongo] in docker-compose-ci.yml if this service requires MongoDB at runtime.

ee/apps/authorization-service/Dockerfile (1)

1-114: Consistent multi-stage refactor aligns with other ee/apps services.

Pattern mirrors queue-worker, stream-hub-service, and ddp-streamer. No concerns with the Dockerfile structure itself.

ee/apps/ddp-streamer/Dockerfile (1)

1-120: Multi-stage pattern consistent; service-specific package list variations are appropriate.

ddp-streamer includes additional dependencies (password-policies, instance-status) beyond other services, reflecting its distinct dependency graph. OpenSSL patch and final-stage setup are correct.

ee/apps/account-service/Dockerfile (1)

1-117: Multi-stage refactor aligns with standardized pattern across ee/apps services.

All structural elements correct; package list includes service-specific dependencies (tools, etc.) as expected.

docker-compose-ci.yml (4)

41-46: Healthcheck well-configured for Rocket.Chat service.

Interval (2s), timeout (5s), and start_period (60s) are reasonable. The wget test against /livez endpoint is appropriate for application readiness.


207-228: MongoDB replica set initialization script requires verification for robustness.

The entrypoint bash script initializes a single-node replica set with mongod, a wait loop for readiness, and rs.initiate(). Verify:

  1. Idempotency: If the container restarts or the script is re-run, does rs.initiate() fail gracefully if the replica set already exists? The script has no error handling for the rs.initiate() call—a second run would fail silently.
  2. Escaping: The multiline bash -c heredoc with $$ variable escaping and mongosh --eval calls should be tested to ensure proper variable interpolation and command execution.
  3. Timing: The sleep 2 before the ping loop may be insufficient on slow systems; consider making it more robust with retry logic or a longer wait.

Consider adding error handling:

if mongosh --eval "rs.status()" &>/dev/null; then
  echo "ReplSet already initialized"
else
  mongosh --eval "rs.initiate(...)"
fi

Alternatively, use rs.initiate() with force: true if re-initialization is safe.


61-62: Environment variable defaults properly standardized across all services.

MONGO_URL and TRANSPORTER variables are now consistent (with appropriate defaults) across all downstream services (authorization-service, account-service, presence-service, ddp-streamer, stream-hub-service, queue-worker, omnichannel-transcript-service). However, address the depends_on: mongo gap identified in the previous comment to ensure startup ordering is correct.

Also applies to: 82-83, 104-105, 125-126, 152-153, 174-175, 196-197


230-231: httpbin service addition supports integration testing.

Kong httpbin is a useful tool for testing HTTP client behavior in integration tests. Minimal configuration is appropriate.

.github/workflows/ci-test-e2e.yml (2)

23-46: Coverage-aware matrix and MongoDB version reduction.

The reduction to a single default MongoDB version (['8.2']) aligns with the PR objective to reduce parallel jobs. The new coverage input provides flexibility for selecting coverage-instrumented builds on specific MongoDB versions.


77-80: Coverage suffix logic gates on both event type and branch.

The suffix is applied only when the coverage input matches the current MongoDB version AND the event is a release or the branch is develop. This allows coverage images to coexist with production images in the registry without overwriting.

Verify that the interaction between DOCKER_TAG_SUFFIX_ROCKETCHAT here and the corresponding logic in .github/actions/build-docker/action.yml (line 104) and .github/workflows/ci.yml (line 319 et al.) correctly gates coverage image builds across all stages.

.github/actions/build-docker/action.yml (1)

24-31: Conditional Docker setup optimizes multi‑service builds.

The new setup-docker input (default true) allows the first service to set up buildx, while subsequent services in the same job skip this step. The explicit default for type ('production') improves clarity. This optimization reduces CI job duration when building multiple services per matrix entry.

.github/workflows/ci.yml (4)

246-250: Matrix exclusion reduces PR builds to coverage only.

The exclude rule skips 'production' type builds on pull requests, reducing CI load while the include rows (lines 291–297) ensure both architectures always receive coverage builds. This strategy effectively reduces parallel jobs for PRs as stated in the PR objective.


314-369: Conditional service build steps with selective Docker setup.

Each of the four steps conditionally runs based on whether the corresponding array index exists (matrix.service[N]), and the first step sets up Docker while subsequent steps skip it (setup-docker: false). This is a performant approach but requires careful maintenance of array indexing.

Ensure the conditional logic remains in sync with service grouping—a misaligned index could silently skip a service or attempt to build a non-existent array element.


281-297: Index-out-of-bounds concerns addressed—design is sound.

Verification of the service matrix grouping and step indexing confirms that all array accesses are properly protected:

  • Index [0]: All matrix entries contain this element; accessed unguarded and safely.
  • Indices [1], [2], [3]: Each is accessed only when the corresponding element exists, guarded by if: matrix.service[N] conditions. Step names use fallback syntax (|| '"skipped"') for undefined indices.

Group 2 ([rocketchat]) contains only one service, so indices [1]–[3] don't exist in that matrix entry. However, the conditional guards and fallback syntax prevent any errors or unexpected behavior. The implementation is defensive and robust.

The design, while complex, poses no alignment or bounds-checking risk.


710-794: All concerns verified—no issues found.

  1. skopeo availability: Confirmed pre-installed on ubuntu-24.04 (v1.13.3+); commented-out installation at line 725 is appropriate.

  2. docker-compose-ci.yml services: All services referenced in the build matrix are defined in docker-compose-ci.yml (rocketchat, authorization-service, account-service, presence-service, ddp-streamer-service, stream-hub-service, queue-worker-service, omnichannel-transcript-service).

  3. Digest artifact structure: Implementation is correct. The build-docker action creates /tmp/digests/{service}/{arch}/digest.txt, uploads as digests-{service}-{arch}-{type}, which the loop correctly consumes via the merge-multiple pattern and queries docker-compose-ci.yml using the service basename.

@rodrigok rodrigok force-pushed the chore/ci-improvements branch from 919c824 to dd64bc9 Compare November 10, 2025 20:18
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
.github/workflows/ci.yml (1)

710-794: Uncomment the skopeo installation commands to fix docker-image-publish failure.

The docker-image-publish job will fail at lines 780 and 788 because skopeo is not installed on ubuntu-24.04 and the installation commands are commented out (lines 724–725). Uncomment these lines to install skopeo before the Publish Docker images step.

The digest artifact naming is correct—build-docker produces digests-${{ inputs.service }}-${{ inputs.arch }}-${{ inputs.type }} artifacts, which match the digests-* pattern used at line 714.

.github/workflows/ci-test-e2e.yml (2)

198-214: Critical: Undefined variable in test exit logic will mask test failures.

Line 214 uses exit $s, but the variable $s is never initialized or assigned. This causes the step to always exit with status 0 (success) regardless of test outcome, masking failures. The intent is to capture and propagate the exit code from npm run testapi.

  - name: E2E Test API
    if: inputs.type == 'api'
    working-directory: ./apps/meteor
    env:
      WEBHOOK_TEST_URL: 'http://httpbin'
      IS_EE: ${{ inputs.release == 'ee' && 'true' || '' }}
      COVERAGE_DIR: '/tmp/coverage'
      COVERAGE_REPORTER: 'lcovonly'
    run: |
      set -o xtrace

      npm run testapi
-     
+     s=$?
      docker compose -f ../../docker-compose-ci.yml stop

      ls -l $COVERAGE_DIR
      exit $s

Alternatively, use set -e to fail on first error:

  run: |
    set -o xtrace
+   set -e

    npm run testapi
    docker compose -f ../../docker-compose-ci.yml stop
    ls -l $COVERAGE_DIR

257-293: Update codecov action to latest version for GitHub Actions compatibility.

The latest release of codecov/codecov-action is v5.5.1 (released September 4, 2025), while the workflow currently uses v3. Update both instances at lines 263 and 271 from codecov/codecov-action@v3 to codecov/codecov-action@v5.5.1.

♻️ Duplicate comments (1)
.github/workflows/ci.yml (1)

319-319: Duplicate: Fix operator precedence in coverage suffix logic.

This issue was flagged in a previous review. Line 319 uses incorrect operator precedence; fix it to match the simpler, correct logic used on lines 332, 346, and 360:

-          DOCKER_TAG_SUFFIX_ROCKETCHAT: ${{ (matrix.type == 'coverage' && github.event_name == 'release' || github.ref == 'refs/heads/develop') && '-cov' || '' }}
+          DOCKER_TAG_SUFFIX_ROCKETCHAT: ${{ matrix.type == 'coverage' && '-cov' || '' }}
🧹 Nitpick comments (2)
.github/workflows/ci.yml (1)

356-368: Consider removing unused service index check or expand service matrix.

The workflow defines only 3 service groups (indices 0–2, lines 282–286), but the step at line 356–368 checks matrix.service[3], which will always be undefined. This step will skip harmlessly, but it wastes action evaluation time.

Either remove this step or confirm that a 4th service group should be added to the matrix.

If this step is intentional as a placeholder, consider adding a clarifying comment. If it's leftover, remove it:

-      - name: Image ${{ matrix.service[3] || '"skipped"' }}
-        uses: ./.github/actions/build-docker
-        if: matrix.service[3] && (github.event.pull_request.head.repo.full_name == github.repository || github.event_name == 'release' || github.ref == 'refs/heads/develop') && github.actor != 'dependabot[bot]'
-        env:
-          DOCKER_TAG_SUFFIX_ROCKETCHAT: ${{ matrix.type == 'coverage' && '-cov' || '' }}
-        with:
-          CR_USER: ${{ secrets.CR_USER }}
-          CR_PAT: ${{ secrets.CR_PAT }}
-          deno-version: ${{ needs.release-versions.outputs.deno-version }}
-          arch: ${{ matrix.arch }}
-          service: ${{ matrix.service[3] }}
-          type: ${{ matrix.type }}
-          setup-docker: false
.github/workflows/ci-test-e2e.yml (1)

153-156: Consider adding --wait flag for httpbin startup consistency.

The httpbin container is started without the --wait flag used elsewhere in the workflow (lines 168, 179). Adding --wait ensures httpbin is fully ready before proceeding to the API tests, improving reliability.

  - name: Start httpbin container and wait for it to be ready
    if: inputs.type == 'api'
    run: |
-     docker compose -f docker-compose-ci.yml up -d httpbin
+     docker compose -f docker-compose-ci.yml up -d httpbin --wait
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 919c824 and dd64bc9.

📒 Files selected for processing (5)
  • .github/actions/build-docker/action.yml (4 hunks)
  • .github/actions/meteor-build/action.yml (2 hunks)
  • .github/actions/setup-playwright/action.yml (1 hunks)
  • .github/workflows/ci-test-e2e.yml (8 hunks)
  • .github/workflows/ci.yml (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/actions/build-docker/action.yml
  • .github/actions/meteor-build/action.yml
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-16T22:08:51.490Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-09-16T22:08:51.490Z
Learning: Applies to playwright.config.ts : Reference and align with settings defined in playwright.config.ts for global Playwright configuration

Applied to files:

  • .github/actions/setup-playwright/action.yml
📚 Learning: 2025-09-16T22:08:51.490Z
Learnt from: CR
Repo: RocketChat/Rocket.Chat PR: 0
File: .cursor/rules/playwright.mdc:0-0
Timestamp: 2025-09-16T22:08:51.490Z
Learning: Applies to apps/meteor/tests/e2e/**/*.spec.ts : Utilize Playwright fixtures (test, page, expect) consistently

Applied to files:

  • .github/actions/setup-playwright/action.yml
🪛 actionlint (1.7.8)
.github/workflows/ci.yml

315-315: description is required in metadata of "Build Docker" action at "/home/jailuser/git/.github/actions/build-docker/action.yml"

(action)

.github/workflows/ci-test-e2e.yml

262-262: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)


270-270: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: 📦 Build Packages
  • GitHub Check: CodeQL-Build
  • GitHub Check: CodeQL-Build
🔇 Additional comments (12)
.github/actions/setup-playwright/action.yml (1)

14-14: Correct multi-arch cache separation for platform-specific binaries.

The updated cache key now includes ${{ runner.os }} and ${{ runner.arch }}, ensuring separate cache entries per platform. This is the correct approach since Playwright binaries are platform-specific and prevents cache mismatches across heterogeneous runner environments (e.g., Linux ARM64 vs x86-64).

.github/workflows/ci.yml (5)

302-311: LGTM: Clean packages artifact restoration.

The restore-and-unpack approach for pre-built packages is standard and correctly implemented. The artifact is downloaded and extracted into the working directory, ready for per-service image builds.


281-297: LGTM: Well-structured multi-arch, per-service matrix.

The matrix redesign consolidates multiple architectures and service groups into a single, manageable structure. The exclude/include rules effectively control when builds run (e.g., production only on release/develop, coverage always for rocketchat). This reduces code duplication and improves clarity.


522-523: Verify downstream workflow accepts mongodb-version array and coverage parameter.

The new EE test configurations pass mongodb-version as a JSON array string and introduce a coverage parameter. Ensure that the called ci-test-e2e.yml workflow is updated to accept and properly parse these inputs.

Please confirm that .github/workflows/ci-test-e2e.yml has been updated to accept the mongodb-version array format and the new coverage parameter. If not included in this PR, verify that these changes are in a companion commit.


315-315: Ensure build-docker action has description metadata.

The static analysis tool flags that .github/actions/build-docker/action.yml is missing a description field. While this doesn't affect ci.yml execution, adding a description improves discoverability and aligns with GitHub Actions best practices.

Add a description to the action metadata file (not shown in this review) or verify that it's present if this was updated in a related commit.


71-72: LGTM: SOURCE_HASH caching strategy well-documented.

The commented alternative for including the run ID in SOURCE_HASH is a useful reference for future cache-invalidation scenarios. The current approach (file-hash-based, with -v3 suffix) is appropriate for reusable cross-run caching.

.github/workflows/ci-test-e2e.yml (6)

44-46: Coverage parameter appropriately added for conditional coverage logic.


77-80: Coverage suffix and MongoDB version env variables correctly gated.


158-162: Coverage directory preparation correctly gated and permissioned.


164-179: Container orchestration via docker-compose with health checks looks solid.


216-239: E2E UI test coverage flag correctly gated and conditionally set.


249-255: Comprehensive failure logging across services improves observability.

@ggazzo ggazzo added this to the 7.13.0 milestone Nov 11, 2025
@ggazzo ggazzo merged commit 4085cf7 into develop Nov 11, 2025
83 of 85 checks passed
@ggazzo ggazzo deleted the chore/ci-improvements branch November 11, 2025 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants