49 changes: 49 additions & 0 deletions .github/workflows/tests-nightly.yml
@@ -106,6 +106,13 @@ jobs:
             num_gpus: 1
             axolotl_extras:
             nightly_build: "true"
+          - cuda: 126
+            cuda_version: 12.6.3
+            python_version: "3.11"
+            pytorch: 2.7.1
+            num_gpus: 1
+            axolotl_extras:
+            nightly_build: "true"
     steps:
       - name: Checkout
         uses: actions/checkout@v4
@@ -130,3 +137,45 @@ jobs:
       - name: Run tests job on Modal
         run: |
           modal run cicd.e2e_tests
+  docker-e2e-multigpu-tests:
+    if: github.repository_owner == 'axolotl-ai-cloud'
+    # this job needs to be run on self-hosted GPU runners...
+    runs-on: [self-hosted, modal]
+    timeout-minutes: 120
+    needs: [pre-commit, pytest, docker-e2e-tests]
+
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - cuda: 126
+            cuda_version: 12.6.3
+            python_version: "3.11"
+            pytorch: 2.7.1
+            num_gpus: 2
+            axolotl_extras:
+            nightly_build: "true"
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Install Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+      - name: Install Modal
+        run: |
+          python -m pip install --upgrade pip
+          pip install modal==1.0.2 jinja2
+      - name: Update env vars
+        run: |
+          echo "BASE_TAG=main-base-py${{ matrix.python_version }}-cu${{ matrix.cuda }}-${{ matrix.pytorch }}" >> $GITHUB_ENV
+          echo "PYTORCH_VERSION=${{ matrix.pytorch}}" >> $GITHUB_ENV
+          echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV
+          echo "AXOLOTL_EXTRAS=${{ matrix.axolotl_extras}}" >> $GITHUB_ENV
+          echo "CUDA=${{ matrix.cuda }}" >> $GITHUB_ENV
+          echo "N_GPUS=${{ matrix.num_gpus }}" >> $GITHUB_ENV
+          echo "NIGHTLY_BUILD=${{ matrix.nightly_build }}" >> $GITHUB_ENV
Comment on lines +170 to +177

⚠️ Potential issue

matrix.axolotl_args is never defined: AXOLOTL_ARGS will always be exported empty
Both E2E jobs export:

echo "AXOLOTL_ARGS=${{ matrix.axolotl_args }}" >> $GITHUB_ENV

Yet none of the matrix.include dictionaries declare an axolotl_args key.
actionlint reports this as an error (property "axolotl_args" is not defined). At run time GitHub Actions evaluates a nonexistent context property to an empty string, so the job still starts, but AXOLOTL_ARGS is silently empty and the omission is easy to miss.

Minimal fix – add an empty value to every matrix entry:

           num_gpus: 1
           axolotl_extras:
+          axolotl_args: ""
           nightly_build: "true"

…and similarly for the multi-GPU row.
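
The kind of check that flags this can be approximated in a few lines. What follows is a minimal Python sketch, not actionlint's implementation: it scans a step script for `${{ matrix.<key> }}` references and reports keys that are missing from a matrix row. `STEP_SCRIPT`, `MATRIX_INCLUDE`, and `undefined_matrix_keys` are hypothetical names introduced for illustration; the script lines and the matrix row are copied from the multi-GPU job in the diff.

```python
import re

# Trimmed copy of the "Update env vars" step; only the text matters here.
STEP_SCRIPT = '''
echo "BASE_TAG=main-base-py${{ matrix.python_version }}-cu${{ matrix.cuda }}-${{ matrix.pytorch }}" >> $GITHUB_ENV
echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV
echo "AXOLOTL_EXTRAS=${{ matrix.axolotl_extras}}" >> $GITHUB_ENV
echo "N_GPUS=${{ matrix.num_gpus }}" >> $GITHUB_ENV
echo "NIGHTLY_BUILD=${{ matrix.nightly_build }}" >> $GITHUB_ENV
'''

# The multi-GPU matrix row, written as a plain dict (assumption: no YAML parser needed).
MATRIX_INCLUDE = [
    {
        "cuda": 126,
        "cuda_version": "12.6.3",
        "python_version": "3.11",
        "pytorch": "2.7.1",
        "num_gpus": 2,
        "axolotl_extras": None,  # declared in the workflow, but with no value
        "nightly_build": "true",
    },
]

# Matches ${{ matrix.some_key }} with optional whitespace inside the braces.
MATRIX_REF = re.compile(r"\$\{\{\s*matrix\.([A-Za-z_][A-Za-z0-9_-]*)\s*\}\}")


def undefined_matrix_keys(script, include):
    """Return matrix keys referenced in `script` that at least one row does not declare."""
    referenced = set(MATRIX_REF.findall(script))
    missing = set()
    for row in include:
        missing |= referenced - set(row)
    return sorted(missing)


print(undefined_matrix_keys(STEP_SCRIPT, MATRIX_INCLUDE))  # ['axolotl_args']
```

Adding `"axolotl_args": ""` to the row makes the function return an empty list, which mirrors the suggested fix.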

🧰 Tools
🪛 actionlint (1.7.7)

170-170: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}

(expression)

🤖 Prompt for AI Agents
In .github/workflows/tests-nightly.yml around lines 170 to 177, the matrix
property axolotl_args is used but not defined in any matrix.include entry.
actionlint reports this as an error, and at run time the expression evaluates
to an empty string. To fix this, add an axolotl_args key with an empty string
value to every matrix.include dictionary where it is missing, so the property
exists for all matrix configurations and the lint error is resolved.

+          echo "CODECOV_TOKEN=${{ secrets.CODECOV_TOKEN }}" >> $GITHUB_ENV
+      - name: Run tests job on Modal
+        run: |
+          modal run cicd.multigpu
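
For orientation, the `BASE_TAG` line in the `Update env vars` step builds an image tag from three matrix values. The Python sketch below reproduces that string interpolation for the multi-GPU row. `ROW` and `base_tag` are hypothetical names used only for illustration, and the sketch assumes the values are substituted as plain text.

```python
# Values copied from the multi-GPU matrix row in the diff.
ROW = {"python_version": "3.11", "cuda": 126, "pytorch": "2.7.1"}


def base_tag(row):
    """Mirror the template main-base-py<python_version>-cu<cuda>-<pytorch>."""
    return f"main-base-py{row['python_version']}-cu{row['cuda']}-{row['pytorch']}"


print(base_tag(ROW))  # main-base-py3.11-cu126-2.7.1
```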