Add dockerfile for CI runners #199

Merged
94 commits merged into from
Nov 17, 2022
e4d8041
wip
dagardner-nv Oct 7, 2022
4fc5b5a
wip
dagardner-nv Oct 7, 2022
5c01074
wip
dagardner-nv Oct 7, 2022
5d8ffdf
Set srf_root
dagardner-nv Oct 7, 2022
7b645fd
echo clang yml file
dagardner-nv Oct 7, 2022
4c4908d
Set PARALLEL_LEVEL
dagardner-nv Oct 7, 2022
f39b911
Add conda pkg
dagardner-nv Oct 7, 2022
e82565c
fix expr
dagardner-nv Oct 7, 2022
219aa2a
fix deps
dagardner-nv Oct 7, 2022
e23cb62
fix deps
dagardner-nv Oct 7, 2022
8daaae7
Only run the package stage on the main branches
dagardner-nv Oct 7, 2022
b8a0bd0
Benchmark doesn't need a lot of cores
dagardner-nv Oct 7, 2022
29da83e
Remove Jenkins pipeline
dagardner-nv Oct 7, 2022
084a5f2
Exclude any env var with 'TOKEN' in the name
dagardner-nv Oct 10, 2022
5e44234
Set the CONDA_TOKEN and upload results to conda-forge
dagardner-nv Oct 10, 2022
1e0e413
wip
dagardner-nv Oct 10, 2022
71d41e2
use rapids-logger
dagardner-nv Oct 10, 2022
4851dcf
Install libnuma
dagardner-nv Oct 10, 2022
957984a
move iwyu dir
dagardner-nv Oct 10, 2022
01122d8
wip
dagardner-nv Oct 10, 2022
63eafd7
wip
dagardner-nv Oct 10, 2022
4717248
Add logging
dagardner-nv Oct 11, 2022
9bff3b9
Dockerfile for runner
dagardner-nv Oct 11, 2022
396ec42
fix
dagardner-nv Oct 11, 2022
de8a3ad
fix
dagardner-nv Oct 11, 2022
cd04116
fix
dagardner-nv Oct 11, 2022
e74c942
fix
dagardner-nv Oct 11, 2022
48fdc05
fix
dagardner-nv Oct 11, 2022
c40efcf
wip
dagardner-nv Oct 11, 2022
4f167b6
wip
dagardner-nv Oct 11, 2022
f8e2ee6
Specify SRF_PYTHON_BUILD_STUBS var
dagardner-nv Oct 12, 2022
37524a5
wip
dagardner-nv Oct 12, 2022
00948ab
Fix tests so that the propper upstream build is used for the coverage…
dagardner-nv Oct 12, 2022
fe456e9
Default stub building to on
dagardner-nv Oct 12, 2022
8e8e5ce
wip
dagardner-nv Oct 12, 2022
4d5252e
Don't use a gpu for the conda stage
dagardner-nv Oct 13, 2022
1b3547c
Add libnvidia-compute-495 which provides libnvidia-ml1.so
dagardner-nv Oct 13, 2022
57930cb
Enable stubs
dagardner-nv Oct 13, 2022
11db714
Explicitly enable stubs
dagardner-nv Oct 13, 2022
53335cf
temp, please revert don't merge
dagardner-nv Oct 13, 2022
e13c3f4
Revert "temp, please revert don't merge"
dagardner-nv Oct 13, 2022
dd4d308
Ignore nvml errors if SRF_IGNORE_NO_GPU env var is set
dagardner-nv Oct 13, 2022
6e2657d
Don't log when SRF_IGNORE_NO_GPU is set:
dagardner-nv Oct 13, 2022
6362c6a
wip
dagardner-nv Oct 13, 2022
9e885d2
tested working
dagardner-nv Oct 13, 2022
41a950d
Install clang from apt instead of conda, trims 1GB off of the image size
dagardner-nv Oct 13, 2022
617e01d
Alpabatize apt pkgs, remove unneeded clang packages
dagardner-nv Oct 13, 2022
af69091
Image for CI runner
dagardner-nv Oct 19, 2022
c5ec4c5
Move dockerfile and add build instructions
dagardner-nv Oct 19, 2022
883c09e
Merge branch 'branch-22.11' into david-test-new-img
dagardner-nv Oct 20, 2022
4e76977
Benchmarks don't need the cache dir
dagardner-nv Oct 20, 2022
31aa7b3
Set the project name and the cuda version as args
dagardner-nv Oct 20, 2022
489b1dc
fix cr header
dagardner-nv Oct 20, 2022
bb706f5
remove hack
dagardner-nv Oct 20, 2022
d31ab9c
???
dagardner-nv Oct 20, 2022
643d479
make the clang version a docker arg
dagardner-nv Oct 20, 2022
de108bf
Install clang from Conda
dagardner-nv Oct 21, 2022
ba1fe88
Merge branch 'david-test-new-img' into david-ci-img
dagardner-nv Oct 31, 2022
1c9e268
Remove older iteration of ci img file, and set credentials for ngc
dagardner-nv Oct 31, 2022
2b1a934
Fix bad copy/paste error
dagardner-nv Oct 31, 2022
3f8de8a
Fix token
dagardner-nv Oct 31, 2022
cdb2167
Log early
dagardner-nv Oct 31, 2022
6763476
??
dagardner-nv Oct 31, 2022
b1f7374
Log some version info before failing
dagardner-nv Oct 31, 2022
733618b
Try removing NVIDIA_REQUIRE_CUDA var
dagardner-nv Oct 31, 2022
38f9cd7
Print envs first
dagardner-nv Nov 1, 2022
e55827c
revert wip
dagardner-nv Nov 1, 2022
d26e564
Temp disable build and check, run test with a 450 driver
dagardner-nv Nov 1, 2022
3296654
Only run the test and use the base image
dagardner-nv Nov 1, 2022
fc379cb
Fix syntax
dagardner-nv Nov 1, 2022
85df943
Fix image name
dagardner-nv Nov 1, 2022
cb78189
Revert "Fix image name"
dagardner-nv Nov 1, 2022
2896021
Revert "Fix syntax"
dagardner-nv Nov 1, 2022
f27d142
Revert "Only run the test and use the base image"
dagardner-nv Nov 1, 2022
e76b927
Revert "Temp disable build and check, run test with a 450 driver"
dagardner-nv Nov 1, 2022
91f5b53
Optionally don't install libnvidia-compute-495
dagardner-nv Nov 1, 2022
7d7ecca
Switch to new image builds
dagardner-nv Nov 1, 2022
a9d91d7
Remove old work-around hacks
dagardner-nv Nov 1, 2022
a3c7e4d
Switch to using rapids-mamba-retry as a wrapper for mamba
dagardner-nv Nov 1, 2022
c0037be
Switch to a multi-stage dcker build
dagardner-nv Nov 1, 2022
d5a8808
Copy conda files using a wild-card
dagardner-nv Nov 2, 2022
fde171f
Separate ci pipeline into a re-usable workflow, add build/push script…
dagardner-nv Nov 2, 2022
8ffb9a5
Merge branch 'branch-22.11' into david-ci-img
dagardner-nv Nov 2, 2022
6332aa6
Remove run name
dagardner-nv Nov 2, 2022
308921e
Remove old comment
dagardner-nv Nov 2, 2022
f431864
Fix merge error:
dagardner-nv Nov 2, 2022
47ed8ff
Restore ci env file
dagardner-nv Nov 3, 2022
c907de0
Add codecov tool to ci image
dagardner-nv Nov 3, 2022
1092213
Update image version
dagardner-nv Nov 3, 2022
169f9f9
Fix path to codecov tool
dagardner-nv Nov 3, 2022
6bde01c
Add DOCKER_EXTRA_ARGS var
dagardner-nv Nov 3, 2022
f53a5f3
Remove out of date commen, and remove nvrpc from exclude list
dagardner-nv Nov 3, 2022
7997e3c
Revert "Remove out of date commen, and remove nvrpc from exclude list"
dagardner-nv Nov 3, 2022
8459f2d
Remove out of date comment
dagardner-nv Nov 3, 2022
230 changes: 230 additions & 0 deletions .github/workflows/ci_pipe.yml
@@ -0,0 +1,230 @@
# SPDX-FileCopyrightText: Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

name: CI Pipeline
run-name: CI Pipeline

on:
  workflow_call:
    inputs:
      aws_region:
        default: 'us-west-2'
        type: string
      run_check:
        required: true
        type: boolean
      run_package_conda:
        required: true
        type: boolean
      container:
        required: true
        type: string
      test_container:
        required: true
        type: string
    secrets:
      CODECOV_TOKEN:
        required: true
      CONDA_TOKEN:
        required: true
      GHA_AWS_ACCESS_KEY_ID:
        required: true
      GHA_AWS_SECRET_ACCESS_KEY:
        required: true
      NGC_API_KEY:
        required: true

env:
  AWS_DEFAULT_REGION: ${{ inputs.aws_region }}
  AWS_ACCESS_KEY_ID: "${{ secrets.GHA_AWS_ACCESS_KEY_ID }}"
  AWS_SECRET_ACCESS_KEY: "${{ secrets.GHA_AWS_SECRET_ACCESS_KEY }}"
  BUILD_CC: "gcc"
  CHANGE_TARGET: "${{ github.base_ref }}"
  GH_TOKEN: "${{ github.token }}"
  GIT_COMMIT: "${{ github.sha }}"
  SRF_ROOT: "${{ github.workspace }}/srf"
  WORKSPACE: "${{ github.workspace }}/srf"
  WORKSPACE_TMP: "${{ github.workspace }}/tmp"


jobs:
  check:
    if: ${{ inputs.run_check }}
    name: Check
    runs-on: [self-hosted, linux, amd64, cpu4]
    timeout-minutes: 60
    container:
      credentials:
        username: '$oauthtoken'
        password: ${{ secrets.NGC_API_KEY }}
      image: ${{ inputs.container }}
    strategy:
      fail-fast: true

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          lfs: false
          path: 'srf'
          fetch-depth: 0

      - name: Check
        shell: bash
        run: ./srf/ci/scripts/github/checks.sh

  build:
    name: Build
    runs-on: [self-hosted, linux, amd64, cpu16]
    timeout-minutes: 60
    container:
      credentials:
        username: '$oauthtoken'
        password: ${{ secrets.NGC_API_KEY }}
      image: ${{ inputs.container }}
    strategy:
      fail-fast: true
      matrix:
        build_cc: ["gcc", "gcc-coverage", "clang"]

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          lfs: false
          path: 'srf'

      - name: Build:linux:x86_64
        shell: bash
        env:
          BUILD_CC: ${{ matrix.build_cc }}
        run: ./srf/ci/scripts/github/build.sh

  test:
    name: Test
    needs: [build]
    runs-on: [self-hosted, linux, amd64, gpu-v100-495-1]
    timeout-minutes: 60
    container:
      credentials:
        username: '$oauthtoken'
        password: ${{ secrets.NGC_API_KEY }}
      image: ${{ inputs.test_container }}
      options: --cap-add=sys_nice
      env:
        NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }}
        PARALLEL_LEVEL: '10'
    strategy:
      fail-fast: true
      matrix:
        build_cc: ["gcc", "gcc-coverage"]

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          lfs: false
          path: 'srf'

      - name: Test:linux:x86_64
        shell: bash
        env:
          BUILD_CC: ${{ matrix.build_cc }}
          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
        run: ./srf/ci/scripts/github/test.sh

  documentation:
    name: Documentation
    needs: [build]
    runs-on: [self-hosted, linux, amd64, cpu4]
    timeout-minutes: 60
    container:
      credentials:
        username: '$oauthtoken'
        password: ${{ secrets.NGC_API_KEY }}
      image: ${{ inputs.container }}
    strategy:
      fail-fast: true

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          lfs: false
          path: 'srf'

      - name: build_docs
        shell: bash
        run: ./srf/ci/scripts/github/docs.sh

  benchmark:
    name: Benchmark
    needs: [build]
    runs-on: [self-hosted, linux, amd64, cpu4]
    timeout-minutes: 60
    container:
      credentials:
        username: '$oauthtoken'
        password: ${{ secrets.NGC_API_KEY }}
      image: ${{ inputs.container }}
      options: --cap-add=sys_nice
    strategy:
      fail-fast: true

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          lfs: false
          path: 'srf'

      - name: pre_benchmark
        shell: bash
        run: ./srf/ci/scripts/github/pre_benchmark.sh
      - name: benchmark
        shell: bash
        run: ./srf/ci/scripts/github/benchmark.sh
      - name: post_benchmark
        shell: bash
        run: ./srf/ci/scripts/github/benchmark.sh


  package:
    name: Package
    if: ${{ inputs.run_package_conda }}
    needs: [benchmark, documentation, test]
    runs-on: [self-hosted, linux, amd64, cpu16]
    timeout-minutes: 60
    container:
      credentials:
        username: '$oauthtoken'
        password: ${{ secrets.NGC_API_KEY }}
      image: ${{ inputs.container }}
    strategy:
      fail-fast: true

    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          lfs: false
          path: 'srf'
          fetch-depth: 0

      - name: conda
        shell: bash
        env:
          CONDA_TOKEN: "${{ secrets.CONDA_TOKEN }}"
        run: ./srf/ci/scripts/github/conda.sh
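
Because ci_pipe.yml is triggered only by workflow_call, it never runs on its own: another workflow in the repository has to invoke it and supply the inputs and secrets declared above. The following is a minimal sketch of such a caller, not the workflow used by this PR. The workflow name, the triggers, the branch pattern, the conditions used for run_check and run_package_conda, and both image references are assumptions made for illustration. Only the input names, the secret names, and the path to ci_pipe.yml come from the diff.

```yaml
# Hypothetical caller workflow (for example .github/workflows/pull_request.yml).
# Triggers, branch pattern, and image names below are illustrative placeholders.
name: Build pull request

on:
  pull_request:
  push:
    branches:
      - 'branch-*'

jobs:
  ci_pipe:
    # Call the reusable workflow that lives in the same repository.
    uses: ./.github/workflows/ci_pipe.yml
    with:
      # aws_region is optional and falls back to its default, 'us-west-2'.
      # Assumed policy: run the checks for pull requests, package on branch pushes.
      run_check: ${{ github.event_name == 'pull_request' }}
      run_package_conda: ${{ github.event_name == 'push' }}
      # Placeholder images; substitute the CI runner images built for the project.
      container: registry.example.com/ci/srf-driver:latest
      test_container: registry.example.com/ci/srf-test:latest
    secrets:
      # Every secret marked required in ci_pipe.yml must be passed explicitly.
      CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
      CONDA_TOKEN: ${{ secrets.CONDA_TOKEN }}
      GHA_AWS_ACCESS_KEY_ID: ${{ secrets.GHA_AWS_ACCESS_KEY_ID }}
      GHA_AWS_SECRET_ACCESS_KEY: ${{ secrets.GHA_AWS_SECRET_ACCESS_KEY }}
      NGC_API_KEY: ${{ secrets.NGC_API_KEY }}
```

Keeping the image references in the caller means a new CI image can be rolled out by editing one file, while the job definitions in ci_pipe.yml stay unchanged.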