[v1.8.x][BACKPORT] Stabilizing CI and making binaries Apache compliant #20015
apache#19764) (apache#19930)
* Enable CUDA 11.0 on nightly development builds (apache#19295) Remove CUDA 9.2 and CUDA 10.0
* [PIP] add build variant for cuda 11.2 (apache#19764)
* adding ci docker files for cu111 and cu112
* removing previous CUDA make versions and adding support for cuda11.2

Co-authored-by: waytrue17 <[email protected]>
Co-authored-by: Sheng Zha <[email protected]>
Co-authored-by: Rohit Kumar Srivastava <[email protected]>
…eline (apache#19974)
* migrating cd builds to ninja + removing static links to nvidia libs and legacy cuda versions
* installing NCCL manually for cuda11.2 container
* set MSHADOW_USE_CUDNN=1 in CMakelists of mshadow to build properly for CUDNN support
* adding coverage to cd requirements file to fix cu100, cu101 and cu102 tests
* updating cd_test containers to ubuntu 18
* adding cmake config for linux native and adding USE_KV_STORE in linux_cpu
* updating zmq builds to statically link to libmxnet.so
* updating toolchains for r, clang and llvm for ubuntu18. OpenBlas Static link for 'distribution' build type only. Fix caffe build to use openCV 3. Remove legacy Clang 3.9 from CI
* fix versions for pip install in ubuntu_core_sh; add new search path for cuDNN
* fixing cudnn link problem for CUDA<=11.0
* adding library paths for libjpegturbo and lapack to fix failing CI on ubuntu 18 images
* removing ASAN integration test from miscellaneous CI as it's not required
* fix lapack path for gpu builds
* correctly installing libjpegturbo for ubuntu 18
* updating docker images of r, jekyll, julia etc test containers + fix java version to 8
* installing libomp.so
* removing debug test as it's not required. Code clean-up
* adding alternate URL source for MNIST dataset as original website is down
* skipping flaky tests issue tracked apache#20011

Co-authored-by: Rohit Kumar Srivastava <[email protected]>
@samskalicky once this PR merges, our binaries will be Apache compliant.
@samskalicky can you review?
Co-authored-by: Rohit Kumar Srivastava <[email protected]>
@leezu should we upgrade the CI in 1.x to Ubuntu 18 from 16? I thought we were only doing that for master/2.0 and later.
@samskalicky it's already merged in v1.x
@mxnet-bot run ci [unix-gpu]
Jenkins CI successfully triggered: [unix-gpu]
Left some minor comments.
cd/Jenkinsfile_release_job
Outdated
@@ -42,8 +42,8 @@ pipeline {
// Using string instead of choice parameter to keep the changes to the parameters minimal to avoid
// any disruption caused by different COMMIT_ID values chaning the job parameter configuration on
// Jenkins.
string(defaultValue: "mxnet_lib", description: "Pipeline to build", name: "RELEASE_JOB_TYPE")
string(defaultValue: "cpu,native,cu100,cu101,cu102,cu110", description: "Comma separated list of variants", name: "MXNET_VARIANTS")
string(defaultValue: "mxnet_lib/static", description: "Pipeline to build", name: "RELEASE_JOB_TYPE")
should be mxnet_lib, we removed "/static" in a previous commit dd4661a#diff-dd43bbf192e508d18e340337cf5a6094e137fba710759718cfbde6cf38e27a54R45
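For readers unfamiliar with the MXNET_VARIANTS parameter in the hunk above: it is passed as a single comma-separated string. A minimal sketch of how such a value could be split into individual variants follows. The function name `parse_variants` is hypothetical and is not taken from the MXNet CD scripts; it only illustrates the string format.

```python
def parse_variants(raw):
    """Split a comma-separated variant string into a list of variant names.

    Hypothetical helper for illustration only; surrounding whitespace and
    empty entries are dropped.
    """
    return [item.strip() for item in raw.split(",") if item.strip()]


# The default value shown in the diff hunk above.
variants = parse_variants("cpu,native,cu100,cu101,cu102,cu110")
print(variants)
```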
cd/utils/artifact_repository.md
Outdated
@@ -17,7 +17,7 @@

# Artifact Repository - Pushing and Pulling libmxnet

The artifact repository is an S3 bucket accessible only to restricted Jenkins nodes. It is used to store compiled MXNet artifacts that can be used by downstream CD pipelines to package the compiled libraries for different delivery channels (e.g. DockerHub, PyPI, Maven, etc.). The S3 object keys for the files being posted will be prefixed with the following distinguishing characteristics of the binary: branch, commit id, operating system, variant and dependency linking strategy (static or dynamic). For instance, s3://bucket/73b29fa90d3eac0b1fae403b7583fdd1529942dc/ubuntu16.04/cu92mkl/static/libmxnet.so
The artifact repository is an S3 bucket accessible only to restricted Jenkins nodes. It is used to store compiled MXNet artifacts that can be used by downstream CD pipelines to package the compiled libraries for different delivery channels (e.g. DockerHub, PyPI, Maven, etc.). The S3 object keys for the files being posted will be prefixed with the following distinguishing characteristics of the binary: branch, commit id, operating system, variant and dependency linking strategy (static or dynamic). For instance, s3://bucket/73b29fa90d3eac0b1fae403b7583fdd1529942dc/ubuntu16.04/cu100/static/libmxnet.so
nit: it's ubuntu18.04 now in s3 folders
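As an illustration of the key layout described in artifact_repository.md above, here is a minimal sketch that assembles an object key from the listed characteristics. It is not the repository's actual implementation: the function name `build_artifact_key` and its parameters are hypothetical, and the segment order simply mirrors the example path in the doc (commit id, operating system, variant, linking strategy, file name). The doc text also lists the branch as a prefix, which the example path does not show, so the sketch treats it as optional.

```python
def build_artifact_key(commit_id, os_name, variant, linking,
                       filename="libmxnet.so", branch=None):
    """Assemble an S3 object key following the layout in the doc's example.

    Hypothetical helper for illustration only. `linking` must be either
    "static" or "dynamic", the two strategies named in the doc.
    """
    if linking not in ("static", "dynamic"):
        raise ValueError("linking must be 'static' or 'dynamic'")
    parts = [commit_id, os_name, variant, linking, filename]
    if branch:
        # Assumption: when a branch is given, it leads the key.
        parts.insert(0, branch)
    return "/".join(parts)


key = build_artifact_key(
    "73b29fa90d3eac0b1fae403b7583fdd1529942dc",
    "ubuntu18.04",  # per the review comment above
    "cu100",
    "static",
)
print("s3://bucket/" + key)
```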
RUN /work/deb_ubuntu_ccache.sh

COPY install/ubuntu_python.sh /work/
COPY install/requirements /work/
remove this, duplicate of line 25
LGTM, thanks for making the release ASF compliant.
…pache#20015)
* [BACKPORT]Enable CUDA 11.0 on nightly + CUDA 11.2 on pip (apache#19295)(apache#19764) (apache#19930)
* Enable CUDA 11.0 on nightly development builds (apache#19295) Remove CUDA 9.2 and CUDA 10.0
* [PIP] add build variant for cuda 11.2 (apache#19764)
* adding ci docker files for cu111 and cu112
* removing previous CUDA make versions and adding support for cuda11.2

Co-authored-by: waytrue17 <[email protected]>
Co-authored-by: Sheng Zha <[email protected]>
Co-authored-by: Rohit Kumar Srivastava <[email protected]>

* [FEATURE]Migrating all CD pipelines to Ninja build + fix cu112 CD pipeline (apache#19974)
* migrating cd builds to ninja + removing static links to nvidia libs and legacy cuda versions
* installing NCCL manually for cuda11.2 container
* set MSHADOW_USE_CUDNN=1 in CMakelists of mshadow to build properly for CUDNN support
* adding coverage to cd requirements file to fix cu100, cu101 and cu102 tests
* updating cd_test containers to ubuntu 18
* adding cmake config for linux native and adding USE_KV_STORE in linux_cpu
* updating zmq builds to statically link to libmxnet.so
* updating toolchains for r, clang and llvm for ubuntu18. OpenBlas Static link for 'distribution' build type only. Fix caffe build to use openCV 3. Remove legacy Clang 3.9 from CI
* fix versions for pip install in ubuntu_core_sh; add new search path for cuDNN
* fixing cudnn link problem for CUDA<=11.0
* adding library paths for libjpegturbo and lapack to fix failing CI on ubuntu 18 images
* removing ASAN integration test from miscellaneous CI as it's not required
* fix lapack path for gpu builds
* correctly installing libjpegturbo for ubuntu 18
* updating docker images of r, jekyll, julia etc test containers + fix java version to 8
* installing libomp.so
* removing debug test as it's not required. Code clean-up
* adding alternate URL source for MNIST dataset as original website is down
* skipping flaky tests issue tracked apache#20011

Co-authored-by: Rohit Kumar Srivastava <[email protected]>

* update cudnn from 7 to 8 for cu102 (apache#19506)
* update cudnn from 7 to 8 for cu102 (apache#19522)
* downloading MNIST dataset from alternate URL (apache#20014)

Co-authored-by: Rohit Kumar Srivastava <[email protected]>

* fixing CI issue with v1.8.x
* addressing review comments

Co-authored-by: waytrue17 <[email protected]>
Co-authored-by: Sheng Zha <[email protected]>
Co-authored-by: Rohit Kumar Srivastava <[email protected]>
Co-authored-by: Manu Seth <[email protected]>
Description
Backports PRs #20014, #19930, #19974, #19506, and #19522.
Checklist
Essentials
Testing
Tested on local CD pipeline identical to the one for v1.8.x: https://jenkins.mxnet-ci.amazon-ml.com/job/restricted-mxnet-cd/job/rohit_v1.8.x/