This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

CI: cran broken #18042

Closed
leezu opened this issue Apr 13, 2020 · 2 comments

leezu commented Apr 13, 2020

[2020-04-13T16:57:58.011Z] W: GPG error: http://cran.rstudio.com/bin/linux/ubuntu trusty/ Release: The following signatures were invalid: BADSIG 51716619E084DAB9 Michael Rutter <[email protected]>

[2020-04-13T16:57:58.011Z] W: The repository 'http://cran.rstudio.com/bin/linux/ubuntu trusty/ Release' is not signed.

[2020-04-13T16:57:58.011Z] E: Failed to fetch store:/var/lib/apt/lists/partial/cran.rstudio.com_bin_linux_ubuntu_trusty_Packages.gz  Hash Sum mismatch

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Fclang/detail/PR-17984/35/pipeline
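The log excerpt above shows two distinct apt failure modes: an invalid repository signature (`BADSIG`) and a `Hash Sum mismatch` on the package index. As a rough sketch of how a CI wrapper might tell them apart, the snippet below classifies apt output lines. The function name `classify_apt_line` and the category labels are hypothetical and are not part of the MXNet CI scripts; the patterns are taken only from the lines quoted in this issue.

```python
import re

# Patterns derived from the log excerpt in this issue; the labels are illustrative.
_PATTERNS = [
    ("bad_signature", re.compile(r"GPG error: .*BADSIG\s+(?P<key>[0-9A-F]{16})")),
    ("unsigned_repo", re.compile(r"The repository '.*' is not signed")),
    ("hash_mismatch", re.compile(r"Hash Sum mismatch")),
]


def classify_apt_line(line):
    """Return (label, key_id) for a known apt failure line, else (None, None)."""
    for label, pattern in _PATTERNS:
        match = pattern.search(line)
        if match:
            # Only the BADSIG pattern captures a key id; others yield None.
            return label, match.groupdict().get("key")
    return None, None
```

A wrapper could, for instance, retry on `hash_mismatch` (often a transient mirror problem) but fail fast on `bad_signature`; that policy is a design assumption, not something stated in this thread.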

@leezu leezu added the Bug label Apr 13, 2020
leezu added a commit to leezu/mxnet that referenced this issue Apr 13, 2020

leezu commented Apr 13, 2020

I suggest using the system-provided R instead of relying on external repos on CI.

To test newer versions of R, we can switch Ubuntu 16.04 -> 18.04.

In general, relying on system dependencies will make our CI more reliable.

leezu added a commit to leezu/mxnet that referenced this issue Apr 13, 2020
leezu added a commit to leezu/mxnet that referenced this issue Apr 13, 2020
leezu added a commit that referenced this issue Apr 14, 2020
As per #17968, require C++17 compatible compiler. For cuda code, use C++14 mode introduced in Cuda 9. C++17 support for Cuda will be available in Cuda 11.

Switching to C++17 requires modernizing the toolchain, which exposed a number of technical debt issues in the codebase. All blocking issues are fixed as part of this PR. See the full list below.

This PR contains the following specific changes:

    Switch CI pipeline to use gcc7 on Ubuntu and CentOS
    Switch CD pipeline to CentOS 7 with https://www.softwarecollections.org/en/scls/rhscl/devtoolset-7/. This enables us to build with the gcc7 C++17 compiler while keeping a relatively old glibc requirement for distribution.
    Simplify ARM Edge builds
        Switch to standard Ubuntu / Debian cross-compilation toolchain for ARMv7, ARMv8
        Switch to https://toolchains.bootlin.com/ toolchain for ARMv6 (the Debian ARMv6 toolchain is for ARMv4 + ARMv5 + ARMv6, but we wish to only target ARMv6 and make use of ARMv6 features)
        Remove reliance on dockcross for cross compilation.
    Simplify Jetson build
        Use standard Ubuntu / Debian cross-compilation toolchain for ARMv8
        Upgrade to Cuda 10 and Jetpack 4.3
        Simplify build setup
    Simplify QEMU ARM virtualization test setup on CI
        Remove complex "Virtual Machine in Docker" logic and run a QEMU based Docker container instead based on arm32v7/ubuntu
    Fix out of bounds vector accesses in
        SoftmaxGradOpType
        MKLDNNFCBackward
    Fix use of the non-standard rand_r function (which is no longer available on newer Android toolchains and shouldn't be used in any case).
    Fix reproducibility of RNN with Dropout
    Fix reproducibility of DGL Graph Sampling Operators
    Update tests for Android Edge build to NDK19. The previously used standalone toolchain is obsolete.

Those Dockerfiles that required refactoring as part of the effort were refactored based on the following considerations:

    Maximize the use of system dependencies provided by the distribution instead of manually installing dependencies from source or from third-party vendors. This reduces the complexity of the installation process and essentially pins the dependency versions, increasing CI stability. Further, Dockerfile build speed is improved. To facilitate this, use recent distribution versions. We still ensure backwards compatibility via CentOS7 based build and test stages.
    Minimize the number of layers in the Dockerfile. Don't have 5 different script files executed, each calling apt-get update and install, but just execute once. This speeds up the build and reduces image size. Keep each Dockerfile simple and tailored to a purpose, instead of running 20 scripts to install dependencies for every conceivable scenario, which is unmaintainable.
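
The second consideration can be sketched as follows: rather than several scripts that each run `apt-get update` and `apt-get install`, collect the package names and emit a single `RUN` instruction. The helper name `single_run_instruction` and the package groups are hypothetical examples for illustration, not taken from the MXNet Dockerfiles.

```python
def single_run_instruction(*package_groups):
    """Build one Dockerfile RUN line that installs all packages in a single layer."""
    # De-duplicate and sort so the generated line is stable across runs.
    packages = sorted({pkg for group in package_groups for pkg in group})
    if not packages:
        raise ValueError("no packages given")
    return (
        "RUN apt-get update && "
        "apt-get install -y --no-install-recommends " + " ".join(packages) + " && "
        "rm -rf /var/lib/apt/lists/*"
    )


# Two overlapping (hypothetical) package groups collapse into one layer.
print(single_run_instruction(["build-essential", "ccache"], ["r-base", "ccache"]))
```

Cleaning `/var/lib/apt/lists` in the same instruction keeps the package index out of the resulting layer, which is one common way to get the image-size reduction mentioned above.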

Some more small changes:

    Remove outdated references to Cuda 7 and Cuda 8 in various files.
    Remove C++03 support in mshadow
    Disable broken tests
        NumpyBooleanAssignForwardCPU #17990
        test_init.test_rsp_const_init #17988
        quantized_elemwise_mul #18034

List of squashed commits

* cpp standard

* Remove leftover files of Cuda 7 and Cuda 8 support

* thrust 1.9.8 for clang10

* compiler warnings

* Disable broken test_init.test_rsp_const_init

* Disable tests invoking NumpyBooleanAssignForwardCPU

* Fix out of bounds access in SoftmaxGradOpType

* Use CentOS 7 for staticbuilds

CentOS 7 fulfills the requirements for PEP 599 manylinux-2014 and provides a
C++17 toolchain.

* Fix MKLDNNFCBackward

* Update edge toolchain

* Support platforms without rand_r

* Cleanup random.h

* Greatly simplify qemu setup

* Remove unused functions in Jenkins_steps.groovy

* Skip quantized_elemwise_mul due to QuantizedElemwiseMulOpShape bug

* Fix R package installation

#18042

* Fix centos ccache

* Fix GPU Makefile staticbuild on CentOS7

* CentOS7 NCCL

* CentOS7 staticbuild fix link with libculibos
@leezu leezu closed this as completed Apr 14, 2020
AntiZpvoh pushed a commit to AntiZpvoh/incubator-mxnet that referenced this issue Jul 6, 2020
Zha0q1 pushed a commit to Zha0q1/SMDDP-Examples that referenced this issue Aug 16, 2021