This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[CI Infrastructure] R-MKLDNN-CPU test run failure. gcc fails due to ccache issue? #19304

Closed
DickJC123 opened this issue Oct 6, 2020 · 3 comments

@DickJC123

DickJC123 commented Oct 6, 2020

Description

While working on PR #19298, the following CI run failed in the unix-cpu R-MKLDNN-CPU test job: https://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/mxnet-validation%2Funix-cpu/detail/PR-19298/1/pipeline/304/. The failure does not appear to be related to the PR; it looks like a problem with the CI infrastructure and its use of ccache. Although the failure occurred during the test-running phase, gcc was invoked. The relevant error message is shown below.

Retrying the job did not reproduce the error.

Error Message

...
[2020-10-06T07:47:24.018Z] gcc -std=gnu99 -I/usr/share/R/include -DNDEBUG     -Iutf8lite/src -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c utf8lite/src/render.c -o utf8lite/src/render.o
...
[2020-10-06T07:47:24.018Z] ccache: error: /work/ccache/ccache.conf: No such file or directory
[2020-10-06T07:47:24.018Z] make[1]: *** [render.o] Error 1
[2020-10-06T07:47:24.018Z] make[1]: *** Waiting for unfinished jobs....
[2020-10-06T07:47:24.018Z] /usr/lib/R/etc/Makeconf:159: recipe for target 'render.o' failed
@leezu

leezu commented Oct 7, 2020

In v1.x branch, there is a global ccache configuration:

https://github.com/apache/incubator-mxnet/blob/3b69c607b611eb8f275aa8ed512a380ac0fd768d/ci/docker/runtime_functions.sh#L63-L111

We can just remove its invocation in the R language tests:

https://github.com/apache/incubator-mxnet/blob/3b69c607b611eb8f275aa8ed512a380ac0fd768d/ci/docker/runtime_functions.sh#L1186-L1191

If that slows down the R test stage too much, one can investigate updating the R toolchain.
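
For comparison, a less drastic alternative to removing ccache from the R tests would be to fall back to the plain compiler only when the cache directory is unusable. The following is a minimal sketch, not code from the MXNet repository. It assumes that ccache is enabled through a compiler-wrapper directory on `PATH` (an inference from the log, where a plain `gcc` call produced a ccache error) and that the cache lives in `/work/ccache` (taken from the error message). The function names `strip_ccache_from_path` and `ccache_dir_usable` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; these do not exist in ci/docker/runtime_functions.sh.

# Print the given PATH-style string with every entry containing "ccache" removed.
strip_ccache_from_path() {
    local input="$1" entry result=""
    local IFS=':'
    local -a parts
    read -ra parts <<< "$input"      # split on ':' without glob expansion
    for entry in "${parts[@]}"; do
        case "$entry" in
            *ccache*) ;;             # drop compiler-wrapper directories
            *) result="${result:+$result:}$entry" ;;
        esac
    done
    printf '%s\n' "$result"
}

# Succeed only if the given ccache directory exists and is writable.
ccache_dir_usable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Assumed default, taken from the error message in this issue.
CCACHE_DIR="${CCACHE_DIR:-/work/ccache}"
if ! ccache_dir_usable "$CCACHE_DIR"; then
    # Cache directory is missing: let gcc resolve to the real compiler.
    PATH="$(strip_ccache_from_path "$PATH")"
    export PATH
fi
```

Whether this would have avoided the failure depends on how the CI image actually wires ccache into the R build, which this thread does not show; simply removing the invocation, as proposed above, sidesteps that question.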

@leezu leezu added CI and removed needs triage labels Oct 7, 2020
leezu added a commit that referenced this issue Oct 7, 2020

#19304 reported flaky compilation failures related to CI ccache configuration.
@leezu leezu closed this as completed Oct 7, 2020
@leezu

leezu commented Oct 7, 2020

Closed via #19305

@DickJC123

Awesome, thanks for the quick turnaround!
