This repository was archived by the owner on May 9, 2024. It is now read-only.

[CI] Reproducibility of test failure #351

Open
Devjiu opened this issue Apr 4, 2023 · 5 comments
@Devjiu
Contributor

Devjiu commented Apr 4, 2023

The developer build and workflow differ significantly from the CI ones.

For example, build.yml (after conda setup, in the CPU case) builds the project with omniscidb/scripts/conda/build.sh, which sets some environment variables internally in a non-transparent way. A developer in the same conditions simply runs `cmake .. && make -j 32 && make install`.
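To make the contrast concrete, a minimal sketch of the two paths (the script path and developer commands are the ones quoted above; the out-of-source build directory is my assumption):

```sh
# CI path (build.yml, CPU case): env vars are exported inside the script
bash omniscidb/scripts/conda/build.sh

# Developer path, same conditions (assuming an out-of-source build dir):
mkdir -p build && cd build
cmake ..
make -j 32
make install
```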

pytest.yml and modin.yml take yet another approach to the build; they use: `$CONDA/bin/conda run -n ${{ env.CONDA_ENV }} sh -c "cmake .. -DENABLE_CUDA=off -DENABLE_CONDA=on -DENABLE_PYTHON=on -DCMAKE_BUILD_TYPE=release && make -j2 && make install"`
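Unrolled into something you can paste into a shell (assumption: `CONDA_ENV` must be set by hand to whatever `env.CONDA_ENV` expands to in the workflow, and that env already exists with the build dependencies installed):

```sh
# Sketch of reproducing the pytest.yml/modin.yml build locally.
$CONDA/bin/conda run -n "$CONDA_ENV" sh -c \
  "cmake .. -DENABLE_CUDA=off -DENABLE_CONDA=on -DENABLE_PYTHON=on \
   -DCMAKE_BUILD_TYPE=release && make -j2 && make install"
```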

test.yml (also the test-docker and test-l0-docker jobs) runs the sanity tests through omniscidb/scripts/conda/test.sh, which uses get_cxx_include_path.sh and also sets some environment variables. A developer simply runs `make sanity_tests`.

test-l0-docker.yml runs the sanity tests through omniscidb/scripts/conda/intel-gpu-enabling-test.sh.
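Again, side by side (script paths as quoted above; invocation style is my assumption):

```sh
# CI sanity tests (test.yml / test-docker / test-l0-docker): wrapper script
# that pulls in get_cxx_include_path.sh and mutates the environment
bash omniscidb/scripts/conda/test.sh

# L0 variant (test-l0-docker.yml):
bash omniscidb/scripts/conda/intel-gpu-enabling-test.sh

# Developer:
make sanity_tests
```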

All this hiding makes failures reported by CI difficult to reproduce. In addition, the duplication (cache/build) means several places in the CI code must be updated to keep it consistent and to support new build and test features. A sketch of what consolidation could look like follows.
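As an illustration only (this file does not exist in the repo; names and flags are hypothetical): a single build entry point that both the CI workflows and developers call, so environment setup lives in exactly one place:

```sh
#!/bin/sh
# scripts/build.sh -- hypothetical single build entry point (illustrative,
# not an existing file in this repository).
set -e
: "${BUILD_TYPE:=release}"      # one visible place for configuration
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE="$BUILD_TYPE" "$@"
make -j"$(nproc)"
make install
```

The workflows and the README build instructions would then point at the same script, so a CI failure could be reproduced by running the identical command locally.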

Devjiu added the tests label Apr 4, 2023
@Garra1980
Contributor

Agreed; at the very least, the difference between the way we run the build in CI and the one described here - https://github.com/intel-ai/hdk#build - bothers me as well.

@alexbaden
Contributor

Is there a specific failure that has been hard to reproduce? Other than the conda-forge build problems or differences in packages across the CI environments we currently test in, I have not experienced a failure that can be linked to build differences between my environment and the CI.

@Devjiu
Contributor Author

Devjiu commented Apr 5, 2023

> Is there a specific failure that has been hard to reproduce? Other than the conda-forge build problems or differences in packages across the CI environments we currently test in, I have not experienced a failure that can be linked to build differences between my environment and the CI.

Currently the Docker build looks difficult to work with, in my view. But that issue can be resolved with docker.io, which is already on the todo list, so let's not count it.

My point is that there are a lot of hidden environment variable changes, so there are a lot of places where you have to look for the missing configuration in case of failure.
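For example, one quick way to surface those hidden changes is to diff the environment before and after one of the CI scripts runs (generic shell; this assumes the script can be sourced, which also triggers the build it wraps):

```sh
# Capture the environment, run the CI script in the current shell so its
# exports stick, then diff. Script path is the one from the workflows above.
env | sort > /tmp/env.before
. omniscidb/scripts/conda/build.sh
env | sort > /tmp/env.after
diff /tmp/env.before /tmp/env.after   # shows exactly what the script changed
```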

I will share build examples when I run into issues like this.

There are currently several places in the code that change the environment, and you point out that you have not faced issues. Does that mean that having multiple places to set up the environment is fine?
I can write code that works but is hard to maintain. If I write everything on one line in one file and you tell me it is difficult to work with, I could likewise say that I have no problems with it.

@Devjiu
Contributor Author

Devjiu commented Apr 11, 2023

One of the mysterious failures, which was fixed for an unknown reason:

  1. On Saturday (08.04.23) the nightly sanity tests failed in Docker with CUDA: https://github.com/intel-ai/hdk/actions/runs/4643017334/jobs/8217872122#step:8:4233
  2. On the same day the run was retriggered and passed: https://github.com/intel-ai/hdk/actions/runs/4643017334/jobs/8219242858

This means that our testing can produce false positives/negatives for unknown reasons.
[Upd] The issue was a race condition in some of the tests (something like JoinHashTable; it is not related to configuration and can be reproduced anywhere by increasing the number of threads).
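For anyone hitting it: a generic way to shake out a race like this in a gtest binary is to repeat and shuffle the tests (`--gtest_repeat` and `--gtest_shuffle` are standard gtest flags; the binary name and filter below are illustrative, not the actual target names):

```sh
# Rerun the suspected suite many times in random order to provoke the race.
./SomeTestsBinary --gtest_filter='*JoinHashTable*' \
                  --gtest_repeat=100 --gtest_shuffle
```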

@Devjiu
Contributor Author

Devjiu commented Apr 11, 2023

PR #369 improves reproducibility, since it is now possible to pull the Docker image (CUDA/L0) from https://hub.docker.com/r/dataved/build.cuda/tags.
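For example (the repository name comes from the link above; the tag is a placeholder, check the tags page for a current one):

```sh
# Pull the prebuilt CI image and get a shell in it; --gpus all assumes the
# NVIDIA container runtime is installed, for the CUDA case.
docker pull dataved/build.cuda:<tag>
docker run --rm -it --gpus all dataved/build.cuda:<tag> bash
```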
