ci : reduce (disable SYCL and CANN builds/releases) #23705
[no ci]
|
@arthw @hipudding PTAL - I am planning to disable the SYCL and CANN builds and releases until we provision more resources. @IMbackK In case you have some ideas about the ROCm/HIP builds: these will probably stay for now, but if the CI continues to be slow, we'll likely have to remove those too. Btw, I see that for ROCm we only create a Linux release and for HIP we only create a Windows release. Why is that? I.e. why not create both Linux and Windows releases for both? |
|
@ggerganov HIP is a programming language of which ROCm is an implementation. As for why the Linux build is called ROCm and the Windows build is called HIP, I have no idea. We should probably just call both of them ROCm, as we don't actually support running the HIP backend on platforms other than ROCm anymore (i.e. hip-cpu or hip-nvidia). I don't really have any ideas for reducing the CI impact of the HIP backend. Really we should be doing more builds of it, not less, as currently we don't build for all targets until release time, which has caused release-time build failures before, since we have plenty of ifdef'd code paths that are only compiled on specific targets. I don't see any way here other than acquiring more resources. |
|
Ok, thanks. I'll fix the naming to use ROCm for both. Edit: will keep the names for now. |
CISC
left a comment
We could also disable caching of the oneAPI toolkit at the risk of download failing...
|
We can try this from |
|
@ggerganov SYCL CI has been separated from build.yml, so PRs for other backends won't trigger the SYCL CI action. Lots of Windows users use the binary package directly. Thank you! |
|
Maybe we could reduce the CI workload. |
|
How long do the SYCL jobs take without the cache? |
|
About 20 minutes: most of the time is spent downloading oneAPI and installing it locally. |
|
Consider a PR that changes only CUDA code: #23349. In fact, only the CUDA jobs are useful there. Note that there is no SYCL job running for this PR. |
|
In the case of that PR, it's actually only the HIP jobs that are useful, not the CUDA ones, although telling whether a change affects the HIP backend, the CUDA backend, or both is beyond a CI script. |
* origin/master: (59 commits)
  ggml-zendnn : fixed naming of matmul function (ggml-org#20964)
  ci : do not allocate ccache for 3rd-party hosted runners (ggml-org#23730)
  ci : move [no release] check to dedicated check_release job (ggml-org#23734)
  ci : add `[no release]` keyword + fix sanitizer builds (ggml-org#23728)
  ci : move macos jobs to the apple workflow + fix names (ggml-org#23721)
  vulkan: optimize conv2d and implement coopmat1 support (ggml-org#22620)
  ci : remove vulkan SDK dep from webgpu job (ggml-org#23718)
  hexagon: add support for CONCAT op (ggml-org#23648)
  ci : move more CPU jobs to self-hosted runners (ggml-org#23715)
  ci : move sanitizer jobs to self-hosted runners (ggml-org#23713)
  ci : reduce (disable SYCL and CANN builds/releases) (ggml-org#23705)
  convert : support Gemma4ForCausalLM architecture (ggml-org#23682)
  models : Attach Mistral3 NVFP4 weight scales (ggml-org#23629)
  SYCL: implement ggml_sycl_pool_vmm (ggml-org#22862)
  tests: test-backend-ops -j <N> to run tests in parallel (ggml-org#23637)
  model : add support for talkie-1930-13b (ggml-org#22596)
  ggml-webgpu: Add MMVQ path for Q4/Q8/Q2_K/Q4_K and clean up legacy MUL_MAT pipeline (ggml-org#23594)
  [WebGPU] Check batch_compute_passes before sending passes when not doing GPU profiling (ggml-org#23457)
  CUDA: missing PDL sync for FWHT, better fallback (ggml-org#23690)
  metal : add apple device id (ggml-org#23566)
  ...
* ci : reduce [no ci]
* cont : disable sycl, cann + rename caches [no ci]
* cont : cann [no ci]
|
At Qt Creator we save the This bypasses the 10 GB cache that GitHub has, since you can have unlimited space for build artifacts 😅 See https://github.com/qt-creator/qt-creator/blob/master/.github/workflows/build_cmake.yml#L556 for details. The build artifacts are short lived and I think it doesn't affect GitHub's disk space that much. I think it won't be too hard for an LLM to convert the CMake code to something else used by llama.cpp's CI build yaml files. |
|
Please return SYCL. |
|
@ggerganov @cics Thank you! |
|
@arthw Would need to see the implementation and the performance to decide. You can give it a try in a fork and when you have something working I'll take a look. |
|
Thanks a lot, having the Windows SYCL builds back again would be awesome. |
|
@ggerganov Thank you! |
|
Will we get Linux SYCL back? |
Overview
I believe we are thrashing the GitHub Actions cache too much lately, which is causing slow CI overall. This PR aims to disable some of the builds with the goal of relieving some of the cache and runner pressure.
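To see which job families dominate the cache, the entries can be grouped by key prefix and compared against the 10GB limit. The following is only a rough sketch: it assumes entries shaped like the `actions_caches` array returned by the GitHub REST API (each with a `key` and `size_in_bytes`), and the sample keys and sizes are hypothetical, not the real llama.cpp cache contents.

```python
from collections import defaultdict

GIB = 1024 ** 3


def cache_usage_by_prefix(caches, prefixes):
    """Sum cache sizes (in bytes) per key prefix; unmatched keys go under 'other'."""
    totals = defaultdict(int)
    for entry in caches:
        key = entry["key"]
        # First matching prefix wins, so list more specific prefixes first.
        label = next((p for p in prefixes if key.startswith(p)), "other")
        totals[label] += entry["size_in_bytes"]
    return dict(totals)


def share_of_limit(totals, limit_bytes=10 * GIB):
    """Express each total as a fraction of the cache limit (10GB by default)."""
    return {label: size / limit_bytes for label, size in totals.items()}


# Hypothetical sample data; real entries would come from the caches API.
sample = [
    {"key": "cache-gha-sycl-windows", "size_in_bytes": 4 * GIB},
    {"key": "cache-gha-ccache-ubuntu", "size_in_bytes": 1 * GIB},
    {"key": "unprefixed-key", "size_in_bytes": GIB // 2},
]

totals = cache_usage_by_prefix(sample, ["cache-gha-sycl", "cache-gha-ccache"])
shares = share_of_limit(totals)

for label, fraction in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {fraction:.0%} of the limit")
```

A consistent key prefix is what makes this kind of grouping straightforward; keys without one end up in the `other` bucket.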
* SYCL builds alone consume more than 1/3 of the total 10GB cache that we have. I don't think it's reasonable, so disabling them for now. In order to re-enable, we have to provision dedicated runners.
* openEuler builds are not consuming cache which is good. However, they allocate slots from the GH hosted runners. I'd like to move these builds to dedicated runners too.
TODO
Additional information
Also prefixed the caches with `cache-gha-` to be able to search and match easily.
Requirements