Limit time to live for cache unavailability during ci command #201
dg0yt wants to merge 1 commit into microsoft:main
I think this is unlikely to change much about how the pool operates:
We do think this is a problem we would like to do better at; introducing randomized order may help but truly solving it is going to need more cross-node communication. (Or some more creative scheme like looking at what ports are edited with git rather than completing the whole ABI list) |
Yes, agreed. That's my "same set of unavailable ports". Focusing on vcpkg PRs, the first step should be to reduce the set of packages to build. |
Local simulation (cache poorly filled):
TRIPLET=x64-osx
REF=042e1db92d115819bba6bffd681a174543111139
git checkout $REF~1
./vcpkg ci --dry-run $TRIPLET | grep ":$TRIPLET: \*" | sed -e 's,^.* : \([a-f0-9]*\)$,/\1/ d,' > parent-changes
wc -l parent-changes
git checkout $REF
./vcpkg ci --dry-run $TRIPLET | grep ":$TRIPLET: \*" > head-changes
wc -l head-changes
cat head-changes | sed -f parent-changes | sed -e 's,^ *\([^:]*\).*,\1,' > install-list
wc -l install-list
./vcpkg install --dry-run @install-list | grep ' ->' > actually-installed
wc -l actually-installed
with the following output: showing the number of packages going down from 1629 to 545 (affected by PR) / 878 (totally installed). As expected with dependencies, this is still much more than the number of packages actually using vcpkg_cmake_config_fixup: This would also increase the robustness of PR CI results with regard to errors in unrelated ports. (Edit: Explicitly include the effect of |
You might find #167 useful, which will directly write out the ci check results to a json doc so you can use jq & friends |
Thanks. I saw that PR. Machine readable output is welcome. And I even looked at the jq intro. But is it installed on CI machines? But will the idea work: skip "unrelated" hashes for PR CI? |
I think the premise is good -- however for the actual implementation I think the best approach would be something like:
$ ./bootstrap-vcpkg
$ git checkout HEAD~1
$ ./vcpkg ci --output-hashes=parent_hashes.json --dry-run
$ git checkout HEAD@{1}
$ ./vcpkg ci --with-parent-hashes=parent_hashes.json
Then, the ci command can include the parent hashes in the delta calculation it's already performing w.r.t. the binary cache. We may want to modify dry-run to skip checking the binary cache in order to avoid hitting it twice. |
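A minimal sketch of the delta step described above, written as a standalone shell function rather than as part of the ci command. The input format (one "port abi-hash" pair per line) and the name `changed_ports` are assumptions made for illustration; they are not vcpkg output or the format of the proposed `parent_hashes.json`.

```shell
# Hypothetical input format (not vcpkg output): one "port abi-hash" pair per line.
# changed_ports PARENT_FILE HEAD_FILE
# Prints every port in HEAD_FILE whose hash does not occur in PARENT_FILE,
# i.e. the ports that are new or changed relative to the parent commit.
changed_ports() {
  awk '
    FILENAME == ARGV[1] { parent[$2] = 1; next }  # remember every parent hash
    !($2 in parent)     { print $1 }              # hash unknown to the parent
  ' "$1" "$2"
}
```

With a parent file containing `zlib aaa111` and `curl bbb222`, and a head file containing `zlib aaa111`, `curl ccc333` and `fmt ddd444`, the function prints `curl` and `fmt`. Ports whose hash is unchanged are skipped without consulting the binary cache at all.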
I see. My previous script ended with an install command, so it would hit the cache even three times. But we can leave the cache checking entirely to the install command: We only need the hashes from the ci command. Output: (This is a few ports more, just because I have them in cache which is now ignored for determining head-hashes.) And now correct me if I'm wrong: |
Closing. The parent-hashes idea is submitted in #210 now. |
Picking up an issue and an idea I discussed in a previous conversation with @BillyONeal:
Attaching time-to-live to the binary cache used during CI may allow rechecking availability without overloading the cache provider.
It is not a perfect solution. Without randomization in build order, parallel builds may run into the same set of unavailable ports with long build times.
This PR compiles, but wasn't tested or verified so far.
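To illustrate the time-to-live idea in isolation, here is a minimal shell sketch. It is not the code in this PR. The file name `negative-cache.txt`, the 600-second default, and the function names `record_miss` and `should_recheck` are all hypothetical.

```shell
# Hypothetical negative cache: one "abi-hash epoch-seconds" line per recorded miss.
NEG_CACHE_FILE=${NEG_CACHE_FILE:-negative-cache.txt}
TTL_SECONDS=${TTL_SECONDS:-600}   # assumed value, chosen only for illustration

# record_miss HASH NOW: remember that HASH was unavailable at time NOW.
record_miss() {
  printf '%s %s\n' "$1" "$2" >> "$NEG_CACHE_FILE"
}

# should_recheck HASH NOW: succeeds (exit 0) when the cache provider should be
# queried again, i.e. no miss is recorded or the latest miss is at least
# TTL_SECONDS old. Fails (exit 1) while the recorded miss is still fresh.
should_recheck() {
  last=
  [ -f "$NEG_CACHE_FILE" ] || return 0
  while read -r hash when; do
    if [ "$hash" = "$1" ]; then last=$when; fi   # keep the latest entry
  done < "$NEG_CACHE_FILE"
  if [ -z "$last" ]; then return 0; fi
  [ $(( $2 - last )) -ge "$TTL_SECONDS" ]
}
```

A caller would pass `$(date +%s)` as NOW and query the cache provider only when `should_recheck` succeeds. A short TTL rechecks availability more often, while a long one puts less load on the provider.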