
Allow failing substituters#8983

Closed
bouk wants to merge 1 commit into NixOS:master from bouk:bouk/try-all-subs

Conversation

@bouk
Member

@bouk bouk commented Sep 15, 2023

Motivation

Currently, any failing substituter causes Nix to fail a build. This change makes Nix try all substituters before failing, even when fallback = false.

@Ericson2314 @SuperSandro2000 this replaces #7188, and I really need this myself so I'll make sure to get it over the line.

I still need to work on a test, I think I can modify binary-cache.sh test to make this work.

Priorities

Add 👍 to pull requests you find important.

@bouk bouk requested a review from thufschmitt as a code owner September 15, 2023 08:18
@github-actions github-actions bot added the store Issues and pull requests concerning the Nix store label Sep 15, 2023
@bouk bouk changed the title from "Allow failing substitutors" to "Allow failing substituters" Sep 15, 2023
@bouk
Member Author

bouk commented Sep 15, 2023

Can someone help me with writing the test? I can't seem to figure out how to make the file substituter fail in the right way. If I duplicate the corrupt NAR test and add an extra substituter, it also succeeds on master.

@kristoff3r

> Can someone help me with writing the test? I can't seem to figure out how to make the file substituter fail in the right way. If I duplicate the corrupt NAR test and add an extra substituter, it also succeeds on master.

Can't you just add a substituter that doesn't exist as the first one? That's what's currently preventing me from updating one of my machines, which is how I found this PR xD

@thufschmitt
Member

@Ericson2314 sending that one over to you as you were trying to push the previous attempt forward

@kristoff3r

I tried making a test for this, but I had a hard time. I tried doing this:

clearStore
clearCacheCache

nix-instantiate --store "file://$cacheDir" dependencies.nix
nix-build --option fallback false --substituters "http://192.0.2.1 file://$cacheDir" --no-require-sigs dependencies.nix -o $TEST_ROOT/result

I ran into 2 different problems:

  • If I understand the manual correctly, nix-build should fail if fallback is false and the binary substitution fails. However, when I run the test above on master, it falls back to building the derivation:
    +(binary-cache.sh:293) nix-build --option fallback false --substituters 'http://192.0.2.1 file:///tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/binary-cache' --no-require-sigs dependencies.nix -o /tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/result
    warning: error: unable to download 'http://192.0.2.1/nix-cache-info': Couldn't connect to server (7); retrying in 271 ms
    warning: error: unable to download 'http://192.0.2.1/nix-cache-info': Couldn't connect to server (7); retrying in 644 ms
    warning: error: unable to download 'http://192.0.2.1/nix-cache-info': Couldn't connect to server (7); retrying in 1272 ms
    warning: error: unable to download 'http://192.0.2.1/nix-cache-info': Couldn't connect to server (7); retrying in 2345 ms
    warning: unable to download 'http://192.0.2.1/nix-cache-info': Couldn't connect to server (7)
    these 5 derivations will be built:
      /tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/2nwb5ydngd146v4mqvy3d6h097dhw4q1-dependencies-input-0.drv
      /tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/gg7vfl8n7d8hzb0xkz8qwvpxgfs4bys8-fod-input.drv
      /tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/ip6w7y72nff8chd4ck7n3n3b3m7rv9q3-dependencies-input-2.drv
      /tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/m3kizj7wn0x58ac05hvy0f5r94lrwbrj-dependencies-input-1.drv
      /tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/ja11p3wy8qsr2v68mfp6ga18vcz57wzr-dependencies-top.drv
    building '/tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/2nwb5ydngd146v4mqvy3d6h097dhw4q1-dependencies-input-0.drv'...
    building '/tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/m3kizj7wn0x58ac05hvy0f5r94lrwbrj-dependencies-input-1.drv'...
    building '/tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/ip6w7y72nff8chd4ck7n3n3b3m7rv9q3-dependencies-input-2.drv'...
    building '/tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/gg7vfl8n7d8hzb0xkz8qwvpxgfs4bys8-fod-input.drv'...
    
    building '/tmp/nix-shell.NhWcCH/nix-test/tests/binary-cache/store/ja11p3wy8qsr2v68mfp6ga18vcz57wzr-dependencies-top.drv'...

Is there a better way to force it to use substitution for a test like this?

  • When I ran the test with this PR applied, I got basically the same result (i.e. the first substituter fails, then a build is triggered), which suggests that either my test does something wrong or this PR doesn't actually fix the issue.
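
One option that might make a test like this conclusive is to disallow local builds, so that a path which cannot be substituted fails instead of being built. The sketch below is an untested suggestion, not something proposed in this thread: `build_substitute_only` and the `NIX_BUILD` override are hypothetical names, and whether this separates master from the PR in binary-cache.sh is an assumption. The only Nix behaviour it relies on is that `--max-jobs 0` disallows building on the local machine.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Nix test suite.
# NIX_BUILD may be overridden with a stub so the wrapper can be checked
# without Nix installed; by default the real nix-build is used.
build_substitute_only() {
  # --max-jobs 0 disallows local builds, so anything that cannot be
  # substituted should make the command fail rather than be built.
  "${NIX_BUILD:-nix-build}" --max-jobs 0 --option fallback false "$@"
}
```

In the test above, the call would replace the plain `nix-build` invocation while keeping the same `--substituters`, `--no-require-sigs` and `-o` arguments.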

@fricklerhandwerk fricklerhandwerk added the UX The way in which users interact with Nix. Higher level than UI. label Sep 21, 2023
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-09-22-nix-team-meeting-minutes-88/33343/1

@szlend
Member

szlend commented Dec 18, 2023

So if a single substituter is down and a derivation is not cached on any of the other substituters, it will still fail, right? Just making sure, because allowing failing substituters by falling back to building from source would break our workflow. We host the Nix binary cache behind a VPN, so if a substituter fails, it's a good reminder to reconnect to the VPN.
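
The behaviour being asked about can be written down as a small decision function. This is a toy model of the expectation in this comment, not Nix's implementation, and the thread does not confirm that the PR behaves this way. The function name `decide` and the `name:state` notation (with states `down`, `miss` and `hit`) are made up for illustration.

```shell
#!/usr/bin/env bash
# Toy model only: nothing here calls Nix. Each substituter is passed as
# "name:state"; the states (down, miss, hit) are hypothetical labels.

# decide FALLBACK SUBSTITUTER...
# Prints "substitute:<name>", "build" or "fail"; returns 1 on "fail".
decide() {
  local fallback=$1; shift
  local sub name state any_down=false
  for sub in "$@"; do
    name=${sub%%:*}    # text before the first colon
    state=${sub##*:}   # text after the last colon
    case $state in
      hit)
        echo "substitute:$name"
        return 0
        ;;
      down)
        any_down=true
        echo "warning: substituter '$name' is unreachable, trying the next one" >&2
        ;;
      miss) ;;         # reachable, but the path is not cached there
    esac
  done
  # No substituter had the path. Fail only if at least one of them could
  # not be asked and building from source was not explicitly allowed.
  if [ "$any_down" = true ] && [ "$fallback" != true ]; then
    echo fail
    return 1
  fi
  echo build
}
```

Under this model, `decide false vpn:down public:hit` prints `substitute:public`, while `decide false vpn:down public:miss` prints `fail` and returns a non-zero status, which is the "reminder to reconnect to the VPN" case. With `fallback` set to `true`, the same inputs print `build` instead.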

@bouk
Member Author

bouk commented Apr 22, 2024

I don't have time to continue on this unfortunately, so I will close it

@bouk bouk closed this Apr 22, 2024
Ericson2314 pushed a commit that referenced this pull request Sep 12, 2025
… are still enabled (#13301)

## Motivation

Nix currently hard fails if a substituter is inaccessible, even when there are other substituters available, unless `fallback = true`.
This breaks nix build, run, shell et al entirely. 
This would modify the default behaviour so that nix would actually use the other available substituters and not hard error.

Here is a before-and-after example using dotenv, where I manually stopped my own cache to trigger the issue. The initial error is really frustrating because there are other caches available.
![image](https://github.com/user-attachments/assets/b4aec474-52d1-497d-b4e8-6f5737d6acc7)
![image](https://github.com/user-attachments/assets/ee91fcd4-4a1a-4c33-bf88-3aee67ad3cc9)

## Context

#3514 (comment) is the earliest issue I could find, but there are many duplicates.

There is an initial PR at #7188, but it appears to have been abandoned: over 2 years with no activity, then a no-comment review in January. There was a subsequent PR at #8983, but it was closed without merging after over a year without activity.
<!-- Non-trivial change: Briefly outline the implementation strategy. -->
I have visualised the current and proposed flows. I believe my logic flows line up with what is suggested in #7188 (comment) but correct me if I am wrong.
Current behaviour:
![current](https://github.com/user-attachments/assets/d9501b34-274c-4eb3-88c3-9021a482e364)
Proposed behaviour:
![proposed](https://github.com/user-attachments/assets/8236e4f4-21ef-45d7-87e1-6c8d416e8c1c)

[Charts in lucid](https://lucid.app/lucidchart/1b51b08d-6c4f-40e0-bf54-480df322cccf/view)
<!-- Invasive change: Discuss alternative designs or approaches you considered. -->

Possible issues to think about:
- I could not figure out where the curl error is created, so I can't figure out how to swallow it and turn it into a warning or, better yet, a debug log.
- In contrast with the previous point, I'm not sure how verbose we want the warnings/traces to be. Personally, I think a warning that a substituter has been disabled (when that happens) and that the next one is being used is sufficient, but this is personal preference.
philipwilk added a commit to philipwilk/nix that referenced this pull request Sep 13, 2025
… are still enabled (NixOS#13301)

