Grin node 5.1.0 won't launch on Linux #3641
Comments
I just tried compiling from source on a Void Linux machine. Still unable to run the node - here's a sanitized verbose stacktrace (the compilation steps are truncated for brevity's sake):
|
The failing assert while constructing a new GlobalConfig is from here: Lines 345 to 348 in 9e27e6f
and related to the rewritten grin-server.toml configuration file. |
Did you set up the Grin server using
and started it as shown in
|
Yes, I have been using the same grin-server.toml essentially since the network launch. Specifying the config as such returns the same error I posted above:
|
Keep in mind there is a config change in |
Okay, prepending |
But to share my problems on On What I ended up doing was upgrading all nodes to latest Debian 10 (buster) from Debian 9 using:
Then I updated rust using
and compiled successfully on Debian 10 with:
current Debian 10:
|
Related: #3634 @trevyn Is there an edge case here where the config file migration is not behaving as expected? Are we handling the case where the config file is elsewhere? I wonder if we are potentially (re)writing in a different location. Edit: @phyro pointed out the actual replace logic is maybe a little brittle as it makes some assumptions about file contents -
We want to support minimal config files etc that may not necessarily have that comment in there. |
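A minimal sketch of a migration step that does not depend on any comment being present in the file. This is not Grin's actual migration code: the helper name `ensure_config_version` is hypothetical, and the only detail taken from this thread is the `config_file_version` key. It scans for the key itself and prepends it when missing, so a minimal config file is handled the same way as a fully commented one.

```rust
/// Hypothetical helper (not Grin's real migration logic).
/// Returns the config text with a top-level `config_file_version` key present.
fn ensure_config_version(contents: &str, version: u32) -> String {
    // Look for an uncommented `config_file_version = ...` line anywhere in the file.
    let has_key = contents.lines().any(|line| {
        let line = line.trim_start();
        !line.starts_with('#')
            && line
                .split('=')
                .next()
                .map(|key| key.trim() == "config_file_version")
                .unwrap_or(false)
    });

    if has_key {
        // Already migrated: leave the file untouched so the step is idempotent.
        contents.to_string()
    } else {
        // Prepend the key so it lands before any table header.
        format!("config_file_version = {}\n{}", version, contents)
    }
}
```

Calling it twice on the same text yields the same result, which matters if the node rewrites the file on every start.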
To clarify: adding the Following up on Debian 10, adding that changed nothing. Then, removing the old server toml, generating a new one via Please let me know if I can provide any further debug info. |
@nthrow grin-server.toml has two settings to control the level of logging: |
Same issue on Debian 11 (bullseye). I have "config_file_version = 2" line at the start of my grin-server.toml file. |
so we have an avx512 instruction?
|
It looks like this is because we use the "-march=native" flag when building croaring.
|
I cannot reproduce this issue. On debian 10.8, I built What version of |
I don't believe this is related to the Linux version or the Rust version. I think we are building an avx256 instruction into the binary (because croaring is built using the native arch, and the CI/CD build system used to produce the grin release binary supports that instruction), which is not supported by very old CPUs. It looks like we used to keep our own fork of croaring, and in that fork we fixed this issue: We de-forked it here: In the upstream croaring project I see:
so that should allow us to pass in the arch we want to target. |
So, what CPU is @nthrow using? And do we want to bother supporting systems with such old CPUs? |
I'm not sure that's the issue as I'm currently on an AMD Ryzen 5 PRO 4650U. |
My CPU is Intel i5-9400F |
Ideally, a binary would check if avx2 support is present before using those instructions. But if croaring is lacking the smarts to do that, then we should avoid avx2 altogether. There's still a significant number of CPUs without avx2 that I think we should support. |
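For reference, a sketch of what such a runtime check could look like on the Rust side, using the standard library's `is_x86_feature_detected!` macro. The function names `avx2_available` and `select_kernel` are made up for illustration, and this says nothing about what croaring itself does internally, since the AVX2 code in question lives in the C library.

```rust
/// Reports whether the CPU we are running on supports AVX2 (x86/x86_64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx2_available() -> bool {
    is_x86_feature_detected!("avx2")
}

/// On other architectures there is no AVX2, so always report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx2_available() -> bool {
    false
}

/// Hypothetical dispatch point: pick a code path once, based on the running CPU
/// rather than on the machine the binary was compiled on.
fn select_kernel() -> &'static str {
    if avx2_available() {
        "avx2"
    } else {
        "scalar"
    }
}
```

The point of the design is that the decision is made at run time, so one release binary can serve both old and new CPUs.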
Can confirm the binary isn't working on my i7 running Debian 10.9 in docker. But compiling from source works. I suspect a croaring issue. |
There may be more than one issue here? I found that the first time running the new binary on an old cpu gives the "illegal instruction" crash.
|
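To help tell the "illegal instruction" crash apart from any other issue, here is a small diagnostic sketch that checks whether the host CPU advertises a given feature flag. It assumes a Linux x86 system where `/proc/cpuinfo` contains a `flags` line; the helper names are hypothetical.

```rust
/// Returns true if any `flags` line in /proc/cpuinfo-style text lists `flag`.
fn cpuinfo_has_flag(cpuinfo: &str, flag: &str) -> bool {
    cpuinfo
        .lines()
        .filter_map(|line| line.split_once(':'))
        .filter(|(key, _)| key.trim() == "flags")
        .any(|(_, value)| value.split_whitespace().any(|f| f == flag))
}

/// Reads the real /proc/cpuinfo; returns None if the file cannot be read
/// (for example on a non-Linux system).
fn host_has_flag(flag: &str) -> Option<bool> {
    std::fs::read_to_string("/proc/cpuinfo")
        .ok()
        .map(|text| cpuinfo_has_flag(&text, flag))
}
```

If `host_has_flag("avx2")` returns `Some(false)` on a machine where the release binary crashes, that points toward the AVX2 explanation rather than a config problem.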
I think I read in the past that "march" isn't the only compiler flag that impacts which instructions are used. The optimization setting (-On) can also cause AVX2 instructions to be used. Here is a blurb that says something like that?
Note that it appears croaring uses -O3 ? I also see that croaring (under the rs wrapper) supports a flag for not using AVX:
We should try using those. |
Thanks for investigating, Blade. +1 on using ROARING_DISABLE_AVX. I guess that will add a -mno-avx flag to gcc. |
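A rough illustration of the general idea: a build script switching C compiler flags based on an environment variable. How the croaring wrapper actually reacts to `ROARING_DISABLE_AVX` is not shown in this thread, so the helpers below are hypothetical. The flag names are the ones mentioned in the comments above, and the `-mno-avx` mapping is only the guess made there.

```rust
/// Hypothetical: treat the variable as "set" if it exists at all.
fn avx_disabled_from_env() -> bool {
    std::env::var_os("ROARING_DISABLE_AVX").is_some()
}

/// Hypothetical flag selection for compiling a bundled C library.
/// This is not croaring's real build logic.
fn c_flags(disable_avx: bool) -> Vec<&'static str> {
    let mut flags = vec!["-O3"];
    if disable_avx {
        // Assumption from the thread: AVX is switched off with -mno-avx.
        flags.push("-mno-avx");
    } else {
        // Targets the build machine's CPU, which is what bit the release binary.
        flags.push("-march=native");
    }
    flags
}
```

A build script would then call `c_flags(avx_disabled_from_env())` and pass the result to the C compiler.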
Good catch @antiochp. This is now fixed by #3644. This binary should run fine on linux https://github.com/quentinlesceller/grin/releases/tag/v5.1.0-test3. |
Describe the bug
Grin node v5.1.0 will not launch on Debian Buster, returning "Illegal instruction" to stdout.
To Reproduce
Relevant Information
Desktop (please complete the following information):