{bio}[foss/2023b] GROMACS v2024.3 #21430
Conversation
@boegelbot please test @ jsc-zen3

@bedroge: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de, PR test command '…'
Test results coming soon (I hope)...
(Details: notification for comment with ID 2355567729 processed.)

Test report by @bedroge (edit: oops, forgot to include the fix from easybuilders/easybuild-easyblocks#3283, ran into that before...)
|
Test report by @boegelbot
|
@boegelbot please test @ generoso
|
@bedroge: Request for testing this PR well received on login1, PR test command '…'
Test results coming soon (I hope)...
(Details: notification for comment with ID 2355672238 processed.)
|
Test report by @boegelbot: test failure in GmxapiMpiTests. Let's try again...
|
@boegelbot please test @ generoso
|
@bedroge: Request for testing this PR well received on login1, PR test command '…'
Test results coming soon (I hope)...
(Details: notification for comment with ID 2355854315 processed.)
|
Test report by @bedroge
|
Test report by @boegelbot
|
Also tested this with the EESSI bot for a bunch of CPUs: EESSI/software-layer#709. There it also failed on haswell with the same input/output error, so I've started another build.
|
Test report by @boegel
GROMACS dev here. I see that the following test case fails, either timing out or somehow getting suspended or crashing. C-rescale is a relatively new implementation, and this test case is intended to exercise dark corners of the code, so a real problem is possible. Yet I see the preceding test case (at https://gist.github.com/boegel/75ff6503735f73f2d9ec570366bd181f#file-gromacs-2024-3-foss-2023b_partial-log-L374) took 25 seconds. On my x86 laptop with a …
@mabraham It's probably not the GROMACS configuration itself, but the environment it's running in: an interactive Slurm job, with 9 cores available (in a cgroup) out of 36 in total on that system. I've seen this before, but I never got to the bottom of it for GROMACS... If any of this rings a bell, any insights you may have are welcome.
|
The test cases are only using two pthreads, so if the system is working as you describe, there's no ready explanation for the problem. But if the core-to-cgroup mapping is not working right, such slowdowns are plausible. Do you have (or can you get) data to observe core occupancy across a loaded node?
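One quick way to gather the kind of data asked for above is to check, from inside the Slurm job, which cores the cgroup actually allows the process to run on versus what the node reports overall. This is just a minimal Linux-only sketch, not part of the PR or of GROMACS:

```python
# Minimal sketch: compare the cores this process is allowed to use (what
# matters inside a Slurm cgroup) against the cores the node reports overall.
# Linux-only, since it relies on sched_getaffinity.
import os

def affinity_report():
    """Return (sorted allowed core IDs, count of allowed cores, total cores)."""
    allowed = sorted(os.sched_getaffinity(0))  # cores the scheduler lets us run on
    total = os.cpu_count()                     # cores the node reports overall
    return allowed, len(allowed), total

if __name__ == "__main__":
    allowed, n_allowed, total = affinity_report()
    print(f"allowed cores: {allowed}")
    print(f"{n_allowed} of {total} cores usable")
```

Running this inside the 9-core job described above should show 9 allowed core IDs out of 36, and whether they are contiguous or spread across the node.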
|
Test report by @boegel
|
The system I was testing on has been migrated from RHEL 8.8 to RHEL 9.4 since the last time I tested (17 Sept '24), and the last attempt didn't fail (see test report above). One difference there is that this test was done on a full worker node (all 36 cores assigned to the Slurm job), so there's no cgroup effect here. I also did an …

I'm now retesting in a 9-core Slurm job, where the cgroup is set up such that the available cores are spread across the node. I didn't see any failing tests after an …

If I keep …

Seems to be the same …

That's not a total surprise though, we've seen other situations where setting …

So long story short: friends don't let friends set …

@mabraham Not sure if it makes sense to integrate the "unset …
|
It certainly makes sense for you to integrate that unset call in your runner. By default, GROMACS tries to respect existing thread-affinity settings, but if it detects none, it sets them itself. The main simulation engine has a command-line flag to control this behavior, but these test binaries just use the default. However, the default only checks GOMP_CPU_AFFINITY and none of the OMP_* variables, which looks like an omission.
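To make the omission concrete, a detection step along these lines would consider both GOMP_CPU_AFFINITY and the standard OMP_* affinity variables before deciding to pin threads. This is a hypothetical helper, not GROMACS code, and the exact variable list is an assumption:

```python
# Hypothetical helper (not GROMACS code): approximate the kind of check the
# comment above describes, looking at GOMP_CPU_AFFINITY as well as the OMP_*
# (and Intel KMP_*) affinity variables before pinning threads ourselves.
import os

AFFINITY_VARS = ("GOMP_CPU_AFFINITY", "OMP_PROC_BIND", "OMP_PLACES", "KMP_AFFINITY")

def external_affinity_set(env=os.environ):
    """Return the names of affinity-related variables that are set and non-empty."""
    return [v for v in AFFINITY_VARS if env.get(v)]

if __name__ == "__main__":
    found = external_affinity_set()
    if found:
        print(f"respecting external affinity settings: {found}")
    else:
        print("no external affinity settings detected; pinning threads ourselves")
```

With a check like this, a runner that sets (or forgets to unset) OMP_PROC_BIND would be detected just like one setting GOMP_CPU_AFFINITY.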
|
I made https://gitlab.com/gromacs/gromacs/-/issues/5170 to follow up.
|
Going in, thanks @bedroge! |
(created using eb --new-pr)

Compared to previous easyconfigs, this one installs the PyPI version of gmxapi. The versioning of the bundled gmxapi seems a bit confusing: https://gitlab.com/gromacs/gromacs/-/blob/v2024.3/python_packaging/gmxapi/pyproject.toml?ref_type=tags says 0.4.1, https://gitlab.com/gromacs/gromacs/-/blob/v2024.3/python_packaging/gmxapi/src/gmxapi/version.py?ref_type=tags shows 0.5.0a1, and the docs just recommend using the PyPI version (where the latest release is 0.4.2).