Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up tests #682

Closed
wants to merge 3 commits into from
Closed

Speed up tests #682

wants to merge 3 commits into from

Conversation

radimkrcmar
Copy link

Tests shouldn't rebuild riscv_model_RVxx when we have already compiled it.

Detailed motivation is in the third patch. First two make it possible.

Still prints the failure for each test, which is a pointless noise, but
some tools might depend on it.

Signed-off-by: Radim Krčmář <[email protected]>
We know the tests are going to fail and tooling should be fixed if it
depends on the total amount of tests.

Signed-off-by: Radim Krčmář <[email protected]>
There is a bug in the build process if we need to make clean between
builds.  We originally needed make clean to avoid running test with the
old binary, but that has been fixed.

Run the tests without wasting time by rebuilding the binaries.

The tests now run much faster, so maybe we can finally require
```git rebase -x './tests/run_tests.sh' origin/master```
before submitting patches and do it for pull requests.
(Testing for regressions is slow even after this if the patch series is
 also supposed to be properly split.)

Signed-off-by: Radim Krčmář <[email protected]>
@jrtc27
Copy link
Collaborator

jrtc27 commented Jan 11, 2025

Well, there are more options to the build than just ARCH (which you're right is captured in the path these days). If you drop the clean then you're relying on existing builds to not be tainted with such options, and could end up with a mix of tainted (ones you built) and untainted (ones you didn't already build) builds being tested. So I'm personally against this.

What I'd be ok with is providing an opt-in option to the script that lets you skip the cleans.

Also, re:

The tests now run much faster, so maybe we can finally require
git rebase -x './tests/run_tests.sh' origin/master
before submitting patches and do it for pull requests.
(Testing for regressions is slow even after this if the patch series is
also supposed to be properly split.)

  1. They're only faster if you've already built the model and haven't changed anything. In many cases it will be just as slow, CI included, where it starts from a clean working directory with no builds.
  2. CI is already running run_tests.sh on PRs. Not for every commit, just the final result (pretty common for CI systems to do); if you wanted to do that instead then again it's unlikely to be much faster because, by definition, each commit is making changes, normally to the Sail sources, and so will have to rebuild the model for each commit regardless of whether you first clean.

The first two commits seem ok at a glance.

@radimkrcmar
Copy link
Author

The build system should properly rebuild with respect to all inputs, be it file, command line argument, or environment variable, but I haven't tested all possible cross compiling and whatnot, so I'm fine with having this applied locally.

@jrtc27
Copy link
Collaborator

jrtc27 commented Jan 11, 2025

The build system should properly rebuild with respect to all inputs, be it file, command line argument, or environment variable, but I haven't tested all possible cross compiling and whatnot, so I'm fine with having this applied locally.

Like most pure Makefile build systems, it doesn't, because there's no split between configure and build. The in-progress transition to CMake will resolve this (by instead using the configuration options for all builds when testing, and thus ensuring consistency, albeit in a different way).

@radimkrcmar
Copy link
Author

Ok, I will revisit this after the CMake switch.

Below are the times for current version without patches:

% make clean; time ./test/run_tests.sh  > /dev/null; touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=8:37.58 k=4:18.98 r=10:34.07 (122%) mm=657M pf=3+175676739 io=0/3115136 cs=377749+94815
u=8:37.48 k=4:18.55 r=10:31.19 (122%) mm=657M pf=0+175673989 io=0/3114400 cs=377670+95349

r= real (wall) time, u/k= cpu time in userspace/kernel space, mm= maximum memory occupation, pf= page faults from disk+from memory, io= input/output operations, cs= voluntary context switches+preepmtion.

With this Pull request:

% make clean; time ./test/run_tests.sh  > /dev/null; touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=6:12.30 k=1:30.85 r=5:20.64 (144%) mm=657M pf=2+58488620 io=0/2921120 cs=118174+33557
u=5:16.85 k=23.627 r=3:18.39 (171%) mm=657M pf=0+10699280 io=0/2844512 cs=12541+9213

The difference between 10 and 5 minutes for the first build will be noticeable even for CI. Subsequent builds save another two minutes.

I also skip the rvfi builds while developing, because it never caught anything, to save another minute and half

% touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=2:46.04 k=15.526 r=1:53.87 (159%) mm=657M pf=1+6302054 io=0/1891232 cs=7780+7589

and run the tests in parallel, to allow testing between every change

% touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=3:07.49 k=23.249 r=1:29.58 (235%) mm=657M pf=29+6307195 io=0/1890944 cs=32321+24269

because if the simulators are already compiled, the tests take just 6 seconds:

% time ./test/run_tests.sh  > /dev/null
u=40.141 k=14.439 r=5.748 (949%) mm=24M pf=29+1971850 io=0/1023032 cs=28823+18998

@Timmmm
Copy link
Collaborator

Timmmm commented Jan 12, 2025

Yeah I don't think there's any need for anything beyond switching to CMake. That can run all the tests in something like 5 seconds - most of the time is compilation, and that is also done in parallel.

Minor caveat on the 5 seconds: that's how long it takes in our fork, for some reason in my CMake PR it is more like 40 seconds, which is weird. It is using a completely different emulator but I can't think of any reason why there should be such a speed difference. I need to investigate that.

@radimkrcmar
Copy link
Author

It's ~30s for me if I don't run the tests in parallel. I have local patch to parallelize RV32 tests, wait, and parallelize RV64 tests. It could be ~3s if tests from both architectures ran in parallel as well, but that would be way more work.

@jrtc27
Copy link
Collaborator

jrtc27 commented Jan 12, 2025

Ok, I will revisit this after the CMake switch.

Below are the times for current version without patches:

% make clean; time ./test/run_tests.sh  > /dev/null; touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=8:37.58 k=4:18.98 r=10:34.07 (122%) mm=657M pf=3+175676739 io=0/3115136 cs=377749+94815
u=8:37.48 k=4:18.55 r=10:31.19 (122%) mm=657M pf=0+175673989 io=0/3114400 cs=377670+95349

r= real (wall) time, u/k= cpu time in userspace/kernel space, mm= maximum memory occupation, pf= page faults from disk+from memory, io= input/output operations, cs= voluntary context switches+preepmtion.

With this Pull request:

% make clean; time ./test/run_tests.sh  > /dev/null; touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=6:12.30 k=1:30.85 r=5:20.64 (144%) mm=657M pf=2+58488620 io=0/2921120 cs=118174+33557
u=5:16.85 k=23.627 r=3:18.39 (171%) mm=657M pf=0+10699280 io=0/2844512 cs=12541+9213

The difference between 10 and 5 minutes for the first build will be noticeable even for CI. Subsequent builds save another two minutes.

I also skip the rvfi builds while developing, because it never caught anything, to save another minute and half

% touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=2:46.04 k=15.526 r=1:53.87 (159%) mm=657M pf=1+6302054 io=0/1891232 cs=7780+7589

and run the tests in parallel, to allow testing between every change

% touch model/riscv_sys_regs.sail; time ./test/run_tests.sh  > /dev/null
u=3:07.49 k=23.249 r=1:29.58 (235%) mm=657M pf=29+6307195 io=0/1890944 cs=32321+24269

because if the simulators are already compiled, the tests take just 6 seconds:

% time ./test/run_tests.sh  > /dev/null
u=40.141 k=14.439 r=5.748 (949%) mm=24M pf=29+1971850 io=0/1023032 cs=28823+18998

The time saving isn’t from making the first build faster. If you’ve just done make clean then of course it’s going to be exactly the same as if the script did (minus the negligible time make clean itself takes). The saving comes from not building the softfloat libs each time. I wouldn’t be against having an opt-in option to skip that during clean that all but the first make clean in the test script enables, as that would indeed speed up CI.

@Timmmm
Copy link
Collaborator

Timmmm commented Jan 12, 2025

It could be ~3s if tests from both architectures ran in parallel as well, but that would be way more work.

CMake does this, and also doesn't rebuild softfloat every time.

@radimkrcmar
Copy link
Author

Obsoleted by CMake rework.

@Timmmm
Copy link
Collaborator

Timmmm commented Jan 13, 2025

Minor caveat on the 5 seconds: that's how long it takes in our fork, for some reason in my CMake PR it is more like 40 seconds, which is weird.

Ah this was because I was accidentally not running the -v- tests which use virtual memory. Those are much slower than the other tests, so 40 seconds is expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants