Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate build from execution for integration tests #1352

Open
MiniaczQ opened this issue Aug 16, 2023 · 16 comments
Open

Separate build from execution for integration tests #1352

MiniaczQ opened this issue Aug 16, 2023 · 16 comments
Assignees

Comments

@MiniaczQ
Copy link
Contributor

Hello, I've got a question whether it's possible or maybe there exists a dedicated mechanic to building the instrumented binary and generating reports separately.

The use case would be integration tests where I'd like to separate building (requires cache) from running (requires special environment). It doesn't really matter for me if result interpretation is done directly after running or in a separate job, but this may be a good place to answer that too. :)

I've roughly looked over the available flags and tried to find issues, but nothing struck out.

@detly
Copy link

detly commented Aug 16, 2023

This would be good also just for the general practice of making sure you know what your runtime dependencies are vs. build time dependencies, without needing a separate set of build+test jobs in CI.

@xd009642
Copy link
Owner

So you could do:

cargo tarpaulin --no-run <OTHER ARGS>
// change args
cargo tarpaulin --skip-clean

And provided none of those args trigger a rebuild from build scripts etc, it should skip building again (this is at the compilers discretion not mine however). I'll look into doing a better workflow for this in some way though

@detly
Copy link

detly commented Aug 17, 2023

Thanks for that, I missed the significance of those in the docs. I thought they were more for "smoke" tests. I will give this a go. Note that the workflow is more like:

cargo tarpaulin --no-run <OTHER ARGS>
// - export from CI build container as artifacts
// - stop CI build container
// - start CI test container or physical test device
// - clone git repo into CI test container or device
// - copy artifacts to CI test container or device
cargo tarpaulin --skip-clean

I haven't confirmed this, but the things I am wary of are (a) could the updated timestamps on the source tree thwart the --skip-clean? and (b) will this run on a device that can't support the actual Rust toolchain? These are not tarpaulin specific though, but shared difficulties of trying to measure code coverage in a "realistic" deploy environment.

@xd009642
Copy link
Owner

so you still need cargo to get crate metadata so it does currently require the toolchain. I could potentially create some sort of recipe file describing the built artefacts with --no-run and then transfer the artefacts and the recipe and use an arg to get the things from that to solve the issue.

Tarpaulin does rely on having the source code available and the rust toolchain so it might take a bit of fiddling to get things to gel nicely. But I have no objections to working on adding this for the next release

@detly
Copy link

detly commented Aug 22, 2023

Don't take too much from my vague description there, I am still trying to make both ends of a CI pipeline meet in the middle so it's hard to describe any concrete issue I have at this point.

Fundamentally, though, I have two independent goals that relate to build/test separation:

  1. Ensuring that runtime dependencies are known. If tests are run in the same environment as the build, then it's easy to miss that eg. libblahblah5 must be installed, because you had blahblah3-dev installed in order to build. This is especially problematic if there's any kind of dynamic change in behaviour or feature detection at play; you need to run your tests to exercise those code paths. (It is also important going the other way - answering the question "what system dependencies can I remove?")
  2. Running tests on a target system that can't be used for building. I deploy things on embedded Linux systems with limited space and CPU power. I want to test on there too, and look at coverage and results from running those tests. I can circumvent the space issue to some extent by connecting USB or network storage, so there's room for generated coverage stats. I could even put the source tree in there too. But running cargo or rustup or rustc is not viable.

I've been messing around with cargo-llvm-cov and grcov and cargo-nextest, but have yet to actually implement anything that works. Honestly, ideally, all I want are two things:

  1. Build a binary that's instrumented to generate coverage data but works the same way as any other build of that binary. I think I get this out of the box purely from RUSTFLAGS='-C instrument-coverage', but I won't know until it all works.
  2. Documentation of how to merge that data from different runs: what files to copy, what command to run to merge them, how to deal with differing paths, etc. This is by far the hardest part. Nothing I've tried has actually worked so far.

At the moment I think tools like tarpaulin and cargo-llvm-cov operate at a higher level than this, which is fine. But I don't know enough to say whether it would be a better use of my time to hack on such tools to facilitate what I want, or to build up a system from the lower level elements. Hence the vagueness.

@xd009642
Copy link
Owner

Ah so for embedded targets I did explore this a bit in https://github.com/xd009642/llvm-embedded-coverage but that was more for embedded no-std targets. That said the binary bloat from llvm coverage ELF sections meant I was struggling to fit a hello world like test onto my target (it just about fit iirc). And the probe-rs libraries for sending data over JTAG for debug logs etc didn't seem to guarantee data would be sent/received which complicated getting the profiling stats out.

For those type of systems my line of thinking was using probe-run which is more like a breakpoint based coverage to insert breakpoints and step through the tests - no additional bloat to the binary beyond debug info and no-std targets are usually single threaded anyway. But I hadn't explored further because there hasn't been much demand.

If you wanted to hack on it RUSTFLAGS='-C instrument-coverage' does give you what you want. You can do what me/cargo-llvm-cov do and use cargo-metadata to get the list of binaries output for your test build command. When you run each one on your target it should generate a .profraw file in the root which you copy back to the host system. And then generally follow these steps https://doc.rust-lang.org/rustc/instrument-coverage.html

@detly
Copy link

detly commented Aug 22, 2023

Oh a bare-metal no-std deployment is (thankfully) way beyond what I want to measure coverage for. Think OpenWRT/router style platforms, that's closer to what I'm using.

After reading a tonne of llvm-cov docs and the chapter of the rustc book you linked, I think I can do something with a hybrid of Nextest's archive feature and grcov to simplify those two phases while keeping them independent.

@MiniaczQ
Copy link
Contributor Author

I've read through the rust article about coverage and some of the source code.
Would it be unreasonable to separate tarpaulin into lesser CLI apps under the cover?

A logical separation would have:

  • build app - it would require the full toolchain and would build all the test binaries,
  • demangle app - it would require profraws and files for demangling; it would require llvm tools, but I don't think Rust toolchain is required here,
  • format app - it would require the demangled output, besides that it's self sufficient

On the surface it can still be combined as tarpaulin, but the separate CLIs can also be available by themselves.

Custom test runners like nextest could then be inserted between building and demangling.
Going from integrating custom test runners to matching IO interfaces would be a massive simplification.

For custom Docker images, it'd be beneficial to have build and demangle as standalones, with format possibly expanding upon pycobertura image and running together.

Let me know how this sounds, there's no doubt I missed some details, but I think the general idea sounds promising.

@xd009642
Copy link
Owner

First note, point 2 doesn't require the llvm build tools if done via tarpaulin because we implemented our own profraw parsing to avoid the faff of installing extra binaries. I would probably just change that one into running tests.

There is a bit of faff with nextest I have to figure out, it can't work with the ptrace backend and llvm coverage has issues that stop it working for certain project types. For a start building things and packaging up the test executable and a serialized json or something to give some metadata about the tarpaulin settings etc and then giving tarpaulin a way to take that and run it all would be maybe simpler to implement and get in

@MiniaczQ
Copy link
Contributor Author

If profraw parsing is integrated then yeah, step 2 could be inside step 3 and step 2 would run tests.

Later today I'll try to follow the logic through source code and see where each stage meets

@MiniaczQ
Copy link
Contributor Author

Is there any particular reason why the args use a builder pattern instead of derives?
I can see that this 70-line monster was built over 6 years, so maybe it's just w/e was available back then.

Separating the args would be a good first step for me to try and figure this out.
Struct based args would also remove the temporal coupling. In general much cleaner transition from possible input to a config class.

@xd009642
Copy link
Owner

It was initially written before clap_derive existed - and also before structopt had it's first proper release. So largely an act of legacy.

When clap 3 came out I tried to move over to it but I found areas where it seemed I couldn't just simply derive a struct with the same behaviour without changing my config struct. And my config struct is used for serializing config files so there's an element of backwards compatibility that's caused inertia in that respect.

I will happily accept a PR which moves it to modern clap provided it doesn't break anything for existing users - but I've largely prioritised other features or upgrades (the syn 2 upgrade is one I've been struggling on which would be a bigger win for perf/compile-time)

@jlevon
Copy link

jlevon commented Sep 27, 2023

So you could do:

cargo tarpaulin --no-run <OTHER ARGS>
// change args
cargo tarpaulin --skip-clean

And provided none of those args trigger a rebuild from build scripts etc, it should skip building again (this is at the compilers discretion not mine however). I'll look into doing a better workflow for this in some way though

I'm confused by what "change args" means here. What I need to do in order to use this is basically: build the rust daemon, run some python tests that use that built daemon, report coverage. Ideally be able to combine with "cargo test". I've tried lots of variants of this but it appears cargo tarpaulin only works by invoking the target itself?

@MiniaczQ
Copy link
Contributor Author

I think an example problem case would be:

cargo tarpaulin --no-run --bin project1
# [...]
cargo tarpaulin --skip-clean --bin project2

You can mismatch many more options but simply using the wrong binary would fail.

What I believe xd009642 wants is to prevent that, most likely by providing some sort of test 'descriptor' that gets generated during the build and will provide subsequent steps with configuration that matches the build step.
I'd probably lean more towards validation rather than provision, since you can build all your test binaries at once and then separate their usage.

I'll try to find some time on the weekend to split up configuration for each step, since we need to know the common elements between each stage.

@jlevon
Copy link

jlevon commented Sep 29, 2023

Oh, OK, so that comment was about what could go wrong, not a recipe I could follow to do what I want. I think you're saying my use case is not possible with tarpaulin: that is, it's not possible to run a built binary separatelyand subsequently collect coverage from that run with tarpaulin.

@MiniaczQ
Copy link
Contributor Author

Not right now at least, this issue is about allowing that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants