-
-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add snapshot testing with insta
#846
Conversation
Cargo.toml
Outdated
@@ -50,6 +50,8 @@ anstream = "0.6.13" | |||
assert_cmd = "2.0" | |||
similar-asserts = { version = "1.5.0", features = ["serde"] } | |||
predicates = "3.1.0" | |||
insta = { version = "1.39.0", features = ["yaml"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I somewhat prefer ron
over yaml
since it's more explicit and ergonomic around Rust (ADT-style) enums. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine with me, I was going off of the docs:
This serializer is recommended because YAML is human readable and excellent at diffing because it is line based.
I think ron would work (is it more source control friendly?) as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, let's pick whatever looks the best for the job. I love that you opened a PR with a specific test case where we can look at the exact generated output and decide based on that.
tests/snapshot_tests.rs
Outdated
@@ -0,0 +1,75 @@ | |||
//! Snapshot tests of `cargo semver-checks` runs to ensure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we get away with keeping this inside a #[cfg(test)] mod tests
within the crate itself, instead of as a separate module inside the top-level tests
?
There's a substantial perf and ergonomics difference between the two: the former get compiled as part of the crate itself whereas the latter get compiled as a separate crate and are much slower to run and iterate on in my experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "this" do you mean the tests themselves or just the infrastructure for snapshot testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole thing. If possible, ideally we wouldn't add anything to the top-level tests/
at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can try but I think it depends on having the cargo-semver-checks
binary already built, so I don't know if it would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we could attach to "the layer underneath" then the only thing we lose out on testing is making sure clap correctly parsed our arguments. That doesn't seem that valuable by comparison, especially if we get much better test perf and iteration speed.
If you don't think it's possible, I think this is fine to merge too. But it's worth a look at whether it's possible, just because the iteration speed would be different by multiples, not percentages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was wrong - it seems to work anyway, so I moved it to lib tests. Not sure if it will still be slower because it needs to build the binary? but if so I can do like you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems highly suspect.
insta-cmd
's get_cargo_bin
implementation uses the CARGO_BIN_EXE_<name>
env var which is only set while building the binary or an integration test. I suspect the tests are falling through to the insta-cmd
fallback path and reusing a prior (cached) binary which wouldn't be available in a clean cargo test
run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that looks right, I deleted target/debug/cargo-semver-checks
and it no longer works. I think we can't use insta_cmd
then and we have to make our own diff structure using Check::check_release
(but that may be better in the long run with the speedups and new API). Does that sound like a goal?
Yeah, fair enough.
Better to have good integration tests now than perfect tests much later. We can refactor and improve things down the line.
Thanks for checking and making sure!
|
Reverted for now. I can try to work on a harness to lower it to lib tests, if it seems too complex maybe we can push that back for later. |
The harness might not be worth it right now, so feel free to leave it for later if it isn't doable quickly and easily. |
This is a rough draft of the harness: snapshot-lib-tests |
That looks pretty decent, nice work! How do you feel about it? |
I think it's mostly good, with a couple of disadvantages, but I think the speedup in development probably outweighs those. Things to consider:
I think the benefits probably outweigh the disadvantages though, so if you agree, I can port over the changes to this PR and move it to a lib-test. |
One option to address some of those disadvantages is to make
This is how Trustfall's own tests work as well. The idea there is that we write a query string, then record all intermediate results: the parse tree, the compiler's intermediate representation (IR), the execution steps ("trace"), as well as the ultimate outputs of the query. Each test stage then takes an input (either the query string, or the output of a prior stage), runs the code under test, and checks that the produced output matches the expected (snapshotted) value. This is a pretty nice pattern, because it would make it easy to write new tests while also eliminating the risk that we misconfigure the |
That's a good idea! I'm not sure how that would look in implementation, though, since we are testing the |
Not sure what you mean, could you elaborate?
Message ID: ***@***.***
com>
What I was proposing might require moving a few types from the main.rs to
the lib.rs, which should be fine to do.
|
Oh, if we moved the clap structs to |
Yeah, let's do that for now.
Eventually we might want one more lib in between, but that's a later
problem :)
|
Actually, do you know if the test speedup from lib tests also applies to binary tests? (that is, |
I'm not sure I've ever tried putting tests in main.rs files. Try it and
see if it works — I trust your judgment!
|
Binary tests (timings from dev-desktop):
Library tests:
It looks like the timings are the same for library changes and test file changes, but for binary tests, we do have to rebuild the tests if the binary-side also changes (in main.rs). The tradeoff for not having to spend that time is that we would have to expose our CLI types in the public API. (and I think we change the binary side less frequently than the library side) What do you think? |
43a4760
to
8126faa
Compare
8126faa
to
77c38b3
Compare
@@ -33,7 +34,7 @@ pub use query::{ | |||
|
|||
/// Test a release for semver violations. | |||
#[non_exhaustive] | |||
#[derive(Debug, PartialEq, Eq)] | |||
#[derive(Debug, PartialEq, Eq, Serialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One drawback to bin tests is that we need Serialize
even in a non #[cfg(test)]
environment. I don't know how big a deal that is, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that because Serialize
is used by cargo-insta
outside of cfg(test)
somehow? Or is something else going on that I couldn't spot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're running the tests in the binary part of the crate, the library side is not compiled with #[cfg(test)]
(when it's used as a dependency for this part), so it needs to be unconditionally Serialize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really nice, well done! 🙌
Just a couple of minor questions, mostly from my unfamiliarity with insta
, and I'm happy to merge this.
@@ -33,7 +34,7 @@ pub use query::{ | |||
|
|||
/// Test a release for semver violations. | |||
#[non_exhaustive] | |||
#[derive(Debug, PartialEq, Eq)] | |||
#[derive(Debug, PartialEq, Eq, Serialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that because Serialize
is used by cargo-insta
outside of cfg(test)
somehow? Or is something else going on that I couldn't spot?
//! Snapshot tests of `cargo semver-checks` runs to ensure | ||
//! that we define how we handle edge cases. | ||
//! | ||
//! # Updating test output | ||
//! | ||
//! If you introduce changes into `cargo-semver-checks` that modify its behavior | ||
//! so that these tests fail, the snapshot may need to be updated. After you've | ||
//! determined that the new behavior is not a regression and the test should | ||
//! be updated, run the following: | ||
//! | ||
//! `$ cargo insta review` (you may need to `cargo install cargo-insta`) | ||
//! | ||
//! If the changes are intended, to update the test, accept the new output | ||
//! in the `cargo insta review` CLI. Make sure to commit the | ||
//! `test_outputs/snapshot_tests/{name}.snap` file in your PR. | ||
//! | ||
//! We check **multiple stages** of the `cargo-semver-checks` execution of a given command | ||
//! (currently two: the parsed input [`Check`](cargo_semver_checks::Check) and the output | ||
//! of running [`check_release`](cargo_semver_checks::Check::check_release)). This means | ||
//! you may have to run `cargo test` and `cargo insta review` multiple times when adding or | ||
//! updating a new test if both the input and output change. | ||
//! | ||
//! Alternatively, if you can't use `cargo-insta`, review the changed files | ||
//! in the `test_outputs/snapshot_test/ directory by moving `{name}.snap.new` to | ||
//! `{name}.snap` to update the snapshot. To update all changed tests, | ||
//! run `INSTA_UPDATE=always cargo test --bin cargo-semver-checks snapshot_tests` | ||
//! | ||
//! # Adding a new test | ||
//! | ||
//! To add a new test, typically you will want to use the [`assert_integration_test`] helper function | ||
//! with a string invocation of `cargo semver-checks ...` (typically including the `--manifest-path` | ||
//! and `--baseline-root` arguments to specify the location of the current/baseline test crate path: | ||
//! in the `test_crates` directory if the test can use cached generated rustdoc files, or in the | ||
//! `test_crates/manifest_tests` directory if the test relies on `Cargo.toml` manifest files). | ||
//! Add a new function marked `#[test]` that calls [`assert_integration_test`] with | ||
//! the prefix (usually the function name) and the arguments to `cargo semver-checks` | ||
//! | ||
//! Then run `cargo test --bin cargo-semver-checks snapshot_tests`. The new test should fail, as | ||
//! there is no snapshot to compare to. Review the output with `cargo insta review`, | ||
//! and accept it when the captured behavior is correct. (see above if you can't use | ||
//! `cargo-insta`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These docs are great! 💯
Co-authored-by: Predrag Gruevski <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the merge conflict I caused with another PR, and this is good to go!
Sets up infrastructure for snapshot testing with
insta
. Example coming in subsequent PR.