Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add snapshot testing with insta #846

Merged
merged 11 commits into from
Aug 15, 2024
Merged

Conversation

suaviloquence
Copy link
Contributor

Sets up infrastructure for snapshot testing with insta. Example coming in subsequent PR.

Cargo.toml Outdated
@@ -50,6 +50,8 @@ anstream = "0.6.13"
assert_cmd = "2.0"
similar-asserts = { version = "1.5.0", features = ["serde"] }
predicates = "3.1.0"
insta = { version = "1.39.0", features = ["yaml"] }
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I somewhat prefer ron over yaml since it's more explicit and ergonomic around Rust (ADT-style) enums. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine with me, I was going off of the docs:

This serializer is recommended because YAML is human readable and excellent at diffing because it is line based.

I think ron would work (is it more source control friendly?) as well.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultimately, let's pick whatever looks the best for the job. I love that you opened a PR with a specific test case where we can look at the exact generated output and decide based on that.

@@ -0,0 +1,75 @@
//! Snapshot tests of `cargo semver-checks` runs to ensure
Copy link
Owner

@obi1kenobi obi1kenobi Aug 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get away with keeping this inside a #[cfg(test)] mod tests within the crate itself, instead of as a separate module inside the top-level tests?

There's a substantial perf and ergonomics difference between the two: the former get compiled as part of the crate itself whereas the latter get compiled as a separate crate and are much slower to run and iterate on in my experience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "this" do you mean the tests themselves or just the infrastructure for snapshot testing?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole thing. If possible, ideally we wouldn't add anything to the top-level tests/ at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try but I think it depends on having the cargo-semver-checks binary already built, so I don't know if it would work.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could attach to "the layer underneath" then the only thing we lose out on testing is making sure clap correctly parsed our arguments. That doesn't seem that valuable by comparison, especially if we get much better test perf and iteration speed.

If you don't think it's possible, I think this is fine to merge too. But it's worth a look at whether it's possible, just because the iteration speed would be different by multiples, not percentages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was wrong - it seems to work anyway, so I moved it to lib tests. Not sure if it will still be slower because it needs to build the binary? but if so I can do like you suggest.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems highly suspect.

insta-cmd's get_cargo_bin implementation uses the CARGO_BIN_EXE_<name> env var which is only set while building the binary or an integration test. I suspect the tests are falling through to the insta-cmd fallback path and reusing a prior (cached) binary which wouldn't be available in a clean cargo test run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that looks right, I deleted target/debug/cargo-semver-checks and it no longer works. I think we can't use insta_cmd then and we have to make our own diff structure using Check::check_release (but that may be better in the long run with the speedups and new API). Does that sound like a goal?

@obi1kenobi
Copy link
Owner

obi1kenobi commented Aug 3, 2024 via email

@suaviloquence
Copy link
Contributor Author

Reverted for now. I can try to work on a harness to lower it to lib tests, if it seems too complex maybe we can push that back for later.

@obi1kenobi
Copy link
Owner

The harness might not be worth it right now, so feel free to leave it for later if it isn't doable quickly and easily.

@suaviloquence
Copy link
Contributor Author

This is a rough draft of the harness: snapshot-lib-tests

@obi1kenobi
Copy link
Owner

That looks pretty decent, nice work! How do you feel about it?

@suaviloquence
Copy link
Contributor Author

I think it's mostly good, with a couple of disadvantages, but I think the speedup in development probably outweighs those. Things to consider:

I think the benefits probably outweigh the disadvantages though, so if you agree, I can port over the changes to this PR and move it to a lib-test.

@obi1kenobi
Copy link
Owner

One option to address some of those disadvantages is to make Check serializable (at least in cfg(test)), then have two tests:

  • An integration test that checks that a particular cargo semver-checks invocation produces the expected Check value.
  • A lib test that checks the outcome of running a Check order to completion on a given workspace.

This is how Trustfall's own tests work as well. The idea there is that we write a query string, then record all intermediate results: the parse tree, the compiler's intermediate representation (IR), the execution steps ("trace"), as well as the ultimate outputs of the query. Each test stage then takes an input (either the query string, or the output of a prior stage), runs the code under test, and checks that the produced output matches the expected (snapshotted) value.

This is a pretty nice pattern, because it would make it easy to write new tests while also eliminating the risk that we misconfigure the Check in a way that doesn't match real life. But it's definitely a bit of work to set up, so your call on whether you want to go all the way with the extra effort or save it for a future iteration.

@suaviloquence
Copy link
Contributor Author

That's a good idea! I'm not sure how that would look in implementation, though, since we are testing the Check serialization on the library side and the integration/CLI tests on the binary side. The only way I can think to do that right now is to duplicate the Check in both parts, and I'm not sure if insta could help with that. Is that what you were thinking of or is there a different way to do this?

@obi1kenobi
Copy link
Owner

obi1kenobi commented Aug 3, 2024 via email

@suaviloquence
Copy link
Contributor Author

Oh, if we moved the clap structs to lib.rs (and the Into<Check> impl), that would work. Is that what you meant?

@obi1kenobi
Copy link
Owner

obi1kenobi commented Aug 4, 2024 via email

@suaviloquence
Copy link
Contributor Author

Actually, do you know if the test speedup from lib tests also applies to binary tests? (that is, #[tests] inside the main.rs (or linked modules), not integration tests in the tests folder). If it does, I think it would be better to put them there. If not, we have to make the CLI part of the library's public API (or #[doc(hide)] it at least), which seems like not the best solution (but it could work).

@obi1kenobi
Copy link
Owner

obi1kenobi commented Aug 4, 2024 via email

@suaviloquence
Copy link
Contributor Author

suaviloquence commented Aug 4, 2024

Binary tests (timings from dev-desktop):

  • rebuild tests after library change: 13.171s
  • rebuild tests after binary change (in main.rs): 10.690s
  • rebuild tests after change in test file: 12.512s

Library tests:

  • rebuild tests after library change: 12.832s
  • rebuild tests after change in main.rs: 0s (no need to rebuild)
  • rebuild tests after change in test file: 13.242s

It looks like the timings are the same for library changes and test file changes, but for binary tests, we do have to rebuild the tests if the binary-side also changes (in main.rs).

The tradeoff for not having to spend that time is that we would have to expose our CLI types in the public API. (and I think we change the binary side less frequently than the library side) What do you think?

@@ -33,7 +34,7 @@ pub use query::{

/// Test a release for semver violations.
#[non_exhaustive]
#[derive(Debug, PartialEq, Eq)]
#[derive(Debug, PartialEq, Eq, Serialize)]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One drawback to bin tests is that we need Serialize even in a non #[cfg(test)] environment. I don't know how big a deal that is, though.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that because Serialize is used by cargo-insta outside of cfg(test) somehow? Or is something else going on that I couldn't spot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're running the tests in the binary part of the crate, the library side is not compiled with #[cfg(test)] (when it's used as a dependency for this part), so it needs to be unconditionally Serialize.

Copy link
Owner

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really nice, well done! 🙌

Just a couple of minor questions, mostly from my unfamiliarity with insta, and I'm happy to merge this.

src/snapshot_tests.rs Outdated Show resolved Hide resolved
src/snapshot_tests.rs Show resolved Hide resolved
@@ -33,7 +34,7 @@ pub use query::{

/// Test a release for semver violations.
#[non_exhaustive]
#[derive(Debug, PartialEq, Eq)]
#[derive(Debug, PartialEq, Eq, Serialize)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that because Serialize is used by cargo-insta outside of cfg(test) somehow? Or is something else going on that I couldn't spot?

Comment on lines +1 to +41
//! Snapshot tests of `cargo semver-checks` runs to ensure
//! that we define how we handle edge cases.
//!
//! # Updating test output
//!
//! If you introduce changes into `cargo-semver-checks` that modify its behavior
//! so that these tests fail, the snapshot may need to be updated. After you've
//! determined that the new behavior is not a regression and the test should
//! be updated, run the following:
//!
//! `$ cargo insta review` (you may need to `cargo install cargo-insta`)
//!
//! If the changes are intended, to update the test, accept the new output
//! in the `cargo insta review` CLI. Make sure to commit the
//! `test_outputs/snapshot_tests/{name}.snap` file in your PR.
//!
//! We check **multiple stages** of the `cargo-semver-checks` execution of a given command
//! (currently two: the parsed input [`Check`](cargo_semver_checks::Check) and the output
//! of running [`check_release`](cargo_semver_checks::Check::check_release)). This means
//! you may have to run `cargo test` and `cargo insta review` multiple times when adding or
//! updating a new test if both the input and output change.
//!
//! Alternatively, if you can't use `cargo-insta`, review the changed files
//! in the `test_outputs/snapshot_test/ directory by moving `{name}.snap.new` to
//! `{name}.snap` to update the snapshot. To update all changed tests,
//! run `INSTA_UPDATE=always cargo test --bin cargo-semver-checks snapshot_tests`
//!
//! # Adding a new test
//!
//! To add a new test, typically you will want to use the [`assert_integration_test`] helper function
//! with a string invocation of `cargo semver-checks ...` (typically including the `--manifest-path`
//! and `--baseline-root` arguments to specify the location of the current/baseline test crate path:
//! in the `test_crates` directory if the test can use cached generated rustdoc files, or in the
//! `test_crates/manifest_tests` directory if the test relies on `Cargo.toml` manifest files).
//! Add a new function marked `#[test]` that calls [`assert_integration_test`] with
//! the prefix (usually the function name) and the arguments to `cargo semver-checks`
//!
//! Then run `cargo test --bin cargo-semver-checks snapshot_tests`. The new test should fail, as
//! there is no snapshot to compare to. Review the output with `cargo insta review`,
//! and accept it when the captured behavior is correct. (see above if you can't use
//! `cargo-insta`)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These docs are great! 💯

@obi1kenobi obi1kenobi enabled auto-merge (squash) August 15, 2024 20:25
Copy link
Owner

@obi1kenobi obi1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the merge conflict I caused with another PR, and this is good to go!

src/snapshot_tests.rs Outdated Show resolved Hide resolved
@obi1kenobi obi1kenobi merged commit 329c42f into obi1kenobi:main Aug 15, 2024
33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants