-
Notifications
You must be signed in to change notification settings - Fork 29.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Testing in VS Code #107467
Comments
👋 I'm the maintainer of the Ruby Test Adapter extension and just wanted to drop in and give a bit of context on the pain points I've had with the Test Adapter API. The Adapter API works well for the most part, but the biggest problem I've had with it is getting it work well with large codebases. I developed the Ruby adapter with my Rails app vglist in mind, and at the time it had around 300-400 tests in total (it now has 700-800). My adapter worked fine with that, it took a few seconds (maybe 5-10) to reload all the test suite data whenever a change was made to the test code. However, even 5-10 seconds is annoying to wait for when you just want to run your tests right after you modify them. And now I work at a company with a Rails app that has over 12,000 RSpec tests, so my extension now takes about 45 seconds to reload the tests whenever I change them (and it'd be even longer if I didn't have a pretty well-spec'd laptop). I don't ever really use my test adapter with that codebase because it slows me down far too much. The problem is that the API doesn't support modifying the existing set of tests, only replacing all of them at once. Which requires RSpec to generate information for every test in the codebase whenever it reloads the tests. I've considered a workaround where I'd cache the information in a tempfile or something, and have rspec only load information about changed files, and then modify the cached tree and send it up to the explorer UI, but that's complex and I haven't gotten around to actually doing it. There are other problems - which I'm not really sure how to solve - with being unable to run the tests while it's loading the test suite because it doesn't handle the tests that pass while it's loading (the data can theoretically get out-of-sync if you, say, change a bunch of things between reloads, but >99% of the tests in the codebase will be exactly the same before and after any given change I make to the suite). So it's maybe worth considering that when designing the invalidation logic. There are also problems with being able to view test logs to figure out issues when tests fail, but that's more of a problem with how the RSpec framework works than with anything the Test Adapter API does. The current API Sketch looks quite good from what I'm seeing. I especially like the use of a cancellation token, which if I understand correctly seems like a much cleaner solution than the Adapter API where you just blindly attempt to kill any relevant running processes. There are some things missing there (e.g. how do I handle the case where the suite is unloadable because the user has created an invalid test suite via some syntax error?), but overall it looks like it takes a lot of the good decisions from the Adapter extension API. Anyway, I hope this makes some sense. I understand it's probably not a super cohesive comment since I wrote it on-the-fly, but I've been hoping something like this would be introduced into the core of VS Code eventually, so I wanted to make sure I got my two cents in :) |
Thank you for that feedback, it's super helpful! Scoping tests and handling large codebases is definitely something we want to tackle from the outset. Like all features we build, we will be dogfooding it in the VS Code repo, so it is a priority to be able to handle that. As you saw in the sketch, I think the collection approach solves the pain points around having to replace all tests. The idea of running tests during load is interesting. The base case of running all tests is pretty simple from the API perspective -- we would just One scenario that isn't covered is running a subset of tests as discovery is happening (e.g. if I wanted to run all tests with the word "foo" in them) without starting a new test run. This would take some finessing and will not be supported by all adapters, for instance Go could never support it without changes to its toolchain, so is probably not worth designing for...
I think the right thing to do would be for the test extension to emit a diagnostic for that using our existing API. I've added it to the open questions as well though for discussion. |
@connor4312 generally speaking I'd want to give the user an error notification so they know something has broken and their suite failed to load, rather than creating a diagnostic. But as long as the extension can choose what to do if that occurs, I suppose the Test API in VS Code doesn't need to care about that. |
When I say diagnostic I mean the "Diagnostic" in the VS Code API that appears in the problems view (and can also have an error squiggle in the editor) The downside there is that if they have a syntax error, having the problem be duplicated by both the language server and test extension will be some ugly clutter. |
The reason I wouldn't use a diagnostic is that 1) I personally rarely ever look at diagnostics/errors in the diagnostics view because the signal to noise ratio in my experience is so high, and 2) there are situations where a diagnostic wouldn't really make sense, for example if the user's tests fail to load because their local Postgres server isn't running or because they forgot to install their Ruby gems so the RSpec gem isn't even available. |
Some thoughts from my side (I'm the author of the Test Explorer UI extension):
|
Thanks for the mention @hbenl. Looks good to me, I also agree with @connorshea about the notificantion in case something is broken. I had that a lot and having a possibility to tell the users what went wrong might be good. One thing that I was missing, and I am not sure if it possible, is to match my tests with the source files. My adapter runs C code and I can't simply grep all files for the function names. I could do that if the user registered the toolchain somehow but that would require a lot of setup for the user to make prior to testing. So it would be nice to somehow get cross information from the language server and check which files in my project are actually test files. Not sure if that is meant by: |
One more thought: the central difference between tests and suites in Test Explorer is that the state of a suite is always computed from the states of its children (with one exception: a suite can be marked as errored). |
Hello I'm the author of C++ TestMate
Praise the initiative. The feature will be useful for Catch2 too. (related issue)
I think only language server isn't enough. We need the flexibility to retire tests in case of external dependencies and file changes. A request of my user: About test state propagation: But this raises another question. Scenario: B and C under A. Test B reports failure and test B still running. What should be the state of A? According to previous precedence it should be "Running" but users might would find it useful to see that some tests part of that "Suite"/node has already failed. I have some ideas about it but all of the sacrifices the simplicity and clearness of the API. Question: The integrated test explorer what extra features will be available for VSCode users? I mean we have a working system now thanks to @hbenl. What is strongest the motivation of the integration? |
Thanks for the mention @hbenl In my abapfs extension I only know I have a test after running it, a discover function would take weeks to run. Proposed API looks ok as long as I can fire onDidChangeTest any time |
/cc @connectdotz who has been doing great work on vscode-jest |
Thank you for the feedback and the tags! Point taken around the error notifications. This week is endgame for our September release, but we'll continue discussion and update this issue + the sketch next week.
@hbenl You bring up a good point that, in the current design, state is never inherited. Generally for tests we want the opposite -- that state is always inherited -- but thinking about this more it's only applicable within a tree UI. Perhaps if the state is unset, then we say UI implementations should do the right thing automatically. For example, a test explorer would show parents as failed if any of their children failed, or running if any of their children are running. But a status bar item that shows failing/passing tests would only count those who had a non-unset status. I think this approach would also make @matepek's scenario nice from an API point of view; the UI can be arbitrarily smart.
@bneumann I think this problem is mostly out of scope. We will provide hooks in VS Code to build tests, but won't add any special new discovery mechanisms.
@matepek Definitely, the language server won't be the only way (we aren't going to require test providers to also integrate their language server) but we should not block that path.
@matepek Yes, test output and diffing is one particular area we want to improve. Thank you for mentioning the streaming scenario, I'll make sure we look into that and will tag you with the outcome/proposal.
We actually are not set yet on making this integrated. Our roadmap outlines the purpose here (which I should add to the original thread for clarity):
Some benefits of being built-in, mentioned briefly in the table in the original issue, are:
At the moment we're sketching an extension host API to provider an understanding of the 'ideal world' API. This may become a built in API, it could be a protocol, or we could end up working with Holger to apply some principles the test explorer UI. |
hi, I work with @orta in vscode-jest. This is an interesting thread, have a few questions:
|
hey all, I maintain the Jest Extension for the @hbenl's Test Explorer UI. Here is a bit of a brain-dump, some of which might be useful: I agree with @connectdotz comments about the structure of Jest tests being very hierarchical and top-down. I maintain internal state in my extension as a tree and map from the internal structure to the Test UI structure when I need to emit events as the internal structure is highly correlated to the output. IMHO the test collection as a mutable list-like collection and mutable test objects would be more difficult to translate to/from than what I currently do. Now it may be that the Test UI extension can live on as an intermediary extension taking care of this translation on our behalf (amongst other tasks e.g. multiple workspaces). But I think it would be worth taking a little time to think about how the test information will be consumed by UI extensions as noted in the todo. Maybe leave this API in draft until the UI API has been fleshed out. It doesn't make sense to flatten it, pass it to VS Code to pass it to a UI extension where it is transformed back into tree form again. Other extension authors should comment on what their preferred format is. The proposed API describes how the
Test Messages:
|
Also maybe a category for testing could be useful. |
@connor4312 big congratulations on finally shipping this! :D Absolutely fantastic work over the last year :) Thank you for taking all our feedback into consideration and working to build the compatibility layer! |
@connor4312 yes absolutely fantastic work and for being so responsive to a bunch of heathens you don't even work for :) |
🎉Congratulate to the great work! @connor4312 I believe this will definitely make VS Code more powerful in the polyglot development. People can get a unified experience on testing. Awesome! |
Thanks everyone for your feedback and suggestions to make this happen. Keep filing those Github issues if you run into problems or have questions 😉 |
Fyi for extension authors, there is a new marketplace category "Testing" which you can publish extensions into for discoverability 🙂 |
Although this thread is closed, maybe someone can answer this: Now I tested following situation: If this has happened once I'm also not able anymore to run a second test case since my extension is stuck. How do I have to deal with such a situation? |
The approach we've taken with the API is to let test extensions handle configuration themselves, since in most cases test frameworks are already configured with external test files and we don't want to duplicate things or have multiple points-of-truth. If this isn't the case for yours, you're welcome to have a custom configuration file or just workspaceStorage, which you could consider using in concert with the
Without knowing the specifics of your case, usually the test runner signals the test case failed after some time, and you can then put the test case into the errored or failed state. |
Ok, something different. I would rather like to do this not automatically. I.e. I would like to have the user press a button to discover test cases. |
@maziac what you could do is basically ignore the resolvehandler and supply your own button to the menu. This would be pretty unintuitive. Why wouldn't you want to auto discover test cases? The discovery doesn't start until someone manually chooses to look at the test window (it doesn't happen just on startup every time) You could also just wire into the runhandler to do your discovery and populate the tree right before running tests, so when someone clicks "run all" the tests just start showing up. |
This should definitely be supported by the new testing API. While automatic discovery of changes to the tests would be ideal in general, there are many situations where triggering the test discovery manually is necessary (e.g. when loading the tests is very slow so it shouldn't be triggered automatically, or when automatic discovery of changes is not possible or its implementation is buggy). |
What you could do is have your "initialization" resolve handler (when the resolver is called with an undefined testitem indicating it's a fresh start) to register a default testItem (suite) in your test controller root, e.g. "MyTests". Then your runhandler on that testItem does the discovery and run at the same time and creates the test items with their results as needed. |
That's basically what my Pester one does, it uses the filewatcher to create the test suite "entries" on a per file basis very quickly, but then discovery of the tests doesn't happen until you dropdown the item or click run. You'll need to do something special for list view though because it auto-expands the test entries by default. |
Currently there's no "refresh" button but if we add one it'll likely result in a second call to the
👍 listening to the resolveHandler is the recommended approach. We call it conservatively, and trying to do some custom loading without listening to the resolveHandler will break some things, like the "re-run" commands after an reditor reload. |
If you search for "go.test.refresh" in vscode-go, you'll see my implementation of refresh. All you need is:
If you set |
Sorry, but all this seems a little hacky-ish to me. |
It sounds like adding a long debounce on changes to the main file is sensible in your case. |
That's not necessarily true, if you don't set up a OnDocumentChanged handler, this won't happen. Maybe you just want to set up a file changed watcher so it only discovers tests when the main file is saved? |
@maziac If you want to update the test items when the document changes (OnDocumentChanged), debouncing can somehow mitigate your problem, and you can even make the debouncing smarter by making it adaptive, see an example from Java Test Runner. It changes the debouncing time according to the time used to resolve the test items in the history (actually it "copys" the logic when VS Code send the request for outline) @connor4312 I'd like to bring this topic out again: do you have any plan to support cancellation token in the editing scenario? Like the |
Another scenario: multiroot work space. |
@maziac you can do it however you want, it's called per test controller with undefined testItem initially, then it's called for the specific test controller that owns the item in question for discovery. This is easy to see by just setting a breakpoint on the resolveHandler to watch it work. I personally have one test controller that on an empty resolvehandler gets the list of workspace roots and then foreach's through them to create file watchers, all tests for all workspaces tied to a single controller. There's advantages and disadvantages to the appraoch, for me the A's outweigh the D's for pester. |
Another question: But I would like to suppress the possibility to choose 'debug' for the test suites and allow it only for test items. Only for the "real" test cases I would like to allow the 'debug' option. Is there any way to do that? |
State of the World
Testing support in VS Code has been a feature request for a long time. The VS Code community has build excellent extensions around testing, for example:
Each implementation of testing presents a different set of features, UI, and idiomaticity. Because there is no sanctioned approach to tests in VS Code, extension developers tend to make bespoke implementations, as we've seen in the Python and Java language extensions. Ideally, like in debugging, a VS Code user would have just about the same experience as they work between projects and languages.
VS Code's Approach
The Test Explorer UI presents the best point of inspiration for us, as there are many existing extensions built on its API: it's capable and proven. Regardless of the direction we take in VS Code, we should have a way for its Test Adapters to be upgraded to the new world.
Wallaby is an excellent extension, but it's tailored and purpose-built to JavaScript, and includes functionality which is not readily portable to other languages. While it is a good source for inspiration, we're not aiming to encompass Wallaby's feature set in the extension points we provide, at least not yet.
We're prototyping an API in the extension host, but there are a number of approaches we can take:
+ Easier to manage state
+ Clear way to build 'official' test extensions
+ Could be theoretically shared with VS and other editors
+ Unclear whether there's significant functionality we'd want that's not already possible in exthost api
- Less friendly, additional complexity than TS APIs for extension authors
- Less clear there's an official pathway for test extensions
API Design
The following is a working draft of an API design. It should not be considered final, or anything close to final. This post will be edited as it evolves.
Changes versus the Test Adapter API
As mentioned, the test adapter API and this one provide a similar end user experience. Here are the notable changes we made:
The test adapter API does not distinguish between watching a workspace and watching a file. In some cases, there is an existing process that reads workspace tests (such as a language server in Java) or it's not much more expensive to get workspace tests than file tests (such as mocha, perhaps). However, some cases, like Go, providing tests for a single file can be done very cheaply and efficiently without needing to involve the workspace.
In this API we expect the
TestProvider
to, after activation, always provide tests for the visible text editors, and we only request tests for the entire workspace when required (i.e. when the UI needs to enumerate them).We have modeled the test state more closely after the existing
DiagnosticCollection
, where the Test Adapter API uses only events to enumerate tests and does not have a central collection.The Test Adapter API makes the distinction between suites and tests, we do not. They have almost identical capabilities, and in at least one scenario the 'suites' are more like tests and the leaf 'tests' cannot be run individually.
We use object identity rather than ID for referencing tests. This is in line with other items in the VS Code API, including Diagnostics.
Ideas and Open Questions
See the
testing
label for current work, questions, and problems.API
See the current working proposal in https://github.com/microsoft/vscode/blob/master/src/vs/vscode.proposed.d.ts (ctrl+f for 107467)
The text was updated successfully, but these errors were encountered: