Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ROVER-129: Removing Rayon & Crossbeam (Some Of It) #2081

Merged
merged 6 commits into from
Aug 30, 2024

Conversation

jonathanrainer
Copy link
Contributor

@jonathanrainer jonathanrainer commented Aug 28, 2024

Now that we've made the leap into the world of asynchronous execution we don't need to keep using Rayon thread pools when we could actually get the Tokio runtime to do it for us. This requires a bit of a fiddling to get right, but this should help us overcome the issues we've seen with the smoke tests, and in other places around thread exhaustion with larger numbers of subgraphs.

The refactor is somewhat limited in that we can't remove Crossbeam completely (yet), as it's pretty baked into how Leader/Follower works, however since a refactor is coming down the track there anyway I'm happy to leave that for now. I'd like to do some more manual testing of this against what Fidelity has and make sure we haven't broken anything else but I'm confident this will now work as we expect it to.

Successful smoke test run: https://github.com/apollographql/rover/actions/runs/10596517833/job/29365260226

Have run UAT over this myself and covered the following cases:

  • Using a supergraph.yaml with 8 subgraphs ranging from
    • 0 subgraphs with watched schema files and 8 with introspection to
    • 8 subgraphs with introspection and 0 with watched schema files
  • Using the Leader/Follower system to spin up 8 rover dev sessions with a mix of file watching and introspection
  • Using rover dev --graph-ref

All of these respond as we would expect! ✅

@jonathanrainer jonathanrainer requested a review from a team as a code owner August 28, 2024 12:18
@jonathanrainer jonathanrainer force-pushed the jr/story/ROVER-129 branch 2 times, most recently from f39bd39 to a2e7951 Compare August 28, 2024 12:41
Copy link
Contributor Author

@jonathanrainer jonathanrainer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes are pretty self-explanatory, just one extra note

@@ -102,15 +102,14 @@ interprocess = { version = "2", default-features = false }
indoc = "2"
lazycell = "1"
lazy_static = "1.4"
notify = "6"
notify = { version = "6", default-features = false, features = ["macos_kqueue"] }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is due to a known issue with notify: notify-rs/notify#380, we basically turn off crossbeam channels and switch to std ones instead

Comment on lines -123 to +132
subgraph_watchers.into_iter().for_each(|mut watcher| {
tokio::task::spawn(async move {
let _ = watcher
.watch_subgraph_for_changes(client_config.retry_period)
.await
.map_err(log_err_and_continue);
});
let futs = subgraph_watchers.into_iter().map(|mut watcher| async move {
let _ = watcher
.watch_subgraph_for_changes(client_config.retry_period)
.await
.map_err(log_err_and_continue);
});

subgraph_watcher_handle
.await
.expect("could not wait for subgraph watcher thread");
tokio::join!(join_all(futs), subgraph_watcher_handle.map(|_| ()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change stops us spawning tasks that don't actually do anything and will just end up blocking us, behaviour remains the same in testing

@jonathanrainer jonathanrainer force-pushed the jr/story/ROVER-129 branch 2 times, most recently from 360fb26 to b6adf58 Compare August 29, 2024 13:26
We're replacing with `tokio` constructs instead
Disable it for notify (key user), and remove it from rover-std
Because we're now using `tokio` channels they return None when they
are empty. So we can use a different idiom to read from them,
which ensures this continues to work.
This way our Introspection watching works as well
as watching schema files.
@jonathanrainer jonathanrainer added this to the v0.26.1 milestone Aug 30, 2024
@jonathanrainer jonathanrainer merged commit 79987dc into main Aug 30, 2024
22 checks passed
@jonathanrainer jonathanrainer deleted the jr/story/ROVER-129 branch August 30, 2024 08:33
@jonathanrainer jonathanrainer added the feature 🎉 new commands, flags, functionality, and improved error messages label Sep 4, 2024
@jonathanrainer jonathanrainer mentioned this pull request Sep 4, 2024
jonathanrainer added a commit that referenced this pull request Sep 4, 2024
# [0.26.1] - 2024-09-04

## 🚀 Features

- **Respect the use of `--output` flag in the supergraph binary -
@aaronArinder PR #2045**

In testing to attempt to reduce the runtime of `supergraph compose` we
noticed that a very large proportion of the time spent (in the case of
large supergraphs) was spent printing the result to `stdout`. With this
change we add an `--output` flag to the `supergraph` binary which means
this time can be reduced significantly, leading to much faster
compositions.

- **Add `--license` flag to `rover dev` - @loshz PR #2078**

Adds the ability to pass along an offline enterprise licence to the
router when running `rover dev`

- **Remove Rayon and reduce usage of Crossbeam - @jonathanrainer PR
#2081**
  
Now that `rover` has transitioned to using an asynchronous runtime we
don't need to use Rayon any more. This also resolves a bug whereby
`rover dev` could lock up if passed a `supergraph.yaml` file with lots
of subgraphs in.

- **Introduce new print macros - @loshz PR #2090**
  
Adds three new macros to the codebase so that we can still visually
distinguish between INFO, WARNING and ERROR log lines without the use of
emoji

- **Use new print macros in place of emoji - @loshz PR #2096**

Updates the locations that previously used emoji to utilise the new
macros defined in the previous PR

## 🐛 Fixes

- **Stop Windows Installer failing if whitespace is accidentally passed
to the `rover install` command - @jonathanrainer PR #1975**

In some situations it was possible for whitespace to be passed to the
`rover install` command which then caused the installer to fail. A guard
has now been added to strip whitespace out before it is passed to the
install command.

## 🛠 Maintenance

- **Move CI to using newly create Ubuntu images - @jonathanrainer PR
#2080**

CircleCI is removing support for older Ubuntu machine images, this
brings us up to date but does **not** change any of our `glibc` support
etc.

- **Add check for aarch-64-unknown-linux-musl to installers - @loshz PR
#2079**
- **Update node.js packages - @jonathanrainer PR #2070**

  Includes `eslint` to v9.9.1 and `node` to 20.17.0

- **Update `node` CircleCI orb to v5.3.0 - @jonathanrainer PR #2071**
- **Update `apollographql/federation-rs` to v2.9.0 - @jonathanrainer PR
#1983**
- **Update `apollographql/router` to v1.52.1 - @jonathanrainer PR
#2077**
- **Update `node` Docker Image to v20.17.0 - @jonathanrainer PR #2072**
- **Update `apollographql/router` to v1.53.0 - @jonathanrainer PR
#2084**
- **Update `npm` to v10.8.3 - @jonathanrainer PR #2091**
- **Update `slackapi/slack-github-action` to v1.27.0 - @jonathanrainer
PR #2092**
- **Update `node` CircleCI orb to v6.1.0 - @jonathanrainer PR #2093**
- **Fix some bugs in the smoke tests - @jonathanrainer PR #2094**

## 📚 Documentation

- **Add `cloud config` docs - @loshz PR #2066**
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature 🎉 new commands, flags, functionality, and improved error messages
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants