-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Factor out health status checks to it's own service file #823
Factor out health status checks to it's own service file #823
Conversation
I hope you to review this. @allada , @MarcusSorealheis . |
b0f030e
to
c4b136b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 6 files at r1, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and 1 discussions need to be resolved
nativelink-service/src/health_server.rs
line 47 at r1 (raw file):
} pub async fn check_health_status(&self, json_content_type: &'static str) -> Response<String> {
I don't think there is value yet in passing json_content_type
since currently we are serializing only with serde_json5
. We could introduce a renderer/formatter/writer pattern here at some point, to keep things incremental I'd suggest not doing that yet and don't pass a parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and 3 discussions need to be resolved
nativelink-service/src/health_server.rs
line 1 at r1 (raw file):
// Copyright 2023 The NativeLink Authors. All rights reserved.
nit: 2024.
src/bin/nativelink.rs
line 412 at r1 (raw file):
spawn_blocking(move || { futures::executor::block_on( health_server_cloned.check_health_status(JSON_CONTENT_TYPE),
We want to make this similar to the other services and load them in in the same manner (config).
see the area in this file around .add_optional_service(
We want this health checker to implement that service api and it should be enabled based on the config.
c4b136b
to
ef125c4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 2 discussions need to be resolved
nativelink-service/src/health_server.rs
line 47 at r1 (raw file):
Previously, adam-singer (Adam Singer) wrote…
I don't think there is value yet in passing
json_content_type
since currently we are serializing only withserde_json5
. We could introduce a renderer/formatter/writer pattern here at some point, to keep things incremental I'd suggest not doing that yet and don't pass a parameter.
Done.
src/bin/nativelink.rs
line 412 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
We want to make this similar to the other services and load them in in the same manner (config).
see the area in this file around
.add_optional_service(
We want this health checker to implement that service api and it should be enabled based on the config.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: integration-tests (20.04), integration-tests (22.04), and 1 discussions need to be resolved
ef125c4
to
450bf36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 6 files at r1, 2 of 4 files at r2, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 2 discussions need to be resolved
-- commits
line 2 at r2:
Please update this with:
[Breaking] Factor out health status checks to its own service
Health status checks are now a proper service and no longer enabled by
default.
Add `"health": {},` under your services config to enable.
nativelink-service/src/health_server.rs
line 30 at r2 (raw file):
#[derive(Clone)] pub struct HealthServer { health_registry_builder: Arc<AsyncMutex<HealthRegistryBuilder>>,
nit: This should not be part of this class. You probably want to bring in HealthRegistry
during construction. The builder should be kept outside of this class.
src/bin/nativelink.rs
line 412 at r1 (raw file):
Previously, blizzardc0der (Blizzardc0der) wrote…
Done.
You want to implement tower_service::Service
for HealthServer
then use .add_service
or .add_optional_service
. This will offload nearly all the complexity to that module.
d93f6c3
to
4a8f519
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r3.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: integration-tests (20.04), integration-tests (22.04), and 2 discussions need to be resolved
4a8f519
to
52dffb2
Compare
0dc7efe
to
caf1ed7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 1 discussions need to be resolved
Previously, allada (Nathan (Blaise) Bruer) wrote…
Please update this with:
[Breaking] Factor out health status checks to its own service Health status checks are now a proper service and no longer enabled by default. Add `"health": {},` under your services config to enable.
Done.
src/bin/nativelink.rs
line 412 at r1 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
You want to implement
tower_service::Service
forHealthServer
then use.add_service
or.add_optional_service
. This will offload nearly all the complexity to that module.
It's currently derived from tower_service::Service
, although it can't be added by .add_service
or .add_optional_service
because they are functions of tonic::Service
which is GRPC server not Http Server.
So instead, I added that service by route_service
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r5, all commit messages.
Reviewable status: 2 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-22.04, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 8 discussions need to be resolved
nativelink-service/src/health_server.rs
line 14 at r5 (raw file):
// See the License for the specific language governing permissions and // limitations under the License. use futures::StreamExt;
nit: for consistency new line above.
nativelink-service/src/health_server.rs
line 30 at r5 (raw file):
#[derive(Clone)] pub struct HealthServer { pub health_registry_status: HealthRegistry,
nit: This should not be public. If it's needed make an access function. Making it public allows anyone to update it if they have a it mutable.
nativelink-service/src/health_server.rs
line 34 at r5 (raw file):
impl HealthServer { pub async fn new(health_registry_status: HealthRegistry) -> Result<Self, Error> {
nit: just call all these health_registry
. It's confusing to call it _status
.
nativelink-service/src/health_server.rs
line 52 at r5 (raw file):
fn call(&mut self, _req: Request<Body>) -> Self::Future { let health_registry_status = self.health_registry_status.clone(); let fut = async move {
nit:
Pin::box(async move {
//...
})
nativelink-service/src/health_server.rs
line 53 at r5 (raw file):
let health_registry_status = self.health_registry_status.clone(); let fut = async move { let health_status_descriptions: Vec<HealthStatusDescription> = health_registry_status
nit: Move this above (outside this future, then we won't need to clone it (above).
nativelink-service/src/health_server.rs
line 83 at r5 (raw file):
.status(StatusCode::INTERNAL_SERVER_ERROR) .header( hyper::header::CONTENT_TYPE,
nit: Please import these and use them instead of listing out the namespaces.
nativelink-util/src/health_utils.rs
line 167 at r5 (raw file):
fn health_status_report( &self, ) -> Pin<Box<dyn Stream<Item = HealthStatusDescription> + '_ + Send>>;
nit: '_
should always be at end.
nativelink-util/src/health_utils.rs
line 175 at r5 (raw file):
fn health_status_report( &self, ) -> Pin<Box<dyn Stream<Item = HealthStatusDescription> + '_ + Send>> {
nit: ditto.
caf1ed7
to
0638fa2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 1 discussions need to be resolved
nativelink-service/src/health_server.rs
line 53 at r5 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: Move this above (outside this future, then we won't need to clone it (above).
Just fyi, I've tried your solution but it didn't work.
tower::Service
trait's call
function is defined as synchronous function.
Although, the health_status_description
variable needs to determined in async
function.
So I added this line above using block_on
function to call async
function inside call
function.
let health_status_descriptions: Vec<HealthStatusDescription> = futures::executor::block_on(async move { self.health_registry.health_status_report().collect().await });
Although, when I tested it with Postman about /status
endpoint, it didn't work and hang.
So I didn't apply this changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r6, all commit messages.
Reviewable status: 2 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-22.04, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 3 discussions need to be resolved
nativelink-service/Cargo.toml
line 14 at r6 (raw file):
nativelink-scheduler = { path = "../nativelink-scheduler" } async-lock = "3.3.0"
nit: unused.
nativelink-service/src/health_server.rs
line 53 at r5 (raw file):
Previously, blizzardc0der (Blizzardc0der) wrote…
Just fyi, I've tried your solution but it didn't work.
tower::Service
trait'scall
function is defined as synchronous function.
Although, thehealth_status_description
variable needs to determined inasync
function.
So I added this line above usingblock_on
function to callasync
function insidecall
function.
let health_status_descriptions: Vec<HealthStatusDescription> = futures::executor::block_on(async move { self.health_registry.health_status_report().collect().await });
Although, when I tested it with Postman about
/status
endpoint, it didn't work and hang.
So I didn't apply this changes.
ok, it's fine this way, but the reason you are probably fighting it is because of the future lifetime. You could bind it to the same lifetime as self
and it would probably be happy.
nativelink-service/src/health_server.rs
line 21 at r6 (raw file):
HealthRegistry, HealthStatus, HealthStatusDescription, HealthStatusReporter, }; use std::future::Future;
nit: std
's go first followed by a new line (see other files for example).
nativelink-service/src/health_server.rs
line 43 at r6 (raw file):
type Response = Response<hyper::Body>; type Error = std::convert::Infallible; type Future = Pin<Box<dyn Future<Output = Result<Self::Response, Self::Error>> + Send>>;
nit: It looks like there's some new syntax in rust you can use here to let the compiler know about the exact future:
impl Service<Request<hyper::Body>> for HealthServer {
type Response = Response<hyper::Body>;
type Error = std::convert::Infallible;
type Future = impl Future<Output = Result<Self::Response, Self::Error>> + Send>;
fn call(&mut self, _req: Request<Body>) -> impl Self::Future {
async move {
//...
}
}
}
see: https://rust-lang.github.io/async-fundamentals-initiative/evaluation/challenges/dyn_traits.html
0638fa2
to
eeecbf4
Compare
eeecbf4
to
486da3e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made CI passed.
Reviewable status: 2 of 1 LGTMs obtained, and pending CI: Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Remote / large-ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, ubuntu-20.04 / stable, ubuntu-22.04 / stable, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04, and 1 discussions need to be resolved
nativelink-service/src/health_server.rs
line 53 at r5 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
ok, it's fine this way, but the reason you are probably fighting it is because of the future lifetime. You could bind it to the same lifetime as
self
and it would probably be happy.
I've retried with the solution but couldn't make it work because of tower::Serivce
's declared Traits were strict and wasn't flexible.
To prevent from being obsessed with this, I'd like to leave it as for now and work on more urgent issues.
nativelink-service/src/health_server.rs
line 43 at r6 (raw file):
Previously, allada (Nathan (Blaise) Bruer) wrote…
nit: It looks like there's some new syntax in rust you can use here to let the compiler know about the exact future:
impl Service<Request<hyper::Body>> for HealthServer { type Response = Response<hyper::Body>; type Error = std::convert::Infallible; type Future = impl Future<Output = Result<Self::Response, Self::Error>> + Send>; fn call(&mut self, _req: Request<Body>) -> impl Self::Future { async move { //... } } }see: https://rust-lang.github.io/async-fundamentals-initiative/evaluation/challenges/dyn_traits.html
I've tried that solution but it didn't work and failed with compile errors.
Seems like tower::Service
trait's call
trait doesn't support that option to declare in that syntax.
Health status checks are now a proper service and no longer enabled by default. Add `"health": {},` under your services config to enable.
4e87cc8
to
2c901b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r6, 4 of 4 files at r7.
Reviewable status: 2 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Local / ubuntu-22.04, Publish image, Vercel, asan / ubuntu-22.04, integration-tests (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, vale, and 1 discussions need to be resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r6.
Reviewable status: 2 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Local / ubuntu-22.04, Publish image, Vercel, asan / ubuntu-22.04, integration-tests (20.04), pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, vale, and 1 discussions need to be resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r8, all commit messages.
Reviewable status: complete! 2 of 1 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r7.
Reviewable status: complete! 2 of 1 LGTMs obtained
…achina#823) Health status checks are now a proper service and no longer enabled by default. Add `"health": {},` under your services config to enable.
Description
Health status checks should be in it's own service called Health Service.
I made a new service for checking health status.
Fixes #643
Type of change
Checklist
bazel test //...
passes locallygit amend
see some docsThis change is