Node readiness overhaul #223

Open
1 task done
cchudant opened this issue Aug 9, 2024 · 0 comments
Labels
enhancement New feature or request

cchudant commented Aug 9, 2024

Is there an existing issue?

  • I have searched the existing issues

Motivation

The /health endpoint on the node is a health check endpoint designed to work with a Kubernetes startup probe, so that a node that is still starting up (i.e. not ready to answer RPC requests) does not appear as available in k8s.

Returning anything other than 200 on this endpoint marks the node as not ready.

This is not a priority, but this feature would be important for node operators running multiple Madara nodes in a cluster: a Madara node that is restarting or starting for the first time should not get registered in the load balancer or receive RPC calls while it is starting.

Request

The current endpoint always comes up at the same time as the RPCs, so the node is reported as ready as soon as the RPCs go up.

Solution

Readiness should be defined as:

  • in sync mode, show ready when the chain is fully synced by default (configurable via an arg)
  • in block production mode, we should wait for the first gas prices (see comment on feat(client): l1 gas price #219) to arrive before marking the node as ready
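
The two rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: `Mode`, `NodeState`, `wait_for_full_sync`, `is_ready` and `health_status` are hypothetical names, not existing Madara types, and 503 is an assumed choice for the "not ready" status (the issue only requires something other than 200).

```rust
/// Hypothetical node modes; these names are illustrative, not Madara's actual types.
#[derive(Clone, Copy)]
pub enum Mode {
    /// `wait_for_full_sync` stands in for the proposed arg; `true` would be the default.
    Sync { wait_for_full_sync: bool },
    BlockProduction,
}

/// Hypothetical snapshot of the conditions the issue mentions.
#[derive(Clone, Copy, Default)]
pub struct NodeState {
    pub fully_synced: bool,
    pub has_l1_gas_prices: bool,
}

/// Applies the readiness rule for the given mode.
pub fn is_ready(mode: Mode, state: NodeState) -> bool {
    match mode {
        // Ready once fully synced, unless the operator opted out of waiting.
        Mode::Sync { wait_for_full_sync } => !wait_for_full_sync || state.fully_synced,
        // Ready once the first gas prices have arrived.
        Mode::BlockProduction => state.has_l1_gas_prices,
    }
}

/// Maps readiness to the /health status code: 200 when ready, 503 (assumed) otherwise.
pub fn health_status(mode: Mode, state: NodeState) -> u16 {
    if is_ready(mode, state) { 200 } else { 503 }
}
```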

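In any case, I'd like to have a generic struct like:

pub type ServiceName = String;
struct ServiceReadiness {
  service: ServiceName,
  on_ready: oneshot::Receiver<()>,
}
pub struct ReadinessProbe(Arc<Mutex<Vec<ServiceReadiness>>>);
impl ReadinessProbe {
  // Called from each service's start function; the service fires the returned sender once it is ready.
  fn register_service_readiness(&self, service: ServiceName) -> oneshot::Sender<()> { todo!() }
  // True once every registered service has signalled readiness.
  fn is_all_ready(&self) -> bool { todo!() }
}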

This would be shared between the services, and the start function of each service would register its readiness condition.
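
One way the probe could be filled in is sketched below. To keep the example dependency-free it swaps the `oneshot` channel for `std::sync::mpsc`, which is a substitution and not what the proposal specifies. The `ready` latch and the `pending_services` helper are also additions for illustration; the latch is there because a one-shot signal can only be received once, while `/health` is polled repeatedly.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};

pub type ServiceName = String;

struct ServiceReadiness {
    service: ServiceName,
    on_ready: Receiver<()>,
    // Latched to true once the signal has been received (assumption, not in the proposal).
    ready: bool,
}

/// Cheap to clone; every clone shares the same list of registered services.
#[derive(Clone, Default)]
pub struct ReadinessProbe(Arc<Mutex<Vec<ServiceReadiness>>>);

impl ReadinessProbe {
    /// Registers a service and returns the sender it should fire once ready.
    pub fn register_service_readiness(&self, service: ServiceName) -> Sender<()> {
        let (tx, on_ready) = mpsc::channel();
        let entry = ServiceReadiness { service, on_ready, ready: false };
        self.0.lock().unwrap().push(entry);
        tx
    }

    /// Polls every pending signal without blocking; true when all services are ready.
    /// With no registered services this returns true.
    pub fn is_all_ready(&self) -> bool {
        let mut services = self.0.lock().unwrap();
        let mut all_ready = true;
        for s in services.iter_mut() {
            if !s.ready && s.on_ready.try_recv().is_ok() {
                s.ready = true;
            }
            all_ready &= s.ready;
        }
        all_ready
    }

    /// Hypothetical helper: names of services not yet seen as ready by `is_all_ready`.
    pub fn pending_services(&self) -> Vec<ServiceName> {
        let services = self.0.lock().unwrap();
        services.iter().filter(|s| !s.ready).map(|s| s.service.clone()).collect()
    }
}
```

In this sketch, a service that drops its sender without ever sending stays "not ready" forever, which may or may not be the desired behavior for a crashed service.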

Are you willing to help with this request?

Yes!

Projects
Status: Backlog
Development

No branches or pull requests

1 participant