Add new simple announce node manager and DNS node manager #26119
    {
        private List<URI> coordinatorUris = List.of();

        @NotNull
Should this be @NotEmpty? Is there a purpose to configure discovery.type=announce with an empty list?
Coordinator does not need an announcement unless there are multiple coordinators.
        private List<URI> coordinatorUris = List.of();

        @NotNull
        public List<@NotNull URI> getCoordinatorUris()
Does using @NotNull here do anything? Can the config system create a list with null elements?
It means you cannot have a null element in the list. It isn't required, but it is reasonable documentation
I understand what it does, but seems confusing because it should be impossible. We don't do that anywhere else, but I'm fine if you want to leave it.
        checkArgument(port > 0 && port < 65536, "invalid port: %s", port);
        this.hosts = requireNonNull(hosts, "hosts is null")
                .stream()
                .filter(not(Strings::isNullOrEmpty))
Is it possible to get a null host from config?
I don't think so, but I'm being careful with announcement configs.
The real reason is that the docs for getAllByName say that null or empty result in the loopback address being returned and I don't want to deal with that
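For readers following this thread, here is a minimal JDK-only sketch of the filtering being discussed. It is not the code from DnsNodeInventory: the class and method names (DnsHostsSketch, usableHosts, resolveAll) are hypothetical, and silently skipping names that fail to resolve is an assumption made for illustration. It shows null and empty names being dropped before they can reach InetAddress.getAllByName, which is documented to return a loopback address for such input.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical helper, for illustration only.
final class DnsHostsSketch
{
    private DnsHostsSketch() {}

    // Drop null and empty names so they never reach InetAddress.getAllByName,
    // which would answer with a loopback address instead of failing.
    static List<String> usableHosts(List<String> hosts)
    {
        Objects.requireNonNull(hosts, "hosts is null");
        List<String> result = new ArrayList<>();
        for (String host : hosts) {
            if (host != null && !host.isEmpty()) {
                result.add(host);
            }
        }
        return List.copyOf(result);
    }

    // Same range check as the checkArgument call in the reviewed constructor.
    static boolean isValidPort(int port)
    {
        return port > 0 && port < 65536;
    }

    // Resolve each usable name to all of its addresses.
    // Assumption: a name that does not resolve is skipped rather than treated as fatal.
    static List<InetAddress> resolveAll(List<String> hosts)
    {
        List<InetAddress> addresses = new ArrayList<>();
        for (String host : usableHosts(hosts)) {
            try {
                addresses.addAll(List.of(InetAddress.getAllByName(host)));
            }
            catch (UnknownHostException e) {
                // Skipped in this sketch; real code would likely log this.
            }
        }
        return addresses;
    }
}
```

With this shape, a list containing only null or empty entries resolves to no addresses at all, so the coordinator never mistakes itself (via loopback) for a worker.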
Nodes only announce their internal URI every 5 seconds, as opposed to a full discovery announcement. The announcement server simply keeps a list of known workers that times out after 30 seconds. The existing internal node manager code handles the rest, reaching out to the node to get state and any other critical information. The announcer client uses the same discovery.uri property name as before, and the new code supports multiple names if there are multiple coordinators to announce to. The system used is determined by the discovery.type property which currently can be announce or airlift_discovery.
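The server side described above (a list of known workers that times out after 30 seconds) could be sketched roughly as follows. This is an illustrative, JDK-only sketch and not the PR's implementation: AnnouncementRegistrySketch and its methods are hypothetical names, and the injectable nanosecond clock is an assumption added so the expiry can be exercised without sleeping.

```java
import java.net.URI;
import java.time.Duration;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical registry, for illustration only.
final class AnnouncementRegistrySketch
{
    // Internal URI of each node mapped to the time of its latest announcement.
    private final Map<URI, Long> lastSeenNanos = new ConcurrentHashMap<>();
    private final long timeoutNanos;
    private final LongSupplier nanoClock;

    AnnouncementRegistrySketch(Duration timeout, LongSupplier nanoClock)
    {
        this.timeoutNanos = timeout.toNanos();
        this.nanoClock = nanoClock;
    }

    // Called whenever a node announces its internal URI (every 5 seconds per the description).
    void announce(URI internalUri)
    {
        lastSeenNanos.put(internalUri, nanoClock.getAsLong());
    }

    // Prune nodes whose last announcement is older than the timeout, then return the rest.
    Set<URI> activeNodes()
    {
        long now = nanoClock.getAsLong();
        lastSeenNanos.entrySet().removeIf(entry -> now - entry.getValue() > timeoutNanos);
        return Set.copyOf(lastSeenNanos.keySet());
    }
}
```

In production the clock would typically be System::nanoTime, for example new AnnouncementRegistrySketch(Duration.ofSeconds(30), System::nanoTime). Everything beyond the URI (node state, version, and so on) is left to the existing node manager, as the comment above explains.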
        Futures.addCallback(
                responseFuture, new FutureCallback<>()
                {
                    @Override
                    public void onSuccess(StatusResponse response)
                    {
                        int statusCode = response.getStatusCode();
                        if (statusCode < 200 || statusCode >= 300) {
                            logWarning("Failed to announce node state to %s: %s", announceUri, statusCode);
                        }
                    }

                    @Override
                    public void onFailure(Throwable t)
                    {
                        logWarning("Error announcing node state to %s: %s", announceUri, t.getMessage());
                    }
                }, directExecutor());
FYI, this could be
    addSuccessCallback(responseFuture, () -> {
        ...
    });
    addExceptionCallback(responseFuture, t ->
            logWarning("Error announcing node state to %s: %s", announceUri, t.getMessage()));
I put the followup suggestions in #26125
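As a rough, JDK-only illustration of the announce callback discussed in this thread, the sketch below fans one announcement out to several coordinator URIs and logs a warning for non-2xx responses and for failures. It is not the PR's code: AnnouncerSketch, announceAll, the sender function, and the warn consumer are all hypothetical, and CompletableFuture stands in for the ListenableFuture used in the actual change.

```java
import java.net.URI;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical announcer, for illustration only.
final class AnnouncerSketch
{
    private AnnouncerSketch() {}

    // Mirrors the status check in the reviewed callback: only 2xx counts as success.
    static boolean isSuccess(int statusCode)
    {
        return statusCode >= 200 && statusCode < 300;
    }

    // Sends one announcement per coordinator URI. The sender is an assumed abstraction
    // that performs the HTTP request and yields the response status code.
    static CompletableFuture<Void> announceAll(
            List<URI> coordinatorUris,
            Function<URI, CompletableFuture<Integer>> sender,
            Consumer<String> warn)
    {
        CompletableFuture<?>[] pending = coordinatorUris.stream()
                .map(uri -> sender.apply(uri).handle((status, error) -> {
                    if (error != null) {
                        warn.accept(String.format("Error announcing node state to %s: %s", uri, error.getMessage()));
                    }
                    else if (!isSuccess(status)) {
                        warn.accept(String.format("Failed to announce node state to %s: %s", uri, status));
                    }
                    // Failures are only logged, so the combined future still completes normally.
                    return null;
                }))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(pending);
    }
}
```

A caller would invoke announceAll on a fixed schedule (every 5 seconds per the PR description), for example from a ScheduledExecutorService. Because failures are logged rather than propagated, one unreachable coordinator does not stop announcements to the others.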
Description
This PR adds two new node inventory systems: Announce and DNS. Announce is the new default and replaces Airlift Discovery. The system used is determined by the discovery.type property, which currently can be announce, dns, or airlift-discovery.

Announce inventory is a trivial announcement service for nodes. Nodes only announce their internal URI every 5 seconds, as opposed to a full discovery announcement. The announcement server simply keeps a list of known workers that times out after 30 seconds. The existing internal node manager code handles the rest, reaching out to the node to get state and any other critical information. The announcer client uses the same discovery.uri property name as before, so no configuration changes are needed. The new announcer supports multiple names so announcements can be sent to multiple coordinators.

Secondly, the PR adds a DNS-based node manager. Instead of announcement, this relies on K8s-style DNS infrastructure. In this setup it is expected that there are one or more hostnames that will provide all IP addresses for the workers in the cluster. It is assumed that all nodes are bound to the same port, and mixed HTTP / HTTPS is not allowed (all nodes must be configured the same way). The hostnames are configured with the cluster.hosts property. DNS does not have an announcement system, so no configuration is needed on workers.

Release notes
(x) Release notes are required, with the following suggested text: