Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding is_optional case to RLS #29259

Merged
merged 10 commits into from
Apr 1, 2022
Merged

Adding is_optional case to RLS #29259

merged 10 commits into from
Apr 1, 2022

Conversation

donnadionne
Copy link
Contributor

@donnadionne donnadionne commented Mar 30, 2022

This change is Reviewable

Copy link
Contributor Author

@donnadionne donnadionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, 3 unresolved discussions (waiting on @markdroth)

a discussion (no related file):
I just have a small issue where "bool is_optional = 2;" is added to route.proto in envoyproxy/envoy, and we are using envoyproxy/data-plane-api, I need to figure out how to get the added proto field.

For now, I just hard coded is_optional to true to finish the code and test it out. And I am ready for you to take a look at the code changes while I figure out how to get this new proto field.



src/core/ext/xds/xds_cluster_specifier_plugin.h, line 69 at r1 (raw file):

  static void PopulateSymtab(upb_DefPool* symtab);

  static absl::StatusOr<std::string> GenerateLoadBalancingPolicyConfig(

This method is now broken up so that we can do a GetType for a check, and then Generate and parse.


src/core/ext/xds/xds_cluster_specifier_plugin.cc, line 115 at r1 (raw file):

  auto it = g_plugin_registry->find(config_proto_type_name);
  if (it == g_plugin_registry->end()) return nullptr;
  return it->second.get();

returning the raw pointer like this is not ideal but it is safe as we know the unique ptr will not be freed yet.

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start!

Please let me know if you have any questions. Thanks!

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @donnadionne)

a discussion (no related file):

Previously, donnadionne wrote…

I just have a small issue where "bool is_optional = 2;" is added to route.proto in envoyproxy/envoy, and we are using envoyproxy/data-plane-api, I need to figure out how to get the added proto field.

For now, I just hard coded is_optional to true to finish the code and test it out. And I am ready for you to take a look at the code changes while I figure out how to get this new proto field.

You can get the new field by upgrading the envoyproxy/data-plane-api dependency to the latest commit from that repo.



src/core/ext/xds/xds_cluster_specifier_plugin.h, line 69 at r1 (raw file):

Previously, donnadionne wrote…

This method is now broken up so that we can do a GetType for a check, and then Generate and parse.

This is fine. It's a little sub-optimal in that if we ever have a second plugin, we should ideally not make each individual plugin repeat the code to call the gRPC LB policy registry to do the parsing of the resulting config. But we can refactor that if/when the time comes to add a second plugin.


src/core/ext/xds/xds_cluster_specifier_plugin.h, line 70 at r1 (raw file):

      absl::string_view config_proto_type_name);

  static void PopulateSymtab(upb_DefPool* symtab);

Please put this method before GetPluginForType(), since it will be called first.


src/core/ext/xds/xds_cluster_specifier_plugin.cc, line 86 at r1 (raw file):

  Json::Array policies;
  policies.emplace_back(std::move(policy));
  Json lb_policy_config = Json(policies);

Json lb_policy_config(std::move(policies));


src/core/ext/xds/xds_cluster_specifier_plugin.cc, line 87 at r1 (raw file):

  policies.emplace_back(std::move(policy));
  Json lb_policy_config = Json(policies);
  grpc_error_handle parse_error = GRPC_ERROR_NONE;

Please add a TODO here:

// TODO(roth): If/when we ever add a second plugin, refactor this code
// somehow such that we automatically validate the resulting config against
// the gRPC LB policy registry instead of requiring each plugin to do that itself.


src/core/ext/xds/xds_cluster_specifier_plugin.cc, line 115 at r1 (raw file):

Previously, donnadionne wrote…

returning the raw pointer like this is not ideal but it is safe as we know the unique ptr will not be freed yet.

This is actually fine, since the registry lifetime should exceed that of the usage here.


src/core/ext/xds/xds_route_config.h, line 182 at r1 (raw file):

           std::string /*LB policy config*/>
      cluster_specifier_plugin_map;
  std::set<std::string> ignored_cluster_specifier_plugin_set;

Instead of adding a separate set here, I suggest just storing these entries in cluster_specifier_plugin_map with an empty string as the value. (The empty string is not a valid LB policy config, so that will not conflict with any value returned by any plugin.)

Please add a comment above the field documenting this usage.


src/core/ext/xds/xds_route_config.cc, line 311 at r1 (raw file):

    parts.push_back(it);
  }
  parts.push_back("}");

Need another copy of this line above line 307 to end the previous block.

That having been said, with the change I suggested in the .h file, this should instead just log "" if the LB policy config string is empty.


src/core/ext/xds/xds_route_config.cc, line 330 at r1 (raw file):

        envoy_config_route_v3_ClusterSpecifierPlugin_extension(
            cluster_specifier_plugin[i]);
    std::string name = UpbStringToStdString(

I just noticed that we have a pre-existing bug from the previous PR: right after this, we should check that the name does not already exist in rds_update->cluster_specifier_plugin_map, and if it does, return an error.

Also, please add a test for this case.


src/core/ext/xds/xds_route_config.cc, line 348 at r1 (raw file):

        XdsClusterSpecifierPluginRegistry::GetPluginForType(plugin_type);
    if (cluster_specifier_plugin_impl == nullptr) {
      if (is_optional) {

If is_optional is false, we need to return an error here, or else we will fall through to line 356, where we will dereference the null pointer in cluster_specifier_plugin_impl.

Suggest writing this as:

std::string lb_policy_config;
if (cluster_specifier_plugin_impl == nullptr) {
  if (!is_optional) {
    return GRPC_ERROR_CREATE_FROM_CPP_STRING(
        absl::StrCat("unknown ClusterSpecifierPlugin type ", plugin_type));
  }
  // Optional plugin, leave lb_policy_config empty.
} else {
  auto config =
      cluster_specifier_plugin_impl->GenerateLoadBalancingPolicyConfig(
          google_protobuf_Any_value(any), context.arena, context.symtab);
  if (!config.ok()) {
    return absl_status_to_grpc_error(lb_policy_config.status());
  }
  lb_policy_config = std::move(*config);
}
rds_update->cluster_specifier_plugin_map[std::move(name)] =
    std::move(lb_policy_config);

src/core/ext/xds/xds_route_config.cc, line 792 at r1 (raw file):

          "RouteAction cluster contains empty cluster specifier plugin name.");
    }
    if (cluster_specifier_plugin_map.find(plugin_name) !=

With the change I suggested in the .h file, this can become:

auto it = cluster_specifier_plugin_map.find(plugin_name);
if (it == cluster_specifier_plugin_map.end()) {
  return GRPC_ERROR_CREATE_FROM_CPP_STRING(
      absl::StrCat("RouteAction cluster specifier plugin ", plugin_name, " unknown"));
}
if (it->second.empty()) *ignore_route = true;
route->action.emplace<XdsRouteConfigResource::Route::RouteAction::
                          kClusterSpecifierPluginIndex>(
    std::move(plugin_name));

test/cpp/end2end/xds/xds_end2end_test.cc, line 7866 at r1 (raw file):

  ASSERT_TRUE(response_state.has_value()) << "timed out waiting for NACK";
  EXPECT_THAT(response_state->error_message,
              ::testing::HasSubstr("No valid routes specified."));

Instead of failing this, let's add a default route (one that does not use a ClusterSpecifierPlugin at all) after the route that points to the ClusterSpecifierPlugin. Then we can test that the traffic goes to that route instead of failing.

@donnadionne donnadionne added the release notes: no Indicates if PR should not be in release notes label Apr 1, 2022
Copy link
Contributor Author

@donnadionne donnadionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed according to code review. And the envoyproxy/data-plane-api is now updated too. All tests pass!

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @markdroth)

a discussion (no related file):

Previously, markdroth (Mark D. Roth) wrote…

You can get the new field by upgrading the envoyproxy/data-plane-api dependency to the latest commit from that repo.

Done.



src/core/ext/xds/xds_cluster_specifier_plugin.h, line 70 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Please put this method before GetPluginForType(), since it will be called first.

Done.


src/core/ext/xds/xds_cluster_specifier_plugin.cc, line 86 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Json lb_policy_config(std::move(policies));

Done.


src/core/ext/xds/xds_cluster_specifier_plugin.cc, line 87 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Please add a TODO here:

// TODO(roth): If/when we ever add a second plugin, refactor this code
// somehow such that we automatically validate the resulting config against
// the gRPC LB policy registry instead of requiring each plugin to do that itself.

Done.


src/core/ext/xds/xds_route_config.h, line 182 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Instead of adding a separate set here, I suggest just storing these entries in cluster_specifier_plugin_map with an empty string as the value. (The empty string is not a valid LB policy config, so that will not conflict with any value returned by any plugin.)

Please add a comment above the field documenting this usage.

Done.


src/core/ext/xds/xds_route_config.cc, line 311 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Need another copy of this line above line 307 to end the previous block.

That having been said, with the change I suggested in the .h file, this should instead just log "" if the LB policy config string is empty.

Done.


src/core/ext/xds/xds_route_config.cc, line 330 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

I just noticed that we have a pre-existing bug from the previous PR: right after this, we should check that the name does not already exist in rds_update->cluster_specifier_plugin_map, and if it does, return an error.

Also, please add a test for this case.

Done.


src/core/ext/xds/xds_route_config.cc, line 348 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

If is_optional is false, we need to return an error here, or else we will fall through to line 356, where we will dereference the null pointer in cluster_specifier_plugin_impl.

Suggest writing this as:

std::string lb_policy_config;
if (cluster_specifier_plugin_impl == nullptr) {
  if (!is_optional) {
    return GRPC_ERROR_CREATE_FROM_CPP_STRING(
        absl::StrCat("unknown ClusterSpecifierPlugin type ", plugin_type));
  }
  // Optional plugin, leave lb_policy_config empty.
} else {
  auto config =
      cluster_specifier_plugin_impl->GenerateLoadBalancingPolicyConfig(
          google_protobuf_Any_value(any), context.arena, context.symtab);
  if (!config.ok()) {
    return absl_status_to_grpc_error(lb_policy_config.status());
  }
  lb_policy_config = std::move(*config);
}
rds_update->cluster_specifier_plugin_map[std::move(name)] =
    std::move(lb_policy_config);

one of my test discovered this bug too when I had is_optional on/off


src/core/ext/xds/xds_route_config.cc, line 792 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

With the change I suggested in the .h file, this can become:

auto it = cluster_specifier_plugin_map.find(plugin_name);
if (it == cluster_specifier_plugin_map.end()) {
  return GRPC_ERROR_CREATE_FROM_CPP_STRING(
      absl::StrCat("RouteAction cluster specifier plugin ", plugin_name, " unknown"));
}
if (it->second.empty()) *ignore_route = true;
route->action.emplace<XdsRouteConfigResource::Route::RouteAction::
                          kClusterSpecifierPluginIndex>(
    std::move(plugin_name));

Done.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7866 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Instead of failing this, let's add a default route (one that does not use a ClusterSpecifierPlugin at all) after the route that points to the ClusterSpecifierPlugin. Then we can test that the traffic goes to that route instead of failing.

Done.

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Remaining comments are all minor.

Please let me know if you have any questions. Thanks!

Reviewed 5 of 5 files at r2, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @donnadionne)


src/core/ext/xds/xds_route_config.h, line 179 at r2 (raw file):

  std::vector<VirtualHost> virtual_hosts;
  // The plugin is unsupported but optional, so it will be ignored.  We

Actually, I just realized that we don't need this comment at all, because we need the optional entries in this map internally while we're validating the resource, but these entries will be stripped off before the resource is returned by the XdsClient to the watchers.


src/core/ext/xds/xds_route_config.cc, line 306 at r2 (raw file):

  for (const auto& it : cluster_specifier_plugin_map) {
    parts.push_back(absl::StrFormat("%s={%s}\n", it.first,
                                    it.second.empty() ? "" : it.second));

No need to special-case the empty string. Just log the value unconditionally, even if it's empty.


src/core/ext/xds/xds_route_config.cc, line 331 at r2 (raw file):

    if (rds_update->cluster_specifier_plugin_map.find(name) !=
        rds_update->cluster_specifier_plugin_map.end()) {
      return GRPC_ERROR_CREATE_FROM_STATIC_STRING(

Please include the name of the plugin in this error message:

return GRPC_ERROR_FROM_CPP_STRING(
    absl::StrCat("duplicate definition of cluster_specifier_plugin ", name));

test/cpp/end2end/xds/xds_end2end_test.cc, line 7785 at r2 (raw file):

  const char* kNewClusterName = "new_cluster";
  const char* kNewEdsServiceName = "new_eds_service_name";
  // Populate new EDS resources.

No need to configure any CDS or EDS resources here, since they'll never be used.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7881 at r2 (raw file):

TEST_P(RlsTest,
       XdsRoutingClusterSpecifierPluginNacksUnknownSpecifierProtoOptional) {

s/Nacks/Ignores/


test/cpp/end2end/xds/xds_end2end_test.cc, line 7921 at r2 (raw file):

  // Ensure we ignore the cluster specifier plugin and send traffic according to
  // the default route.
  auto rpc_options = RpcOptions().set_metadata({{kRlsTestKey1, kRlsTestValue}});

There's no need to add the metadata here, since we're not giving the channel a config that would use it.

Copy link
Contributor Author

@donnadionne donnadionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed all comments. PTAL, thank you !

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @markdroth)


src/core/ext/xds/xds_route_config.h, line 179 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Actually, I just realized that we don't need this comment at all, because we need the optional entries in this map internally while we're validating the resource, but these entries will be stripped off before the resource is returned by the XdsClient to the watchers.

Done.


src/core/ext/xds/xds_route_config.cc, line 306 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

No need to special-case the empty string. Just log the value unconditionally, even if it's empty.

Done.


src/core/ext/xds/xds_route_config.cc, line 331 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Please include the name of the plugin in this error message:

return GRPC_ERROR_FROM_CPP_STRING(
    absl::StrCat("duplicate definition of cluster_specifier_plugin ", name));

Done.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7785 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

No need to configure any CDS or EDS resources here, since they'll never be used.

Done.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7881 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

s/Nacks/Ignores/

Done.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7921 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

There's no need to add the metadata here, since we're not giving the channel a config that would use it.

Done.

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of small things left.

Please let me know if you have any questions. Thanks!

Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @donnadionne)


test/cpp/end2end/xds/xds_end2end_test.cc, line 7785 at r2 (raw file):

Previously, donnadionne wrote…

Done.

Looks like you removed it in the previous test, but not this one. That was a good change, but please remove it here too.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7881 at r2 (raw file):

Previously, donnadionne wrote…

Done.

You changed the wrong test. The previous test does NACK, but this one doesn't.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7868 at r3 (raw file):

       XdsRoutingClusterSpecifierPluginNacksUnknownSpecifierProtoOptional) {
  gpr_setenv("GRPC_EXPERIMENTAL_XDS_RLS_LB", "true");
  const char* kNewClusterName = "new_cluster";

No need to configure CDS or EDS resources in this test, either, since they won't be used.

Copy link
Contributor Author

@donnadionne donnadionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you so much!! PTAL thank you!

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @markdroth)


test/cpp/end2end/xds/xds_end2end_test.cc, line 7785 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Looks like you removed it in the previous test, but not this one. That was a good change, but please remove it here too.

Done.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7881 at r2 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

You changed the wrong test. The previous test does NACK, but this one doesn't.

Done.


test/cpp/end2end/xds/xds_end2end_test.cc, line 7868 at r3 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

No need to configure CDS or EDS resources in this test, either, since they won't be used.

need 1 eds setup, removed rest

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @donnadionne)

@donnadionne donnadionne enabled auto-merge (squash) April 1, 2022 19:15
@donnadionne donnadionne merged commit 2fd632a into grpc:master Apr 1, 2022
markdroth added a commit to markdroth/grpc that referenced this pull request Apr 2, 2022
ctiller pushed a commit that referenced this pull request Apr 3, 2022
@copybara-service copybara-service bot added the imported Specifies if the PR has been imported to the internal repository label Apr 4, 2022
markdroth added a commit to markdroth/grpc that referenced this pull request Apr 12, 2022
markdroth added a commit to markdroth/grpc that referenced this pull request Apr 12, 2022
markdroth added a commit that referenced this pull request Apr 12, 2022
* Revert "Revert "Adding is_optional case to RLS (#29259)" (#29299)"

This reverts commit a6419dd.

* change test to check the full error message
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bloat/improvement imported Specifies if the PR has been imported to the internal repository lang/c++ lang/core per-call-memory/neutral release notes: no Indicates if PR should not be in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants