Skip to content

rds: extracting base classes as reusable generic routing DS implementation.#18846

Merged
htuch merged 16 commits intoenvoyproxy:mainfrom
tkovacs-2:generic_rds
Feb 15, 2022
Merged

rds: extracting base classes as reusable generic routing DS implementation.#18846
htuch merged 16 commits intoenvoyproxy:mainfrom
tkovacs-2:generic_rds

Conversation

@tkovacs-2
Copy link
Copy Markdown
Contributor

@tkovacs-2 tkovacs-2 commented Nov 1, 2021

Commit Message:
Extracting base classes from the existing code as reusable generic routing DS implementation.

Additional Description:

  • The current RDS implementation is tied to the HTTP protocol. To make it reusable for other protocols
    base classes containing the generic part are extracted from the existing code.
  • The Config and RouteConfiguration proto classes are opaque for the generic implementation.
  • The location of the base classes are envoy/rds and source/common/rds and the Envoy::Rds namespace.
  • Existing code is reworked to use the base classes and HTTP specific features added back by decorator pattern
    (StaticRouteConfigProviderImpl, RdsRouteConfigProviderImpl, RouteConfigUpdateReceiverImpl)
    and template method pattern (RdsRouteConfigSubscription).
  • One thing which has to be implemented separately for every protocol is the Rds::ConfigTraits and Rds::ProtoTraits interface
    which provides basic information about the otherwise opaque classes.
  • Other is a wrapper around the Rds::RouteConfigProviderManagerImpl which creates the provider
    objects with the proper types.

This PR is split from #17631

Risk Level:
Medium

Testing:

  • Unittest
  • Manual test with static routing from bootstrap config, static routing and dynamic routing from GRPC based xDS server.

Docs Changes:

Release Notes:

Platform Specific Features:

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@asraa
Copy link
Copy Markdown
Contributor

asraa commented Nov 1, 2021

@adisuissa do you want to take a first pas over this (see #17631 (comment))?

@asraa asraa requested a review from adisuissa November 1, 2021 15:16
@tkovacs-2
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18846 (comment) was created by @tkovacs-2.

see: more, trace.

@tkovacs-2
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18846 (comment) was created by @tkovacs-2.

see: more, trace.

Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few comments.
This is a non-trivial change, and I'm still looking at the rest of the refactor here.

public:
virtual ~RouteConfigProviderManager() = default;

virtual void eraseStaticProvider(RouteConfigProvider* provider) PURE;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the parameter be const?
When passing a pointer it typically means that it may be null. Is this the case here?

Copy link
Copy Markdown
Contributor Author

@tkovacs-2 tkovacs-2 Nov 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually yes, any value is valid here. The pointer will be used just for lookup in a set.
It could be const but in the original implementation the set contained non-const pointers.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, I'm wondering how much of what the RDS config distributor does is really routing specific? My understanding is we have something that can (1) create immutable shared data structures from proto (2) distribute/share this with worker threads and (2) act as a warming target. @adisuissa what are your thoughts on making that non-routing specific in a later PR if not this one?

Tamas Kovacs added 2 commits November 8, 2021 17:09
Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2
Copy link
Copy Markdown
Contributor Author

I have moved the logic from the protocol specific manager completely to the Rds::RouteConfigProviderManagerImpl.
Now the Envoy::Rds contains every logic and no dependency on the specific route config type.

@rojkov
Copy link
Copy Markdown
Member

rojkov commented Nov 15, 2021

Looks like the main branch has evolved recently and needs to be merged.

/wait-any

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18846 (comment) was created by @tkovacs-2.

see: more, trace.

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Check envoy-presubmit isn't fully completed, but will still attempt retrying.
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18846 (comment) was created by @tkovacs-2.

see: more, trace.

Tamas Kovacs added 2 commits November 16, 2021 22:36
Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2 tkovacs-2 requested a review from adisuissa November 18, 2021 04:01
@tkovacs-2 tkovacs-2 requested a review from htuch January 12, 2022 00:07
@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 13, 2022

Is it ok or the lowercase rds type parameter should be appended automatically?

Probably fine, assuming there is a stat_prefix from each consumer supplied, it should be unique anyway.

Is there still any sense to use absl::optional to store the pointer back to the route config rovider?

Not sure? If it's not used, I'd drop it.

Is it ok, or drop and refer to RouteConfigProviderManagerImpl directly?

If the indirection isn't doing anything functional, I'd skip it.

@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 13, 2022

Thanks so much for the detailed commentary, very helpful. @adisuissa said he would take a look shortly.

I've taken another pass over the implementation (not tests). LGTM,

@envoyproxy/senior-maintainers this is a pretty major refactor of RDS (mostly mechanical). It doesn't have any runtime guarding, which would be hugely messy to implement I think. Any thoughts on whether we should be asking for runtime guarding here?

I'm a little worried about the impact this PR might have

@tkovacs-2
Copy link
Copy Markdown
Contributor Author

tkovacs-2 commented Jan 13, 2022

Is there still any sense to use absl::optional to store the pointer back to the route config rovider?

Not sure? If it's not used, I'd drop it.

I'm asking this because it is little cumbersome to downcast the provider pointer inside reference to absl::optional and provide the Router::RdsRouteConfigSubscription::routeConfigProvider() method. That is why I rather deleted it in the last commit but I don't like even this solution.
As I understand the RdsRouteConfigSubscription and RdsRouteConfigProviderImpl is created or destroyed together. The provider's constructor sets the this into route_config_provider_opt_ of the subscription which is used in onConfigUpdate().
Can it somehow happen the onConfigUpdate is called before the provider sets the this? I think no, but maybe I miss something.

@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 14, 2022

I don't think so, that sounds broken if it is possible.

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2
Copy link
Copy Markdown
Contributor Author

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #18846 (comment) was created by @tkovacs-2.

see: more, trace.

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2
Copy link
Copy Markdown
Contributor Author

@htuch

I'm a little worried about the impact this PR might have

Actually I have a feeling this change will be never merged, however I also understand concerns because the risks.
So how about to split it into two again? One is the new common/rds/ part other is the common/router/ part.
The first has no risk so it could be merged more easily. However introduces some duplication temporarily but at least it will have own unittests. Now it is mostly tested by common/router/ which is a little odd.
Then we could merge the old thrift part, which has low risk and its major part, the interface change, was already accepted.
Finally the common/router/ part.

@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 19, 2022

@tkovacs-2 yep. Let me ping maintainers again. I rather simplify and merge the current PR as is if folks will accept the potential risk on control plane (and potential for rollback). Otherwise we should follow your strategy.

@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 20, 2022

I'm leaning towards merging without flag protection, since I think the only substantive change that might impact logic in a way that would not be obvious during canary and that poses risk here is around onRdsUpdate and it seems correct to me. @adisuissa will take another look as well.

@tkovacs-2 do you agree with this assessment or are there are other parts of the PR you would consider risky?

The other options is to consider increasing test coverage around the areas that seem risky.

I do worry a bit about the dangers of having large amounts of duplicate code flag protected, we know from the unified gRPC mux experience this is really hard to maintain and is not a healthy state for the code base during the period of flag protection.

Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.
I left a few minor comments throughout the code (one thing to note: these comments are on the entire code that I've reviewed, and the underlying code could have been present before you made the refactor).
The main struggle I'm having when reviewing this PR, is that it is unclear what is new and what is moved/refactored. Ideally having a series of small PRs where the focus of each PR is on a single refactor, would be easier to verify and to convince ourselves that the changes won't break things.

All in all it this seems correct to me.

*/
struct RdsStats {
ALL_RDS_STATS(GENERATE_COUNTER_STRUCT, GENERATE_GAUGE_STRUCT)
Rds::StaticRouteConfigProviderImpl base_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is Rds::StaticRouteConfigProviderImpl a composed field (base_) instead of deriving the class from Rds::StaticRouteConfigProviderImpl?

Copy link
Copy Markdown
Contributor Author

@tkovacs-2 tkovacs-2 Jan 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Router::RouteConfigProvider adds couple methods to the Rds::RouteConfigProvider interface.
If I inherit from Rds::StaticRouteConfigProviderImpl those method are lost. However for static provider they are mostly not called, but called for dynamic providers. And static and dynamic providers should implement the same interface.
Inherit form both Rds::StaticRouteConfigProviderImpl and Router::RouteConfigProvider is also possible, but that leads to dimond inheritance and I thought the decorator pattern is better.

@htuch
Copy link
Copy Markdown
Member

htuch commented Jan 21, 2022

The main struggle I'm having when reviewing this PR, is that it is unclear what is new and what is moved/refactored

#18846 (comment) has some explanation of this.

@tkovacs-2
Copy link
Copy Markdown
Contributor Author

tkovacs-2 commented Jan 26, 2022

@htuch

I rather simplify and merge the current PR as is if folks

What simplification do you mean?

are there are other parts of the PR you would consider risky?

I wouldn't say risky but maybe these two parts worth bigger attention:

  • Rds::RouteConfigProviderManager the hadling of init targets.
  • Rds::RdsRouteConfigSubscription::onConfigUpdate and the beforeProviderUpdate/afterProviderUpdate from Router::RdsRouteConfigSubscription

Also one question:
I added Router::RouteConfigUpdateReceiver::protobufConfigurationCast and Router::RouteConfigProvider::configCast methods to the interfaces to provide the downcasted routing config instead the generic opaque one.
But I'm not sure this is the best solution. Maybe some utility functions in the namespace would be better. Or any other idea?

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
@tkovacs-2 tkovacs-2 requested a review from adisuissa January 26, 2022 14:55
Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this refactor!
I've left a few comments/questions about the specific code segments you asked to validate.

@adisuissa
Copy link
Copy Markdown
Contributor

Also one question: I added Router::RouteConfigUpdateReceiver::protobufConfigurationCast and Router::RouteConfigProvider::configCast methods to the interfaces to provide the downcasted routing config instead the generic opaque one. But I'm not sure this is the best solution. Maybe some utility functions in the namespace would be better. Or any other idea?

IMHO using template method in this case (if I understand correctly) makes sense. If we use a utility function, then I'm assuming that it will have some kind of if/switch statement, which is less favorable (e.g., if new types (if any) will need to be added). Another option is to use templated methods, but then it would look very similar to the static_cast call.

@adisuissa
Copy link
Copy Markdown
Contributor

/wait

Signed-off-by: Tamas Kovacs <tamas.2.kovacs@nokia-sbell.com>
Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the work on this!
LGTM, modulo design question for @htuch here

@tkovacs-2
Copy link
Copy Markdown
Contributor Author

tkovacs-2 commented Feb 8, 2022

IMHO using template method in this case (if I understand correctly) makes sense. If we use a utility function, then I'm assuming that it will have some kind of if/switch statement, which is less favorable (e.g., if new types (if any) will need to be added). Another option is to use templated methods, but then it would look very similar to the static_cast call.

For example there is this call from vhds.cc: config_update_info_->protobufConfiguration().vhds().config_source()
Previously the Router::RouteConfigUpdateReceiver::protobufConfiguration gave back envoy::config::route::v3::RouteConfiguration but now just Protobuf::Message. The underlying object is still the same, but a static cast is needed to get back the full access.
For this I intoduced the Router::RouteConfigUpdateReceiver::protobufConfigurationCast but it has drawbacks, like:

  • Additional changes required for example in the mocks because the new method in the interface.
  • Such doesn't really belong to the interface.

I meant with the utility function the above call would look like this:
cast(config_update_info_->protobufConfiguration()).vhds().config_source()
The cast() would be defined in the Router namespace with the same signature and body like protobufConfigurationCast now.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo the one comment I left.

@htuch htuch merged commit e727e1f into envoyproxy:main Feb 15, 2022
@htuch
Copy link
Copy Markdown
Member

htuch commented Feb 15, 2022

@tkovacs-2 thanks for your patient and working through the feedback. @adisuissa appreciate the review bandwidth.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants