Description
Expected Behavior
The default behavior of Scatter-Gather is:
- `applySequence` is automatically set to `true` and can be explicitly set to `false` if desired
- the aggregator release-strategy is to wait for all messages (using `sequenceSize`)
- the output processor (payload aggregator of the abstract MessageGroupProcessor) recognizes when all messages contain Collections of the same type and returns a new collection containing all elements of all aggregated collections.
In summary, the default behaviour should cover the simplest use case, requiring zero configuration and zero special knowledge of how Scatter-Gather is implemented internally in Spring Integration.
- I tell the scatterer where to get the data, and I transform the individual responses into a uniform data type.
- I then expect the gatherer to wait for all the data from the scatterer to return (that's why I need applySequence() to be true by default and releaseStrategy() to wait for all messages by default). If the returned data are lists (which I would argue is a common use case), the aggregator should automatically return a new collection containing all the elements of the individual lists.
Any deviation from this standard simple case can be configured as desired, and only that should require special knowledge of the internals of the Spring Integration (or DSL) library.
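The flattening I would expect from the default output processor can be sketched in plain Java. This is only an illustration of the idea, independent of the library: `FlattenSketch` and `flatten` are hypothetical names, not Spring Integration API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class FlattenSketch {

    // Hypothetical helper, not part of Spring Integration: merges the
    // collection payloads of all gathered replies into one flat list.
    static <T> List<T> flatten(Collection<? extends Collection<? extends T>> payloads) {
        List<T> merged = new ArrayList<>();
        for (Collection<? extends T> payload : payloads) {
            merged.addAll(payload);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two recipients replied with lists of the same element type.
        List<List<String>> replies = List.of(
                List.of("a", "b"),
                List.of("c"));
        System.out.println(flatten(replies)); // prints [a, b, c]
    }
}
```

A list of lists would still be available to anyone who wants it by supplying a custom output processor.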
Current Behavior
- If you don't provide `applySequence(true)`, the result is typically an exception at runtime mentioning a missing correlation strategy.
- The default release strategy is to release each message as it arrives (see context for why this is contrary to the EIP definition of the scatter gather pattern). Also, there's no simple way to just tell the releaseStrategy to wait for all messages to arrive. Instead I have to hard-code something like `group -> group.size == 2`, which makes this brittle under refactoring if I add another recipientFlow to the scatterer. At the very least I would expect an option to just tell the aggregator to wait for all messages instead of having to hard-code how many subflows there are.
- When receiving multiple lists of responses from the scatterer, the default aggregator behaviour is to return a list of lists, instead of a single list containing all aggregated elements.
Hence using Scatter-Gather for a simple, intuitive default case (see context below) already requires special knowledge of how it was implemented internally in Spring Integration.
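The difference between the hard-coded release check and the one I am asking for can be sketched in plain Java. This is an illustration only: `Group` is a hypothetical stand-in for a message group that knows how many replies have arrived and how many the scatterer announced via the sequence size. It is not the library's type.

```java
import java.util.function.Predicate;

public class ReleaseSketch {

    // Hypothetical stand-in for a message group (not Spring Integration API).
    static final class Group {
        final int size;          // replies received so far
        final int sequenceSize;  // replies announced by the scatterer

        Group(int size, int sequenceSize) {
            this.size = size;
            this.sequenceSize = sequenceSize;
        }
    }

    // Brittle: must be edited whenever a recipient flow is added or removed.
    static final Predicate<Group> HARD_CODED = group -> group.size == 2;

    // Requested default: release once every announced reply has arrived.
    static final Predicate<Group> WAIT_FOR_ALL =
            group -> group.sequenceSize > 0 && group.size == group.sequenceSize;

    public static void main(String[] args) {
        // A third recipient flow was added, but only two replies are in.
        Group twoOfThree = new Group(2, 3);
        System.out.println(HARD_CODED.test(twoOfThree));        // prints true (released too early)
        System.out.println(WAIT_FOR_ALL.test(twoOfThree));      // prints false
        System.out.println(WAIT_FOR_ALL.test(new Group(3, 3))); // prints true
    }
}
```

The second predicate needs no change when the number of recipient flows changes, which is the property I am missing today.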
Context
According to this EIP definition of Scatter-Gather (https://www.enterpriseintegrationpatterns.com/BroadcastAggregate.html), "The Scatter-Gather routes a request message to the a number of recipients. It then uses an Aggregator to collect the responses and distill them into a single response message."
I want to emphasize "distill them into a single response message" because this is what I would expect scatter gather to do by default, and it is what inspires my suggestions for the default behaviour above.
The reason for the current default behavior was given at https://stackoverflow.com/a/68403033/349169: "The scatterer part of this component is fully based on the Recipient List Router which comes with false for that option by default. So, for consistency and runtime optimization we keep it false in scatter-gather as well".
In reply to "for runtime optimization":
Spring, and especially Spring Boot, are known and loved for being zero-configuration frameworks that reduce tedious boilerplate code and provide useful defaults that can be extended when desired. I would argue that any successful framework should prioritize intuitive default behaviors over premature optimization, keeping the barrier to entry low and leaving technical internals for those occasions when we need to optimize beyond the simple default use cases.
In reply to "for consistency":
According to the EIP quote above, the idea of the scatter gather pattern and in particular the aggregator is to "distill" the gathered messages "into a single response message". As a user of the scatter gather functionality, I would expect that I can use it and it "just works" according to this explanation. All I want to tell it for the default case is 1. go get information here, here and here, and 2. give me back everything you received. The second part shouldn't need any configuration at all in any case where the return data is all of the same type. Just collect the received messages and return them in a single message. End of story. If the goal of a scatter gather pattern is to "distill them into a single response message", the default behaviour of the releaseStrategy should be to wait for all messages instead of releasing each arriving message individually as multiple response messages.
All deviations from this simple intuitive case can be dealt with through additional configuration, such as setting applySequence() to false, setting releaseStrategy() to some other logic, and providing a custom outputProcessor().