Can you give better examples of automagical spout/bolt wiring? #7

Open
ghost opened this issue Mar 24, 2014 · 5 comments

@ghost

ghost commented Mar 24, 2014

How do you wire multiple bolts to a single spout? Or wire a bolt to the output of multiple bolts? It's pretty easy programmatically, but I'm not sure how to do it with the Spring context file. The example in the kickstarter project is too simple, and it's ambiguous.

The README says: "The SpringSpout and SpringBolt classes are configured with a Spring bean and a method signature. The compiler automagically orders the processing steps based on the field names."

But in the kickstarter example, the Greeter and Marker bolts both take 'number' as their only argument. So it's not clear whether both are wired to the spout for input or whether Marker is wired to the output of Greeter. It was clear in the programmatic example.

Also, why was the programmatic config removed from the kickstarter example?

@pascaldekloe pascaldekloe self-assigned this Mar 24, 2014
@pascaldekloe
Contributor

Breeze does the wiring for you. Just configure the requirements per step and it should just work.

> How do you wire multiple bolts to a single spout?

With the following configuration, both bolts read their input from the spout.

<spout ... outputFields="a"/>
<bolt ... signature="f(a)"/>
<bolt ... signature="g(a)"/>

> Or wire a bolt to the output of multiple bolts?

With the following configuration, b3 reads the output from b1 and b2.

<bolt id="b1" ... outputFields="x"/>
<bolt id="b2" ... outputFields="y"/>
<bolt id="b3" ... signature="f(x, y)"/>

Behind the scenes, the processing steps currently run sequentially, since we focus mostly on throughput rather than latency. In case of ambiguity, the compiler keeps the order as listed in the topology XML.

[starter-demo] INFO eu.icolumbo.breeze.build.TopologyCompilation - Compiled as: {[spout 'feed']=[[bolt 'greet'], [bolt 'mark'], [bolt 'register']]}
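
The ordering rule described above (a step runs once its input fields are available, and ties keep the listed order) can be illustrated with a minimal sketch. This is not Breeze's actual compiler code: the `StepOrderSketch` class, its `Step` type, and the `order` method are hypothetical, and the input fields assumed for `register` are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; NOT Breeze's TopologyCompilation implementation.
final class StepOrderSketch {

    // A processing step: the fields it reads and the fields it emits.
    static final class Step {
        final String id;
        final List<String> inputs;
        final List<String> outputs;

        Step(String id, List<String> inputs, List<String> outputs) {
            this.id = id;
            this.inputs = inputs;
            this.outputs = outputs;
        }

        @Override
        public String toString() {
            return id;
        }
    }

    // Orders the steps sequentially. On each pass the first listed step whose
    // inputs are all available is taken, so ambiguous cases keep the listed order.
    static List<String> order(List<String> spoutFields, List<Step> listed) {
        Set<String> available = new HashSet<>(spoutFields);
        List<Step> pending = new ArrayList<>(listed);
        List<String> ordered = new ArrayList<>();
        while (!pending.isEmpty()) {
            Step next = null;
            for (Step candidate : pending) {
                if (available.containsAll(candidate.inputs)) {
                    next = candidate;
                    break;
                }
            }
            if (next == null) {
                throw new IllegalStateException("Unresolved input fields for: " + pending);
            }
            pending.remove(next);
            available.addAll(next.outputs);
            ordered.add(next.id);
        }
        return ordered;
    }

    public static void main(String[] args) {
        // The inputs of 'register' are an assumption for illustration only.
        List<Step> bolts = List.of(
            new Step("register", List.of("heading", "judge"), List.of()),
            new Step("greet", List.of("number"), List.of("heading")),
            new Step("mark", List.of("number"), List.of("judge", "isOdd")));
        // prints [greet, mark, register]
        System.out.println(order(List.of("number"), bolts));
    }
}
```

Here `greet` and `mark` are both ready as soon as the spout emits `number`, so they keep their listed order, while `register` has to wait for the fields they produce.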

For parallel execution, one option would be to split the stream automatically where possible. With something like "depends-on", the user could enforce a processing order when needed.
Apache Camel has splitters and aggregators, for example.
Do you have any ideas or alternatives for how this functionality should look?

The programmatic configuration example confused people. A kickstarter should demonstrate the ease of use. However, we could demonstrate more complicated flows indeed.

@ghost
Author

ghost commented Mar 24, 2014

The kickstarter example doesn't match your example.

Here is your example:

<spout ... outputFields="a"/>
<bolt ... signature="f(a)"/>
<bolt ... signature="g(a)"/>

In this case, both bolts would have the spout as input.

Here is the Spring config in the kickstarter:

<breeze:spout id="feed" beanType="java.util.Random" signature="nextLong()" outputFields="number">
<breeze:transaction ack="setSeed(number)"/>
</breeze:spout>
<breeze:bolt id="greet" beanType="com.example.Greeter" signature="greet(number)" outputFields="heading"/>
<breeze:bolt id="mark" beanType="com.example.Marker" signature="mark(number)" outputFields="judge isOdd"/>

Based on your example above, both the 'greet' and the 'mark' bolts would have the 'feed' spout as their only input.

The kickstarter programmatic config:

SpringBolt greet = new SpringBolt(Greeter.class, "greet(number)", "heading");
greet.setPassThroughFields("number");
builder.setBolt("greet", greet).noneGrouping("feed");

SpringBolt mark = new SpringBolt(Marker.class, "mark(number)", "source", "isEven");
mark.setPassThroughFields("heading");
builder.setBolt("mark", mark).noneGrouping("greet");

This shows the 'mark' bolt with only the 'greet' bolt as input.

That's why I say it's ambiguous and confusing.

I think the Spring config should have an option for explicit grouping.

For example, a bolt could take a child element per grouping:

<breeze:bolt id="greet" beanType="com.example.Greeter" signature="greet(number)" outputFields="heading">
   <breeze:grouping source="feed" type="shuffle"/>
   <breeze:grouping source="someOtherInput" type="none"/>
</breeze:bolt>

@pascaldekloe
Contributor

Effectively, bolt mark and bolt greet both use spout feed as their input. However, with true parallel execution (a channel split), latency could be improved and the traffic between steps (tuple fields) reduced.

The split functionality should be easy to implement. Aggregation is a bit more tricky with high volumes. I want to prevent the reference bloat from Storm. For example Breeze can detect that mark and greet may run in parallel. It is also clear that the results need to be aggregated for register. On the other hand not everybody wants the aggregation overhead either.
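
How such detection could work can be sketched by grouping steps into "waves": every step whose input fields are available at the same moment lands in the same wave and is a candidate for parallel execution. This is only an illustration under assumptions, not Breeze code: `ParallelWavesSketch`, `Step`, and `waves` are hypothetical names, and the inputs given to `register` are assumed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; NOT part of Breeze.
final class ParallelWavesSketch {

    // A processing step: the fields it reads and the fields it emits.
    static final class Step {
        final String id;
        final List<String> inputs;
        final List<String> outputs;

        Step(String id, List<String> inputs, List<String> outputs) {
            this.id = id;
            this.inputs = inputs;
            this.outputs = outputs;
        }

        @Override
        public String toString() {
            return id;
        }
    }

    // Groups steps into waves. Steps within one wave do not depend on each
    // other's output, so they could run in parallel; each wave depends on
    // fields produced by the spout or by earlier waves.
    static List<List<String>> waves(List<String> spoutFields, List<Step> listed) {
        Set<String> available = new HashSet<>(spoutFields);
        List<Step> pending = new ArrayList<>(listed);
        List<List<String>> result = new ArrayList<>();
        while (!pending.isEmpty()) {
            // Collect every pending step that can run with the current fields.
            List<Step> ready = new ArrayList<>();
            for (Step step : pending) {
                if (available.containsAll(step.inputs)) {
                    ready.add(step);
                }
            }
            if (ready.isEmpty()) {
                throw new IllegalStateException("Unresolved input fields for: " + pending);
            }
            // Only after the wave is fixed do its outputs become available.
            List<String> ids = new ArrayList<>();
            for (Step step : ready) {
                ids.add(step.id);
                available.addAll(step.outputs);
            }
            pending.removeAll(ready);
            result.add(ids);
        }
        return result;
    }

    public static void main(String[] args) {
        // The inputs of 'register' are an assumption for illustration only.
        List<Step> bolts = List.of(
            new Step("greet", List.of("number"), List.of("heading")),
            new Step("mark", List.of("number"), List.of("judge", "isOdd")),
            new Step("register", List.of("heading", "judge"), List.of()));
        // prints [[greet, mark], [register]]
        System.out.println(waves(List.of("number"), bolts));
    }
}
```

A wave boundary is also where aggregation would be needed: `register` can only start once the results of both `greet` and `mark` have been joined.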

How about the following?

<spout ... outputFields="x"/>
<split>
    <pipe>
        <bolt ... signature="f(x)" outputFields="a"/>
        <bolt ... signature="f(a)" outputFields="z"/>
    </pipe>
    <pipe>
        <bolt ... signature="f(x)" outputFields="z"/>
    </pipe>
</split>
<bolt ... signature="f(z)"/>

@jethrobakker ?

@ghost
Author

ghost commented Mar 24, 2014

> Effectively bolt mark and bolt greet use spout feed as their input.

So the programmatic example in the kickstarter didn't match the intended functionality of the kickstarter's Spring config version?

@politie
Collaborator

politie commented Mar 24, 2014

The programmatic example did match the intended functionality of the spring config. Three notes:

  1. The XML config has changed over time due to new features in Breeze, but the programmatic example did not change.
  2. The programmatic example was confusing for users of the framework, so we removed it.
  3. We have not used the grouping option yet, but we understand the need for it. You are welcome to create a pull request for such an option.

@ghost ghost unassigned pascaldekloe Jul 17, 2014