Allow dependent steps to start concurrently #2044

Closed
spring-projects-issues opened this issue Mar 22, 2010 · 3 comments

Comments

@spring-projects-issues
Collaborator

Dave Syer opened BATCH-1538 and commented

Allow dependent steps to start concurrently: step1 can be producing data that are needed in step2, e.g. staging records, and step2 can start to process those as soon as they are available without waiting. All that is needed is a protocol for the steps to agree on whether a dependency is finished or in flight. (In the staging case step1 is hardly ever a limiting factor in terms of execution time, but it might still help.)
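One way to picture such a protocol is a shared hand-off object that carries both the staged items and a "finished" flag. The sketch below is only an illustration in plain Java, not framework code: `StagingChannel`, `publish`, `markFinished` and `next` are hypothetical names, and the 50 ms poll interval is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical hand-off between a producing step (step1) and a consuming step (step2).
class StagingChannel<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    // "In flight" while false; set to true only after the last item was published.
    private volatile boolean finished = false;

    void publish(T item) { queue.add(item); }

    void markFinished() { finished = true; }

    // Returns the next item, or null once the producer has finished and the queue is drained.
    T next() {
        try {
            while (true) {
                T item = queue.poll(50, TimeUnit.MILLISECONDS);
                if (item != null) return item;
                if (finished && queue.isEmpty()) return null;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}

class StagingDemo {
    static List<Integer> run() {
        StagingChannel<Integer> channel = new StagingChannel<>();
        // step1: stages five records, then signals that the dependency is finished
        Thread step1 = new Thread(() -> {
            for (int i = 1; i <= 5; i++) channel.publish(i);
            channel.markFinished();
        });
        step1.start();
        // step2: consumes records as soon as they are available
        List<Integer> consumed = new ArrayList<>();
        Integer item;
        while ((item = channel.next()) != null) consumed.add(item);
        return consumed;
    }
}
```

Because the flag is set only after the last `publish`, a consumer that sees `finished` together with an empty queue can safely stop. Persisting this state for restarts is left out of the sketch.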


1 vote, 3 watchers

@spring-projects-issues
Collaborator Author

spring-projects-issues commented Jun 26, 2012

Giovanni Dall'Oglio Risso commented

Excuse me, but this issue seems very similar to #2065.

IMHO you could think of something like "Spring Integration channels" between the two steps:

  • First step
    • when a chunk commits, you push the data onto a queue
      • you can either insert the List written by the ItemWriter or perform some kind of transformation by defining a delegate
  • Second step
    • you use the data coming from the queue in place of the IO-consuming ItemReader
    • you use the real ItemReader only in case of a restart

There are some things to clarify (e.g. how is a restart managed? How are different chunk sizes handled?), but this approach looks feasible.
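As a rough illustration of the chunk-size question, the sketch below has the first step push each committed chunk onto a queue as a List, while the second step flattens the incoming chunks and regroups them to its own chunk size. It is plain Java under stated assumptions: `ChunkHandoff`, `rechunk` and the empty-list end marker are hypothetical, and restart handling is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ChunkHandoff {
    // Hypothetical end-of-stream marker: an empty chunk means "first step is done".
    static final List<String> END = List.of();

    // Second step side: flattens incoming chunks and regroups them to chunkSize.
    static List<List<String>> rechunk(BlockingQueue<List<String>> queue, int chunkSize) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = new ArrayList<>();
        try {
            while (true) {
                List<String> incoming = queue.take();
                if (incoming.isEmpty()) break; // end marker received
                for (String item : incoming) {
                    current.add(item);
                    if (current.size() == chunkSize) {
                        result.add(current);
                        current = new ArrayList<>();
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!current.isEmpty()) result.add(current); // last, possibly smaller chunk
        return result;
    }

    static List<List<String>> demo() {
        BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
        // First step side: commits chunks of size 3 and pushes each one after its commit.
        Thread firstStep = new Thread(() -> {
            queue.add(List.of("a", "b", "c"));
            queue.add(List.of("d", "e", "f"));
            queue.add(List.of("g"));
            queue.add(END);
        });
        firstStep.start();
        // Second step side: works with chunks of size 2.
        return rechunk(queue, 2);
    }
}
```

Here `ChunkHandoff.demo()` regroups the seven items into chunks of two, with a final chunk holding the single remaining item.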

At the moment, to save IO time, we overloaded one step:

  • ItemReader
    • a simple reader
  • ItemProcessor
    • a CompositeItemProcessor with a long list of processors that perform the operations of multiple steps all together
  • ItemWriter
    • a CompositeItemWriter that writes everything (really a lot of things)
      • followed by a list of FilterItemWriter instances that delegate the real IO to other writers

Obviously, this solution helps to save IO time, but it makes things harder to understand and change, which is a bad thing. Moreover, this solution keeps the operations single-threaded and sequential.
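The overloaded-step workaround can be pictured as a chain of delegates applied to each item in turn on a single thread. The sketch below uses plain `java.util.function.Function` instead of the framework's processor types; `CompositeProcessorSketch` and `compose` are hypothetical names, and treating a null result as "filtered out" is an assumption of this sketch.

```java
import java.util.List;
import java.util.function.Function;

class CompositeProcessorSketch {
    // Chains delegates so that each item passes through all of them, in order.
    static <T> Function<T, T> compose(List<Function<T, T>> delegates) {
        return item -> {
            T current = item;
            for (Function<T, T> delegate : delegates) {
                current = delegate.apply(current);
                if (current == null) return null; // assumed to mean "filtered out"
            }
            return current;
        };
    }

    static String demo() {
        // Three operations that could have been three separate steps.
        List<Function<String, String>> delegates = List.of(
                String::trim,
                String::toUpperCase,
                s -> s + "!");
        Function<String, String> composite = compose(delegates);
        return composite.apply("  staged record ");
    }
}
```

Every delegate runs on the same thread for every item, which is why this shape saves IO but cannot overlap the work of the original steps.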

Your solution (a pipeline) is much better: it enables developers to save IO time, design cleaner jobs, and break the processing into different threads (one per chunk).

@spring-projects-issues
Collaborator Author

Dave Syer commented

Yes, I agree this is basically a duplicate.

@spring-projects-issues
Collaborator Author

spring-projects-issues commented Nov 12, 2018

Mahmoud Ben Hassine commented

I implemented a POC here for concurrent steps using a blocking queue as a staging area. It is typically an implementation of the producer/consumer pattern. However, I don't see (yet) how this could be provided as a built-in feature in the framework.

If the POC makes sense, I would add it as a sample to the samples module rather than implement it as a feature in the framework (other than probably adding the BlockingQueueItemReader and BlockingQueueItemWriter to the library of readers/writers).
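To give a feel for what a queue-backed reader and writer pair might look like, here is a minimal plain-Java sketch. `QueueWriter` and `QueueReader` are hypothetical stand-ins, not the framework classes named above. The bounded queue of size 2 and the one-second timeout are arbitrary assumptions, and the reader treats a timeout as end of input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical writer: blocks when the staging queue is full (back-pressure on the producer).
class QueueWriter<T> {
    private final BlockingQueue<T> queue;

    QueueWriter(BlockingQueue<T> queue) { this.queue = queue; }

    void write(List<? extends T> chunk) throws InterruptedException {
        for (T item : chunk) queue.put(item);
    }
}

// Hypothetical reader: returns null when nothing arrives within the timeout.
class QueueReader<T> {
    private final BlockingQueue<T> queue;
    private final long timeoutMillis;

    QueueReader(BlockingQueue<T> queue, long timeoutMillis) {
        this.queue = queue;
        this.timeoutMillis = timeoutMillis;
    }

    T read() throws InterruptedException {
        return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}

class QueueReaderWriterDemo {
    static List<Integer> run() {
        BlockingQueue<Integer> staging = new ArrayBlockingQueue<>(2);
        QueueWriter<Integer> writer = new QueueWriter<>(staging);
        QueueReader<Integer> reader = new QueueReader<>(staging, 1000);
        // Producer step: writes two chunks, blocking whenever the queue is full.
        Thread producer = new Thread(() -> {
            try {
                writer.write(List.of(1, 2, 3));
                writer.write(List.of(4, 5));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        // Consumer step: reads until the reader reports no more input.
        List<Integer> seen = new ArrayList<>();
        try {
            Integer item;
            while ((item = reader.read()) != null) seen.add(item);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

Ending on a timeout keeps the reader simple, but a slow producer could be mistaken for a finished one, so the timeout would need to be chosen with the producer's pace in mind.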

Any thoughts?

fmbenhassine added the in: core label and removed the status: waiting-for-triage label on Oct 11, 2024
fmbenhassine added this to the 5.2.0-M2 milestone on Oct 11, 2024
fmbenhassine changed the title from "Allow dependent steps to start concurrently [BATCH-1538]" to "Allow dependent steps to start concurrently" on Oct 11, 2024
FBibonne pushed a commit to FBibonne/spring-batch that referenced this issue Feb 2, 2025