Performance issues with partition and split [BATCH-2523] #1079

Closed
spring-projects-issues opened this issue Aug 10, 2016 · 3 comments
Labels
related-to: performance, status: declined

Comments

@spring-projects-issues
Collaborator

Damien DALY opened BATCH-2523 and commented

Hi,

I am trying to run Spring Batch with split/partitioned steps to increase batch throughput.

This is a one-shot database migration (from Firebird to PostgreSQL). I don't need to store job/state data, so I use a MapJobRepository and a SimpleJobLauncher. Configuration is done by annotations on static class members. There is only one application and no remote step execution, only local code.

I also created a ThreadPoolTaskExecutor.

I have a main flow that starts three other flows sequentially: -->[flow1]-->[flow2]-->[flow3]--.

Flow1 is a split flow, containing single "classic" steps (reader, processor, writer, chunked) and some "partitioned" steps. Each split takes a new SimpleAsyncTaskExecutor instance.

Each partitioner creates lists of entity ids (Integer[]) to process.
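For illustration only (the reporter's code was never posted in this thread), the idea of cutting a list of entity ids into `Integer[]` chunks can be sketched in plain Java as below. The class `IdPartitioner` and its `split` method are hypothetical names, not part of Spring Batch, and the even chunking strategy is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring Batch and not the reporter's code.
final class IdPartitioner {

    private IdPartitioner() {
    }

    // Splits ids into at most gridSize contiguous chunks of near-equal size.
    static List<Integer[]> split(List<Integer> ids, int gridSize) {
        if (gridSize <= 0) {
            throw new IllegalArgumentException("gridSize must be positive");
        }
        List<Integer[]> partitions = new ArrayList<>();
        if (ids.isEmpty()) {
            return partitions;
        }
        // Ceiling division so that every id lands in exactly one chunk.
        int chunkSize = (ids.size() + gridSize - 1) / gridSize;
        for (int start = 0; start < ids.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, ids.size());
            partitions.add(ids.subList(start, end).toArray(new Integer[0]));
        }
        return partitions;
    }
}
```

In a real partitioned step, each chunk would typically be handed to one child step execution. The number of chunks can be smaller than `gridSize` when there are few ids.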

The TaskExecutor is a singleton of ThreadPoolTaskExecutor.
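As a rough plain-Java analogue of running partitions on one shared pool, the sketch below uses `java.util.concurrent.ExecutorService` in place of Spring's ThreadPoolTaskExecutor. The class `PartitionRunner`, the method `runAll`, and the placeholder "work" (counting ids) are all assumptions made for illustration; this is not how Spring Batch itself schedules child steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: stands in for a master step fanning out to a thread pool.
final class PartitionRunner {

    private PartitionRunner() {
    }

    // Runs one task per partition on poolSize threads; returns the number of ids handled.
    static int runAll(List<Integer[]> partitions, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (Integer[] partition : partitions) {
                // Placeholder work: a real task would read, process and write each entity.
                tasks.add(() -> partition.length);
            }
            int processed = 0;
            // invokeAll blocks until every task has finished, like a master step
            // waiting for all of its child executions.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                processed += result.get();
            }
            return processed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a partition failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

When there are more partitions than pool threads, the extra tasks simply wait in the pool's queue, so the partition count and the pool size are usually tuned together.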

The performance issue seems to be that the master steps spend a long time finalising the child step executions. If I am right, it looks like a serialization/deserialization process is running to retrieve the child steps' status/context.

How can I either change the serializer/deserializer, or bypass serialization entirely?
What would be "good" values for gridSize, thread pool size, etc.?

Thanks.


No further details from BATCH-2523

@spring-projects-issues
Collaborator Author

Roan Brasil Monteiro commented

Do you have your code example?

@spring-projects-issues
Collaborator Author

Mahmoud Ben Hassine commented

GrosDede Can you please provide a sample app to reproduce the case? It is not possible to analyse the potential performance issue without running the actual code.

@spring-projects-issues added the status: waiting-for-triage, status: waiting-for-reporter and type: task labels on Dec 16, 2019
@fmbenhassine
Contributor

fmbenhassine commented Jan 26, 2021

Performance issues with partition and split
I am trying to run Spring Batch with split/partitioned steps [..] I don't need to store job/state data, so I use a MapJobRepository

This is related to the poor performance of the Map-based job repository in a partitioning setup; see point "2. Poor Performance" in #3780. The Map-based job repository has been deprecated for removal, so I'm closing this issue.

@fmbenhassine added the status: declined label and removed the status: waiting-for-reporter, status: waiting-for-triage and type: task labels on Jan 26, 2021