Max number of steps #4635
-
Hi all, I'm creating a batch job in a banking context that recreates accounts' positions. Our first approach was a single batch job that works on chunks of x (~50) accounts and recomputes, for each account, all the positions missing since the last execution. I have identified 2 alternative approaches:
What are your recommendations / thoughts?
Replies: 2 comments
-
Does anyone have an opinion on this question?
-
The only limit to the number of steps is the amount of resources you allocate to the JVM. In my experience, the step_execution table becomes a bottleneck because of the frequent updates to the step executions and their contexts (especially if steps are running in parallel, as with partitioning).

I don't think you need the dynamic-number-of-steps approach (which would leave you with 1500 steps in your job). Have you thought about partitioning the input data set and setting up a reasonable, fixed number of parallel steps? Partitioning works well in most cases and is restartable: in case of failure, only the failed partitions are reprocessed. This should solve your problem, as each worker step would have its own Hibernate session with a subset of the data.
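To make the partitioning suggestion concrete, here is a minimal plain-Java sketch of how an account ID range could be split into a fixed number of contiguous partitions. It deliberately does not use any Spring Batch classes. The names `AccountRangePartitioner`, `IdRange`, `minId`, and `maxId` are hypothetical, and it assumes accounts have numeric, roughly evenly distributed IDs. In a real job, the same logic would typically sit inside an implementation of Spring Batch's `Partitioner` interface, with each range stored in that partition's `ExecutionContext` for the worker step's reader to pick up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: splits an inclusive ID range into at most gridSize contiguous partitions. */
final class AccountRangePartitioner {

    /** Inclusive bounds of one partition (hypothetical helper type). */
    record IdRange(long minId, long maxId) {}

    static Map<String, IdRange> partition(long minId, long maxId, int gridSize) {
        if (gridSize <= 0 || maxId < minId) {
            throw new IllegalArgumentException("gridSize must be > 0 and maxId >= minId");
        }
        long total = maxId - minId + 1;      // number of IDs to distribute
        long base = total / gridSize;        // minimum size of each partition
        long remainder = total % gridSize;   // the first 'remainder' partitions get one extra ID

        Map<String, IdRange> partitions = new LinkedHashMap<>();
        long start = minId;
        // Stop early if there are fewer IDs than partitions, so no empty range is emitted.
        for (int i = 0; i < gridSize && start <= maxId; i++) {
            long size = base + (i < remainder ? 1 : 0);
            long end = start + size - 1;
            partitions.put("partition" + i, new IdRange(start, end));
            start = end + 1;
        }
        return partitions;
    }
}
```

For example, `AccountRangePartitioner.partition(1, 10, 3)` yields the ranges 1 to 4, 5 to 7, and 8 to 10. The number of partitions stays fixed regardless of how many accounts there are, which is what keeps the number of step executions (and the updates to the step_execution table) bounded.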