Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement when finding tasks in a subworkflow #425

Merged
merged 1 commit into from
Jul 3, 2024

Conversation

jbirddog
Copy link
Contributor

@jbirddog jbirddog commented Jul 3, 2024

Before a subworkflow would use an exception based approach to finding a task by id. When profiling an example multi-instance workflow this could result in ~1 million exceptions being thrown just so None could be returned. Fix is to just directly call get on self.tasks. For the example process this improves runtime by about 22% and results in ~3 million less function calls. Timings below are from spiff-arena's backend.

Before:

         19118665 function calls (18030458 primitive calls) in 56.559 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  1801934    2.268    0.000    2.268    0.000 {built-in method builtins.hash}
  1798819    4.427    0.000    6.690    0.000 uuid.py:268(__hash__)
1749267/1749091    3.280    0.000    5.986    0.000 {method 'get' of 'dict' objects}
  1344447    1.769    0.000    1.780    0.000 {built-in method builtins.isinstance}
  1047694    2.238    0.000    2.346    0.000 uuid.py:280(__str__)
  1047398    6.325    0.000   18.235    0.000 workflow.py:142(get_task_from_id)
  1037940    3.476    0.000   22.200    0.000 subworkflow.py:64(get_task_from_id)
  1027950    3.296    0.000    5.244    0.000 exceptions.py:42(__init__)
  1027950    1.939    0.000    1.939    0.000 exceptions.py:24(__init__)

After:

         15938976 function calls (14850242 primitive calls) in 44.234 seconds

   Ordered by: call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
2773162/2771613    5.985    0.000   12.557    0.000 {method 'get' of 'dict' objects}
  1791937    2.241    0.000    2.241    0.000 {built-in method builtins.hash}
  1788829    4.373    0.000    6.677    0.000 uuid.py:268(__hash__)
1341144/1340474    1.770    0.000    1.780    0.000 {built-in method builtins.isinstance}
  1037940    2.586    0.000    9.273    0.000 subworkflow.py:64(get_task_from_id)
   423337    1.129    0.000    1.806    0.000 registry.py:51(clean)
419712/8329    2.792    0.000    8.322    0.001 dictionary.py:87(convert)
419712/8329    1.816    0.000    8.464    0.001 registry.py:39(convert)
   293445    0.370    0.000    0.370    0.000 task.py:139(has_state)

@essweine
Copy link
Contributor

essweine commented Jul 3, 2024

I have concerns about this PR so do not merge yet.

@essweine essweine marked this pull request as draft July 3, 2024 14:11
@essweine
Copy link
Contributor

essweine commented Jul 3, 2024

There was originally a reason for this (I think it had to do with error handling) but I don't think it's relevant at this point.

@essweine essweine marked this pull request as ready for review July 3, 2024 17:34
@essweine essweine merged commit a5dfbb3 into main Jul 3, 2024
5 checks passed
@essweine essweine deleted the some_perf2 branch July 3, 2024 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants