follow up performance problems of large EUGW process graph #429

@soxofaan

Follow-up on performance problems/opportunities of the large process graph use case from #426/#427.

For example, this is a profiling dump of running the test test_very_large_graph:

def test_very_large_graph(dry_run_env, dry_run_tracer):
    pg = load_json("pg/1.0/large_eugw_graph.json")["process_graph"]
    save_result = evaluate(pg, env=dry_run_env)

(stripped the entries of pytest and other generic runners)


         416589268 function calls (307657453 primitive calls) in 153.792 seconds

   Ordered by: cumulative time
   List reduced from 9588 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  155.456  155.456 __main__.py:1(<module>)
...
        1    0.000    0.000  153.603  153.603 test_dry_run.py:2955(test_very_large_graph)
        1    0.001    0.001  153.602  153.602 ProcessGraphDeserializer.py:437(evaluate)
7422036/2    6.592    0.000  126.161   63.080 ProcessGraphDeserializer.py:544(convert_node)
1248656/2    6.244    0.000  126.161   63.080 ProcessGraphDeserializer.py:1916(apply_process)
1179216/2    1.779    0.000  126.160   63.080 ProcessGraphDeserializer.py:1943(<dictcomp>)
 730618/2    0.816    0.000  126.160   63.080 ProcessGraphDeserializer.py:603(<listcomp>)
   133952    0.753    0.000   27.443    0.000 ProcessGraphDeserializer.py:839(load_collection)
        1    0.547    0.547   26.902   26.902 dry_run.py:395(get_source_constraints)
  1395978    1.508    0.000   21.798    0.000 dry_run.py:550(_process)
  1395978    0.671    0.000   18.940    0.000 dry_run.py:273(process_traces)
  1395978    5.031    0.000   18.269    0.000 dry_run.py:275(<listcomp>)
   133952    0.945    0.000   17.803    0.000 dry_run.py:277(load_collection)
  4994625    2.735    0.000   17.146    0.000 version.py:45(__init__)
  1248656    0.715    0.000   16.699    0.000 dummy_backend.py:702(get_process_registry)
  1248656    0.944    0.000   15.984    0.000 ProcessGraphDeserializer.py:331(get_process_registry)
  1248656    0.499    0.000   14.607    0.000 version.py:115(at_least)
  1248656    0.672    0.000   14.108    0.000 version.py:95(__ge__)
  2497313    9.385    0.000   13.771    0.000 version.py:55(_parse)
  1248656    3.019    0.000   13.436    0.000 version.py:80(_pad)
  8084052    5.671    0.000   11.379    0.000 dry_run.py:216(__init__)
  1248656    0.938    0.000    9.485    0.000 utils.py:106(openeo_api_version)
  1248656    0.391    0.000    9.426    0.000 utils.py:80(collect_parameters)
4994624/1248656    3.321    0.000    9.035    0.000 utils.py:64(collect)
44604776/1406496    8.675    0.000    8.811    0.000 dry_run.py:226(get_arguments_by_operation)
  6516114    1.921    0.000    8.342    0.000 utils.py:53(get)
        1    0.000    0.000    8.047    8.047 dry_run.py:369(get_trace_leaves)
        1    0.684    0.684    8.047    8.047 dry_run.py:386(<listcomp>)
12260030/4175978    4.601    0.000    7.363    0.000 dry_run.py:376(get_leaves)
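
A listing in this format (sorted by cumulative time, limited to 50 entries) can be reproduced with the standard library's cProfile and pstats modules. The sketch below is only an illustration: how the dump above was actually produced is not stated here, and `profile_call` and `fib` are hypothetical helpers, not part of the code base.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, limit=50, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" gives the "Ordered by: cumulative time" listing;
    # the limit restricts the output to the top entries.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, buffer.getvalue()


def fib(n):
    # Deliberately naive recursion, so the report shows a
    # "total/primitive" call count in the ncalls column.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, report = profile_call(fib, 15)
print(report)
```

In the ncalls column, a value such as 7422036/2 means 7422036 calls in total, of which 2 were primitive (non-recursive) calls.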

Note the extremely high call counts of convert_node and related functions (7422036 calls of convert_node, 1248656 of apply_process).
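
One possible explanation, which is an assumption and not confirmed by the profile, is that nodes referenced by multiple other nodes are re-evaluated for every reference. The toy sketch below shows how call counts explode in that situation and how a per-evaluation cache keeps them linear. `evaluate_node`, `build_diamond_chain` and the graph layout are hypothetical and do not reflect the actual ProcessGraphDeserializer code.

```python
def evaluate_node(graph, node_id, cache=None, counter=None):
    """Toy evaluator: a node's result is its value plus the results of its inputs."""
    if cache is not None and node_id in cache:
        return cache[node_id]
    if counter is not None:
        # Count only real evaluations, not cache hits.
        counter[node_id] = counter.get(node_id, 0) + 1
    node = graph[node_id]
    result = node["value"] + sum(
        evaluate_node(graph, dep, cache, counter) for dep in node["from_nodes"]
    )
    if cache is not None:
        cache[node_id] = result
    return result


def build_diamond_chain(depth):
    """Hypothetical graph in which every node references the previous node twice."""
    graph = {"n0": {"value": 1, "from_nodes": []}}
    for i in range(1, depth + 1):
        graph[f"n{i}"] = {"value": 0, "from_nodes": [f"n{i - 1}", f"n{i - 1}"]}
    return graph


graph = build_diamond_chain(10)
naive_calls, cached_calls = {}, {}
naive = evaluate_node(graph, "n10", counter=naive_calls)
cached = evaluate_node(graph, "n10", cache={}, counter=cached_calls)

print(sum(naive_calls.values()))   # 2047 evaluations for 11 nodes
print(sum(cached_calls.values()))  # 11 evaluations, one per node
```

Whether such a cache is safe in the real evaluator depends on whether evaluating a node has side effects or depends on context beyond the node itself, which this sketch ignores.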

The 1248656 calls of get_process_registry (and the related version parsing) also seem excessive.
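
If the same few version strings are parsed again on every call, memoizing the parse step is one option. The sketch below uses functools.lru_cache on a simplified stand-in. `parse_version` and `at_least` here are hypothetical and much simpler than the real version.py (for example, they do no padding of short versions).

```python
from functools import lru_cache

parse_calls = 0


@lru_cache(maxsize=None)
def parse_version(version: str) -> tuple:
    """Hypothetical stand-in for version parsing: "1.2.0" -> (1, 2, 0)."""
    global parse_calls
    parse_calls += 1  # Only incremented on cache misses.
    return tuple(int(part) for part in version.split("."))


def at_least(version: str, minimum: str) -> bool:
    # Both sides are served from the cache after the first call.
    return parse_version(version) >= parse_version(minimum)


results = [at_least("1.2.0", "1.0.0") for _ in range(100_000)]
print(parse_calls)  # 2: each distinct string is parsed once
```

An alternative with the same effect would be to look up the process registry once per evaluation and pass it along, instead of resolving it again in every apply_process call.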
