Proposed changes to framework #106

Open
maumueller opened this issue Apr 17, 2023 · 1 comment

@maumueller (Collaborator)

The current workflow to run algorithm X on dataset Y is something like this:

  1. python install.py builds the docker container
  2. python create_dataset.py sets up the datasets
  3. python run.py --dataset Y --algorithm X mounts data/, results/, benchmark/ into the container for X
    • it takes care of parsing the definitions file and checking already-present runs to figure out which runs still need to be carried out.
    • py-docker is used to spawn the container from within the Python process (a rough sketch of this follows the list)
    • results are written to results/
  4. python plot.py / data_export.py / ... to evaluate the results
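
For reference, the py-docker spawn in step 3 amounts to something like the following minimal sketch; the image tag, mount paths, and command are placeholders for illustration, not the actual run.py code:

```python
# Minimal sketch of how run.py spawns a container via the Docker SDK for Python.
# Image tag, mount paths, and the command are placeholders, not the real code.
import os
import docker

client = docker.from_env()
logs = client.containers.run(
    image="billion-scale-benchmark-X",                # built by install.py (placeholder tag)
    command=["--dataset", "Y", "--algorithm", "X"],
    volumes={
        os.path.abspath("data"): {"bind": "/home/app/data", "mode": "ro"},
        os.path.abspath("results"): {"bind": "/home/app/results", "mode": "rw"},
        os.path.abspath("benchmark"): {"bind": "/home/app/benchmark", "mode": "ro"},
    },
    remove=True,
)
print(logs.decode())
```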

Given @harsha-simhadri's and @sourcesync's frustrations and some directions discussed in other meetings, I think we should relax step 3 a bit and allow more flexibility in the container setup. One direction could look like this:

  1. python install.py builds the docker container; participants are expected to override the entry point to point to their own implementation (file algorithms/X/Dockerfile)
  2. python create_dataset.py sets up datasets
  3. A Python/shell script contains the logic to run the container for X (in algorithms/X/run.{py,sh})
    • as arguments, we provide task, dataset, where the results should be written, and some additional parameters
    • we mount data/, results/, and the config file that is used by the implementation (algorithms/X/config.yaml, maybe task specific)
    • The following is done by the implementation in the container:
      a. file I/O in the container, loading/building index
      b. running the experiment and providing timings
      c. writing results in a standard format (as before, results/Y/X/run_identifier.hdf5); a sketch follows this list
  4. python plot.py / data_export.py / ... to evaluate the results
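
To make step 3c concrete, here is a minimal sketch of what writing results in the standard HDF5 layout could look like from inside the container; the dataset and attribute names (neighbors, distances, build_time, search_times) are assumptions for illustration, not a fixed spec:

```python
# Sketch of step 3c: the in-container implementation writes its results as
# results/Y/X/run_identifier.hdf5. Dataset/attribute names are illustrative only.
import os
import h5py

def write_results(path, neighbors, distances, build_time, search_times, params):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with h5py.File(path, "w") as f:
        f.attrs["build_time"] = build_time                    # seconds to load/build the index
        f.attrs["params"] = str(params)                       # run parameters as a string
        f.create_dataset("neighbors", data=neighbors)         # (num_queries, k) neighbor ids
        f.create_dataset("distances", data=distances)         # (num_queries, k) distances
        f.create_dataset("search_times", data=search_times)   # per-batch query timings

# e.g. write_results("results/Y/X/run_identifier.hdf5", ids, dists, 42.0, times, params)
```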

We provide a default run script for inspiration, which would be pretty close to the current setup. Putting all the logic into the container could mean a lot of code duplication, but isolated containers will allow for much easier orchestration.
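
For illustration, such a default algorithms/X/run.py could look roughly like this; the argument names, image tag, and container-side paths are assumptions rather than a finished interface:

```python
# Sketch of a default algorithms/X/run.py: mounts data/, results/, and the
# algorithm's config file, then hands control to the container's entry point.
# Argument names, image tag, and container paths are placeholders.
import argparse
import os
import docker

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--task", required=True)
    parser.add_argument("--dataset", required=True)
    parser.add_argument("--output", default="results")
    args = parser.parse_args()

    client = docker.from_env()
    logs = client.containers.run(
        image="algorithm-x-image",                    # built from algorithms/X/Dockerfile (placeholder tag)
        command=["--task", args.task, "--dataset", args.dataset],
        volumes={
            os.path.abspath("data"): {"bind": "/home/app/data", "mode": "ro"},
            os.path.abspath(args.output): {"bind": "/home/app/results", "mode": "rw"},
            os.path.abspath("algorithms/X/config.yaml"): {"bind": "/home/app/config.yaml", "mode": "ro"},
        },
        remove=True,
    )
    print(logs.decode())

if __name__ == "__main__":
    main()
```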

I can provide a proof-of-concept if this sounds promising.

@harsha-simhadri (Owner)

Martin, this sounds reasonable.

We also need to think about specialized runners for the tasks we have in mind.
