
Tools for parallel slurm processing across positions #141

Closed
talonchandler opened this issue Jun 2, 2023 · 10 comments

@talonchandler
Contributor

The mantis project depends on iohub in part because the zarr format makes it possible to parallelize processing on SLURM. This issue is a request to clarify several behaviors of iohub and to document best practices for parallel processing of HCS stores.

Question 1: When is plate-level and position-level metadata written?

When multiple jobs try to write metadata to a .zarr store at the same time, race conditions can occur. @edyoshikun and I have been struggling to find a way to use iohub that avoids these race conditions.

  • Can we create an empty hcs zarr in advance, close the store, then write to it later? Has this been tested?
  • When are you allowed to write to an existing .zarr, and when are you not? Is there any difference between HCS and single-position .zarrs? How can we avoid errors like zarr.errors.ContainsGroupError: path '0/27/0' contains a group?
  • Why are channel_names supplied to open_ome_zarr? For example, is the following expected behavior?
Python 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:39:03) 
[GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from iohub import open_ome_zarr
>>> dataset = open_ome_zarr('./test.zarr', layout="hcs", mode="a", channel_names=["1", "2"])
>>> dataset.create_position("0","0","0")
<iohub.ngff.Position object at 0x7f70d4a61850>
>>> dataset.close()
>>> quit()
(mantis-dev) [talon.chandler@gpu-sm01-10] ~/iohub
19:20:42 $ iohub info test.zarr/
Reading file:	 /home/talon.chandler/iohub/test.zarr
WARNING:root:Zarr group at 0/0/0 does not have valid metadata for <class 'iohub.ngff.Position'>
WARNING:root:Cannot determine channel names: Invalid metadata at the first position
WARNING:root:Zarr group at 0/0/0 does not have valid metadata for <class 'iohub.ngff.Position'>
Traceback (most recent call last):
  File "/home/talon.chandler/.conda/envs/mantis-dev/bin/iohub", line 8, in <module>
    sys.exit(cli())
  File "/home/talon.chandler/.conda/envs/mantis-dev/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/talon.chandler/.conda/envs/mantis-dev/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/talon.chandler/.conda/envs/mantis-dev/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/talon.chandler/.conda/envs/mantis-dev/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/talon.chandler/.conda/envs/mantis-dev/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/talon.chandler/iohub/iohub/cli/cli.py", line 43, in info
    print_info(file, verbose=verbose)
  File "/home/talon.chandler/iohub/iohub/reader.py", line 261, in print_info
    ch_msg = f"Channel names:\t {reader.channel_names}"
  File "/home/talon.chandler/iohub/iohub/ngff.py", line 138, in channel_names
    return self._channel_names
AttributeError: 'Plate' object has no attribute '_channel_names' 

Question 2: How should we use iohub to read/write to multiple positions at the same time?

  • Do we need to prepopulate a zarr with empty positions? Does the example above exclude this possibility? Can/should iohub provide tools to assist with this?
  • How can we avoid race conditions when using SLURM? Has anything been successfully tested without race conditions and complete+correct metadata?
@JoOkuma
Member

JoOkuma commented Jun 2, 2023

Hi @talonchandler, I'm still unfamiliar with iohub's zarr backend, so my comments below are regarding a default zarr array and SLURM.

zarr is thread-safe, but once you use multiprocessing or go beyond a single machine (different nodes), you're on your own and have to take care of race conditions yourself.

When you're writing distinct chunks there shouldn't be any issues (metadata is a different story, but most workflows don't need to update it more than once). The only problem that comes to mind is that zarr's NestedDirectoryStore may create directories only when the first chunks are allocated, leading to a race condition even with disjoint chunks.
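
For illustration, a minimal sketch of the disjoint-chunk pattern with a plain zarr array (not iohub-specific). It assumes the array was created once before the jobs were submitted, that it is chunked with one timepoint per chunk along axis 0, and that the store path is a placeholder:

    import os

    import numpy as np
    import zarr

    # Each SLURM array task writes only its own pre-allocated slice; the array
    # ("scratch_array.zarr" is a placeholder) must already exist on disk.
    task_index = int(os.environ.get("SLURM_ARRAY_TASK_ID", 0))

    # Open the existing store without recreating it or touching its metadata.
    arr = zarr.open("scratch_array.zarr", mode="r+")

    # Write a chunk-aligned, disjoint region: no two tasks touch the same chunk
    # (assuming one timepoint per chunk along axis 0).
    arr[task_index] = np.random.rand(*arr.shape[1:]).astype(arr.dtype)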

Do we need to prepopulate a zarr with empty positions?
Does the example above exclude this possibility? Can/should iohub provide tools to assist with this?

This hasn't been an issue for me so far when using FOV datasets, but NestedDirectoryStore might be problematic (see above). I'm not using HCS, so I can't comment on the other questions.

How can we avoid race conditions when using SLURM?

You could use SLURM job dependencies to avoid race conditions: have a task A create the array and metadata, and have additional tasks B, which process and save the data (assuming disjoint chunks), depend on A finishing.
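
For illustration, a minimal sketch of that dependency pattern, submitting the jobs from Python with subprocess; create_plate.py, process_position.py, and the position list are placeholders:

    import subprocess

    def sbatch(*args: str) -> str:
        """Submit a job and return its ID (sbatch --parsable prints just the ID)."""
        out = subprocess.run(
            ["sbatch", "--parsable", *args],
            check=True, capture_output=True, text=True,
        )
        return out.stdout.strip().split(";")[0]

    # Task A: create the empty store and all metadata in a single process.
    create_id = sbatch("--wrap", "python create_plate.py")

    # Tasks B: one per position, started only after A finishes successfully.
    for position in ["A/1/0", "A/1/1", "B/1/0"]:
        sbatch(
            f"--dependency=afterok:{create_id}",
            "--wrap", f"python process_position.py {position}",
        )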

Using dask or MPI are other options, but I think they add much more complexity.

Has anything been successfully tested without race conditions and complete+correct metadata?

I have been testing and processing our datasets using our library slurmkit. No issues related to race conditions so far, but I always create the dataset/metadata before submitting the jobs, using the same Python script. Here is an example where I compute a registration model, apply it when fusing the data, and estimate a flow field from the fused volume.

@edyoshikun
Contributor

Adding this for reference; it's basically what Jordao suggested: recOrder Slurm Scripts

@ziw-liu
Collaborator

ziw-liu commented Jun 2, 2023

  • Why are channel_names supplied to open_ome_zarr? For example, is the following expected behavior?

See the documentation. This is expected, as no channel metadata is written. Here you would need to create image arrays to generate position metadata (which needs information about the names, transformations, etc. of the arrays in this position).
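
For illustration, a minimal sketch of that point (the fresh store path is arbitrary, and the create_image call is an assumption based on iohub's examples): once an image array is written into the position, the channel metadata shows up.

    import numpy as np
    from iohub import open_ome_zarr

    # Same pattern as the transcript above, but with an image written into the
    # position so that channel/multiscales metadata is actually generated.
    with open_ome_zarr(
        "./test2.zarr", layout="hcs", mode="w-", channel_names=["1", "2"]
    ) as dataset:
        position = dataset.create_position("0", "0", "0")
        # 5D TCZYX array; the second axis matches the two channel names.
        position.create_image("0", np.zeros((1, 2, 4, 32, 32), dtype=np.uint16))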

@ziw-liu
Collaborator

ziw-liu commented Jun 2, 2023

Providing channel names to open_ome_zarr for both single-FOV and HCS stores is a deviation from the NGFF spec (#18). In theory each FOV can have different channels, and when reading an HCS store iohub just assumes that they are all the same and gets the metadata from an arbitrary FOV.

@ziw-liu
Collaborator

ziw-liu commented Jun 2, 2023

The ContainsGroupError will occur in this kind of scenario:

  1. 'this' process opens the store and it contains one well (e.g. A/1)
  2. another process in the batch job creates well B/1 (thus creating row B)
  3. 'this' process does not know that now row B exists
  4. 'this' process needs to create well B/2, and decides to create the existing row B again

@ziw-liu
Collaborator

ziw-liu commented Jun 3, 2023

Another approach I can think of is to offer a utility to 'guess' HCS metadata from an existing directory structure. The workflow would look like this:

  1. use shell or python scripts in a single process to make some empty directories that have a plate structure down to the well level (plate.zarr/row/well/); a sketch follows after this list.
  2. write individual FOVs in parallel, each process only sees a single FOV so there's no IPC needed.
    fov = open_ome_zarr("plate.zarr/row/well/fov", layout="fov", mode="w-", ...)
    fov['0'] = ...
  3. use this utility to infer .zattrs and .zgroup metadata files from what exists in the plate.zarr/ directory.
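
For illustration, a minimal sketch of step 1 (the row and well names are just examples):

    from pathlib import Path

    # Pre-create the plate/row/well directory skeleton in a single process,
    # before any parallel FOV-writing jobs are submitted.
    plate = Path("plate.zarr")
    for row in ["A", "B"]:
        for well in ["1", "2", "3"]:
            (plate / row / well).mkdir(parents=True, exist_ok=True)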

@talonchandler
Contributor Author

talonchandler commented Jun 3, 2023

Thanks for the careful and helpful answers @JoOkuma + @ziw-liu, and thanks for the in-person discussion today @edyoshikun + @mattersoflight.

@ziw-liu I think the utility you're describing would be very useful, and my attempts to create such a utility have led me to error messages that are difficult for me to understand.

Can I suggest an example use case that will demonstrate SLURM processing of an hcs iohub .zarr?

I would suggest an example like the following:

  • Step 1: Create an HCS zarr store named ones.zarr, filled with ones, with 2 rows, 3 columns, 4 FOVs per well, 5 times, 6 channels, 7 z, 8 y, 9 x. This can be done ahead of time with a small script (a sketch follows after this list).
  • Step 2: Run a single script that:
    a) uses an iohub-provided utility to "mimic" the plate format of ones.zarr into a new target.zarr that is empty except for the plate-level metadata. I'm imagining something like iohub mimic ones.zarr -o target.zarr, but I'm open to other names or a python call.
    b) uses SLURM to perform the following computation parallelized across positions (i.e. where each position is computed on a single node): the target.zarr should be filled with entries that are the product of their indices (e.g. the zeroth row/col/fov/etc is filled with zeros, e.g. the entry at 1/2/3[4,5,6,7,8] = 40320). The target.zarr should share its plate metadata with ones.zarr, but target.zarr should have 5 times, 6 channels, 7 z, 8 y, and 10 x to demonstrate iohub's ability to write to an array with a different size.
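
For Step 1, a minimal sketch of how ones.zarr could be created ahead of time (the channel names, dtype, and create_image call are assumptions based on iohub's examples):

    import numpy as np
    from iohub import open_ome_zarr

    rows, cols, fovs = ["A", "B"], ["1", "2", "3"], ["0", "1", "2", "3"]
    shape = (5, 6, 7, 8, 9)  # (T, C, Z, Y, X)

    # Create every position up front, in a single process, and fill it with ones.
    with open_ome_zarr(
        "ones.zarr", layout="hcs", mode="w-",
        channel_names=[f"ch{i}" for i in range(6)],
    ) as dataset:
        for row in rows:
            for col in cols:
                for fov in fovs:
                    position = dataset.create_position(row, col, fov)
                    position.create_image("0", np.ones(shape, dtype=np.float32))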

I'm more than happy to collaborate on this, but I would really appreciate your leadership @ziw-liu because I've been struggling to get this to work in the way you're describing.

@ziw-liu
Collaborator

ziw-liu commented Jun 3, 2023

Another approach I can think of is to offer a utility to 'guess' HCS metadata from an existing directory structure. The workflow would look like this:

  1. use shell or python scripts in a single process to make some empty directories that have a plate structure down to the well level (plate.zarr/row/well/).
  2. write individual FOVs in parallel, each process only sees a single FOV so there's no IPC needed.
    fov = open_ome_zarr("plate.zarr/row/well/fov", layout="fov", mode="w-", ...)
    fov['0'] = ...
  3. use this utility to infer .zattrs and .zgroup metadata files from what exists in the plate.zarr/ directory.

After some thought, this utility would have broader use if it did not require a certain directory structure beforehand. I can see how combining existing HCS stores or an arbitrary collection of FOVs could be useful, e.g. for pooling datasets for ML tasks.

ziw-liu added the enhancement and NGFF labels on Jun 6, 2023
@JoOkuma
Member

JoOkuma commented Jun 6, 2023

a) uses an iohub-provided utility to "mimic" the plate format of ones.zarr into a new target.zarr that is empty except for the plate-level metadata. I'm imagining something like iohub mimic ones.zarr -o target.zarr, but I'm open to other names or a python call.

I would avoid creating a command just to do this. I think the API should "just work" and avoid these problems.

In dexp, we took a CLI-first approach, and I'm not happy with the results: it leads to complex command-line scripts or duplication of commands whenever there's a new use case, which a better API could solve.

@talonchandler
Contributor Author

This can be closed from my perspective. @edyoshikun has a working solution in the mantis repo, and gathering FOVs into a plate would be a very useful feature.

Thanks @ziw-liu.

ziw-liu closed this as completed on Jul 25, 2023