tomsail
Contributor

tomsail commented Oct 8, 2025

For reference: #57

Contributor Author

tomsail commented Oct 8, 2025

The Dask tests probably fail for one or more of these reasons:

  • read_var_in_frame is called on the same file concurrently from multiple threads or processes;
  • xr.map_blocks runs many tasks concurrently, and those tasks share the same self.slf_reader object inside SelafinLazyArray;
  • the reader's internal file handle / buffer is corrupted by simultaneous reads, yielding garbled or truncated arrays.
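
To make the suspected failure mode concrete, here is a toy sketch. It is not the actual SelafinReader code: the file layout (one float64 per frame) and the helper names are assumptions made up for illustration. It replays, deterministically and without threads, the interleaving that can occur when two tasks share a single file handle.

```python
import os
import struct
import tempfile

FRAME_SIZE = 8  # hypothetical toy layout: one little-endian float64 per frame


def write_toy_file(path, n_frames):
    """Write frames 0..n_frames-1, each holding its own index as a float."""
    with open(path, "wb") as f:
        for i in range(n_frames):
            f.write(struct.pack("<d", float(i)))


def seek_to_frame(handle, i):
    """Move the handle's cursor to the start of frame i."""
    handle.seek(i * FRAME_SIZE)


def read_current_frame(handle):
    """Read one frame from wherever the cursor currently is."""
    return struct.unpack("<d", handle.read(FRAME_SIZE))[0]


tmp_dir = tempfile.mkdtemp()
toy_path = os.path.join(tmp_dir, "toy.bin")
write_toy_file(toy_path, n_frames=10)

with open(toy_path, "rb") as shared:
    seek_to_frame(shared, 2)                # task A positions the handle on frame 2
    seek_to_frame(shared, 5)                # task B moves the same handle before A reads
    seen_by_a = read_current_frame(shared)  # A asked for frame 2
    seen_by_b = read_current_frame(shared)  # B asked for frame 5

print("task A wanted 2.0, got", seen_by_a)
print("task B wanted 5.0, got", seen_by_b)
```

In this replay neither task gets the frame it asked for, because the seek and the read are not atomic on a shared handle. Whether this is what actually happens in the failing tests remains to be confirmed.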

I will try it later, but one possible fix would be to open a fresh reader for each call.

Like so:

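import numpy as np
from xarray.backends import BackendArray

# SelafinReader must also be imported from wherever the reader class is defined

class SelafinLazyArray(BackendArray):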
    def __init__(self, slf_reader_or_path, var, dtype, shape):
        # Accept either a reader instance or a path/reader-factory
        self.slf_reader = slf_reader_or_path
        self.var = var
        self.dtype = dtype
        self.shape = shape

    def _open_reader(self):
        # If self.slf_reader is already a path or factory, open a fresh reader
        if isinstance(self.slf_reader, str):
            return SelafinReader(self.slf_reader)   # <-- adapt to your reader constructor
        # If provided a reader instance, try to get its path attribute
        if hasattr(self.slf_reader, "filepath"):
            return SelafinReader(self.slf_reader.filepath)
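        # otherwise, fallback: if 'slf_reader' is a lightweight factory, call it
        if callable(self.slf_reader):
            return self.slf_reader()
        # last resort: reuse the shared instance (no isolation between tasks)
        return self.slf_reader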

    def _raw_indexing_method(self, key):
        ...
        for it, t in enumerate(time_indices):
            t_int = int(t)
            # open a fresh reader for this task
            reader = self._open_reader()
            try:
                temp = np.asarray(reader.read_var_in_frame(t_int, self.var))
            finally:
                # close only a reader opened for this task, never the shared instance
                if reader is not self.slf_reader and hasattr(reader, "close"):
                    try:
                        reader.close()
                    except Exception:
                        pass
            ...
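
For anyone who wants to try the pattern in isolation, here is a self-contained toy version. `ToyReader`, `read_with_fresh_reader`, and the one-float64-per-frame layout are hypothetical stand-ins, not the real SelafinReader API. Each task opens its own reader, reads one frame, and closes it, so no file cursor is shared between threads.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor

FRAME_SIZE = 8  # hypothetical toy layout: one little-endian float64 per frame
N_FRAMES = 50


class ToyReader:
    """Hypothetical stand-in for a reader that owns its own file handle."""

    def __init__(self, filepath):
        self.filepath = filepath
        self._handle = open(filepath, "rb")

    def read_frame(self, i):
        self._handle.seek(i * FRAME_SIZE)
        return struct.unpack("<d", self._handle.read(FRAME_SIZE))[0]

    def close(self):
        self._handle.close()


def read_with_fresh_reader(filepath, i):
    """Open a private reader for this call and always close it afterwards."""
    reader = ToyReader(filepath)
    try:
        return reader.read_frame(i)
    finally:
        reader.close()


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "frames.bin")
with open(demo_path, "wb") as f:
    for i in range(N_FRAMES):
        f.write(struct.pack("<d", float(i)))

# Many threads read concurrently, but each one has its own handle and cursor.
with ThreadPoolExecutor(max_workers=8) as pool:
    values = list(
        pool.map(lambda i: read_with_fresh_reader(demo_path, i), range(N_FRAMES))
    )

print(values[:5])
```

The cost of this design is one open/close per read, which is only acceptable if constructing the reader is cheap.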

@lucduron what do you think?
Is SelafinReader light enough to be opened concurrently for multiple time steps and/or chunks?
In other words, is the object itself lightweight, or does it hold heavy data that could be a problem when handling very large meshes? (I'm thinking of the IPOBO array, for instance.)
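
One way to check this empirically, as a rough sketch: measure how much memory is allocated while a reader is being constructed, using the standard-library `tracemalloc` module. `ToyMeshReader` and `measure_open_cost` are hypothetical names. The toy class just simulates a reader that eagerly loads a per-node array at open time, which may or may not be how SelafinReader behaves.

```python
import tracemalloc


class ToyMeshReader:
    """Hypothetical stand-in that eagerly allocates 4 bytes per mesh node."""

    def __init__(self, n_nodes):
        # Simulates a per-node int32 array being loaded in the constructor.
        self.per_node_data = bytearray(4 * n_nodes)


def measure_open_cost(factory):
    """Return the number of bytes allocated (and still held) by factory()."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    reader = factory()  # keep a reference so the memory is still counted
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del reader
    return after - before


cost = measure_open_cost(lambda: ToyMeshReader(1_000_000))
print(f"opening one toy reader allocates about {cost / 1e6:.1f} MB")
```

Replacing the lambda with a call that opens a real reader on a large mesh would give a first idea of the per-task overhead. Note that `tracemalloc` only sees allocations made through Python's allocators, so memory held by C libraries may not show up.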
