Format_instance caching leads to wrong data when reading the same file twice #245

dwpaley · 2020-10-22T01:34:51Z

This is an odd one.

On branch lazy_HDF5SaclaMPCCD:

from dxtbx.model.experiment_list import ExperimentListFactory

master_h5 = "/net/dials/raid1/dwpaley/dials/build/dials_data/image_examples/SACLA-MPCCD-Phase3-21528-5images.h5"

# First load, then touch image data on the first experiment.
expts1 = ExperimentListFactory.from_filenames([master_h5])
expts1[0].imageset.get_mask(0)
# expts1[0].imageset.reader().nullify_format_instance()  # this would fix it

# Second load of the same file.
expts2 = ExperimentListFactory.from_filenames([master_h5])

# Compare the beam wavelengths experiment by experiment.
for e1, e2 in zip(expts1, expts2):
    print(
        e1.imageset.get_beam().get_wavelength()
        == e2.imageset.get_beam().get_wavelength()
    )

Output:

False
True
True
True
True

In other words: if we construct an experiment list, call get_mask() on the imageset of the first experiment, and then construct another experiment list from the same file, the first beam in the second experiment list is wrong.
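The same check can be wrapped in a small helper that returns the indices whose wavelengths disagree, which is easier to assert on in a test than a column of booleans. This is only a sketch: `mismatched_beam_indices` and its `load` parameter are names made up here, and `load` is assumed to be any callable that takes a filename and returns an experiment list, for example `lambda p: ExperimentListFactory.from_filenames([p])`.

```python
def mismatched_beam_indices(load, filename):
    """Load `filename` twice and return the indices whose wavelengths differ.

    `load` is a hypothetical parameter: any callable mapping a filename to a
    sequence of experiments, e.g.
    lambda p: ExperimentListFactory.from_filenames([p]).
    """
    expts1 = load(filename)
    # Touch image data on the first experiment before loading again; this is
    # the step that triggers the mismatch in the report above.
    expts1[0].imageset.get_mask(0)
    expts2 = load(filename)

    return [
        i
        for i, (e1, e2) in enumerate(zip(expts1, expts2))
        if e1.imageset.get_beam().get_wavelength()
        != e2.imageset.get_beam().get_wavelength()
    ]
```

For the output shown above this would return `[0]`; with consistent loads it returns an empty list.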

A few notes:

This only happens if FormatHDF5SaclaMPCCD inherits from FormatMultiImageLazy (i.e. if the beams are not read when constructing the imageset).
It only happens if we get the mask from the imageset of the first experiment. If we do expts1[1].imageset.get_mask(0), all the beams match.
Out of the 13 get_xxx methods in expts1[0].imageset, only get_corrected_data, get_raw_data and get_mask cause this problem.
There have been similar problems in the past; see the two places where nullify_format_instance is called in test_HDF5SaclaMPCCD.py.
I don't know of a real application where this causes a problem.
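One way to picture how a second load could see state left behind by the first is a per-filename cache of a stateful reader. The sketch below is a toy model, not dxtbx code: `ToyReader`, `get_instance` and `nullify` are hypothetical names, and it is only an assumption that the real format_instance cache behaves in a similar way. It does mirror the one thing the report establishes, namely that dropping the cached instance between loads avoids the problem.

```python
_instance_cache = {}  # hypothetical cache: filename -> reader instance


class ToyReader:
    """Toy stand-in for a format instance that carries mutable state."""

    def __init__(self, path):
        self.path = path
        self.reads = 0  # state that accumulates as image data is read

    def read(self, index):
        self.reads += 1
        return (self.path, index)


def get_instance(path):
    """Return the cached reader for `path`, creating it on first use."""
    if path not in _instance_cache:
        _instance_cache[path] = ToyReader(path)
    return _instance_cache[path]


def nullify(path):
    """Drop the cached reader so the next load starts from a fresh one."""
    _instance_cache.pop(path, None)


first = get_instance("data.h5")
first.read(0)                      # first load touches image data
second = get_instance("data.h5")   # second "load" reuses the same object
assert second is first and second.reads == 1

nullify("data.h5")
third = get_instance("data.h5")    # fresh object, no carried-over state
assert third is not first and third.reads == 0
```

In this model the second load is not independent of the first unless the cache entry is cleared in between, which is the role the commented-out nullify_format_instance() call plays in the reproduction.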

For now I'm adding an xfailing test on the aforementioned branch lazy_HDF5SaclaMPCCD.

Thanks @phyy-nx for discussing this at length.


See dxtbx issue 245: #245

* Add xfailing test of format_instance caching
  See dxtbx issue 245: #245
* Account for lazy change in test_single_file_indices (includes now using lazy explicitly to get more test coverage)
* Fix test_single_file_indices again
  libtbx.pytest test_imageset.py::test_single_file_indices was giving different result than libtbx.pytest test_imageset.py. Needed to add nullify_format_instance and then update the test to the right value for a fresh load of the data.

Co-authored-by: Aaron Brewster <[email protected]>

Fixes #245

dwpaley added a commit that referenced this issue Oct 22, 2020

Add xfailing test of format_instance caching

a701a40

See dxtbx issue 245: #245

ndevenish mentioned this issue Jan 8, 2021

Make FormatHDF5SaclaMPCCD lazy #227

Merged

ndevenish mentioned this issue Jan 12, 2021

Strip out calls to nullify_format_instance dials/dials#1542

Merged

rjgildea mentioned this issue Jan 28, 2021

Ensure format instance is cached when first read #289

Merged

rjgildea added a commit that referenced this issue Jan 28, 2021

This test is no longer xfailing \o/

08245b0

Fixes #245

rjgildea closed this as completed in 7c1b2bf Jan 29, 2021
