Format_instance caching leads to wrong data when reading the same file twice #245

Closed
dwpaley opened this issue Oct 22, 2020 · 0 comments

dwpaley commented Oct 22, 2020

This is an odd one.

On branch lazy_HDF5SaclaMPCCD:

from dxtbx.model.experiment_list import ExperimentListFactory

master_h5 = "/net/dials/raid1/dwpaley/dials/build/dials_data/image_examples/SACLA-MPCCD-Phase3-21528-5images.h5"

expts1 = ExperimentListFactory.from_filenames([master_h5])
expts1[0].imageset.get_mask(0)
# Uncommenting the next line works around the problem:
# expts1[0].imageset.reader().nullify_format_instance()
expts2 = ExperimentListFactory.from_filenames([master_h5])

# Compare beam wavelengths pairwise between the two loads of the same file.
for e1, e2 in zip(expts1, expts2):
    print(
        e1.imageset.get_beam().get_wavelength()
        == e2.imageset.get_beam().get_wavelength()
    )

Output:

False
True
True
True
True

In other words: if we construct an experiment list, call get_mask() on the imageset of the first experiment, and then construct a second experiment list from the same file, the first beam in the second experiment list is wrong.

A few notes:

  • This only happens if FormatHDF5SaclaMPCCD inherits from FormatMultiImageLazy (i.e. if the beams are not read when the imageset is constructed).
  • It only happens if we get the mask from the imageset of the first experiment. If we do expts1[1].imageset.get_mask(0), all the beams match.
  • Out of the 13 get_xxx methods in expts1[0].imageset, only get_corrected_data, get_raw_data and get_mask cause this problem.
  • There have been similar problems in the past; see the two places where nullify_format_instance is called in test_HDF5SaclaMPCCD.py.
  • I don't know of a real application where this causes a problem.

For now I'm adding an xfailing test on the aforementioned branch lazy_HDF5SaclaMPCCD.

Thanks @phyy-nx for discussing this at length.

dwpaley added a commit that referenced this issue Oct 22, 2020
ndevenish pushed a commit that referenced this issue Jan 11, 2021
* Add xfailing test of format_instance caching
  See dxtbx issue 245: #245
* Account for lazy change in test_single_file_indices
  (includes now using lazy explicitly to get more test coverage)
* Fix test_single_file_indices again
  libtbx.pytest test_imageset.py::test_single_file_indices was giving a different result than libtbx.pytest test_imageset.py.
  Needed to add nullify_format_instance and then update the test to the right value for a fresh load of the data.

Co-authored-by: Aaron Brewster <[email protected]>
rjgildea added a commit that referenced this issue Jan 28, 2021