Rethinking of external assets #236

Closed
danielballan opened this issue Jan 13, 2023 · 3 comments · Fixed by #272
@danielballan
Member

Notes from a chat with @coretl @tacaswell @callumforrester @DiamondJoseph @tizayi

The motivation here is to improve the situation for using the same detector in fly scans and step scans, with efficient access to data. Lots more to discuss before we implement these disruptive ideas, so nobody panic. :- )


  • Resource and Datum documents will be deprecated (!).
  • As now, the describe() method continues to indicate which keys are backed by external data. (Perhaps we use a new word for this new scheme.)
  • The read() method will simply omit keys that are backed by external data. The RunBundler can verify that non-external keys are present and external keys (as declared by descriptor) are not present.
  • If there are any external keys, RunBundler calls collect_asset_docs(). This should include a Partition document with some parameters that will be used to associate a slice of underlying data with a (length-1) slice of Events.
  • It should include a Resource2 (gotta name this...) that should have a mimetype and a dict of arbitrary parameters. The only restriction on parameters is that they are JSON-serializable.
  • The Partition will also have index_start, index_stop corresponding to a slice in the underlying storage.
handler_class = registry[mimetype]  # e.g. `image/tiff` or `application/x-nexus-something`
handler = handler_class(**parameters)  # e.g. filename and whatever else goes here
# In contrast, with Datum the call was: handler(**datum_kwargs)
handler[index_start:index_stop]  # slice of the underlying data for this Partition
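
A minimal runnable sketch of the lookup above, under stated assumptions: `InMemoryHandler`, the mimetype `application/x-example-list`, the `values` parameter, and the `load_partition` helper are all hypothetical names made up for illustration, and the documents are plain dicts rather than any agreed schema.

```python
import json


class InMemoryHandler:
    """Hypothetical handler: constructed once per resource, then sliced."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, index):
        # Supports handler[index_start:index_stop]
        return self._values[index]


# Hypothetical registry mapping mimetype -> handler class
registry = {"application/x-example-list": InMemoryHandler}


def load_partition(resource, partition, registry):
    """Look up the handler by mimetype and slice it with the Partition."""
    # The only restriction on parameters: they must be JSON-serializable.
    json.dumps(resource["parameters"])
    handler_class = registry[resource["mimetype"]]
    handler = handler_class(**resource["parameters"])
    return handler[partition["index_start"]:partition["index_stop"]]


resource = {
    "mimetype": "application/x-example-list",
    "parameters": {"values": [10, 11, 12, 13, 14]},
}
partition = {"index_start": 1, "index_stop": 4}
frames = load_partition(resource, partition, registry)
print(frames)
```

The point of the design is that the handler is built from the resource parameters alone, so the per-slice information shrinks to a pair of integers.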
@coretl
Contributor

coretl commented Jan 18, 2023

I sketched out the old and new models. The sketch shows a single event; multiple events would make more use of the slice object.

[Image: Stream model suggestions excalidraw(2)]

Here's a suggestion as to how it could be implemented.

  • StreamDatum
    • no datum_kwargs
    • no stream_name; instead, a pointer to the Descriptor for which it provides some or all of the data_keys
    • data_keys list to show which data_keys of the Descriptor it provides
    • seq_nums is a slice object showing the Event numbers it corresponds to
    • indexes is a slice object passed to the StreamResource handler so it can hand back data and timestamps
  • Event and EventPage
    • external data_keys will be missing from data and timestamps
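
As a rough sketch of the StreamDatum fields listed above, assuming a plain Python dataclass rather than any agreed schema: the `stream_resource` field and the `event_numbers` helper are assumptions added for illustration, and a real document format would still need a serializable encoding for the slice objects.

```python
from dataclasses import dataclass


@dataclass
class StreamDatum:
    descriptor: str       # uid of the Descriptor this datum provides data for
    stream_resource: str  # assumed: uid of the StreamResource to be sliced
    data_keys: list       # which data_keys of the Descriptor it provides
    seq_nums: slice       # Event numbers this datum corresponds to
    indexes: slice        # passed to the StreamResource handler


def event_numbers(datum):
    """Expand the seq_nums slice into explicit Event numbers."""
    return list(range(datum.seq_nums.start, datum.seq_nums.stop))


datum = StreamDatum(
    descriptor="desc-1",
    stream_resource="res-1",
    data_keys=["det_image"],
    seq_nums=slice(1, 4),
    indexes=slice(0, 3),
)

stored = list(range(100, 110))  # stand-in for data held by a handler
print(event_numbers(datum))     # Event numbers covered by this datum
print(stored[datum.indexes])    # matching rows of the underlying storage
```

One StreamDatum can cover many Events this way, which is where the slice objects start to pay off compared with one Datum per Event.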

The Event side of the diagram then becomes optional: if all detectors write external files, there will be no Event documents.
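
The rule that external keys are absent from Event data could be checked along these lines. This is only a sketch: it assumes a data_key is marked as external by the presence of an `external` entry in the descriptor (the new scheme may use a different word), and `split_data_keys`, `check_reading`, and `needs_events` are hypothetical helper names.

```python
def split_data_keys(data_keys):
    """Partition descriptor data_keys into (internal, external) name sets."""
    # Assumption: an "external" entry marks a key as backed by external data.
    internal = {name for name, info in data_keys.items() if "external" not in info}
    external = set(data_keys) - internal
    return internal, external


def check_reading(reading, data_keys):
    """Raise if read() omitted an internal key or included an external one."""
    internal, external = split_data_keys(data_keys)
    missing = internal - set(reading)
    unexpected = external & set(reading)
    if missing or unexpected:
        raise ValueError(
            f"missing internal keys: {sorted(missing)}; "
            f"unexpected external keys: {sorted(unexpected)}"
        )


def needs_events(data_keys):
    """Event documents are only needed if some key is not external."""
    internal, _ = split_data_keys(data_keys)
    return bool(internal)


data_keys = {
    "motor": {"dtype": "number", "shape": [], "source": "example"},
    "det_image": {"dtype": "array", "shape": [5, 5], "source": "example",
                  "external": "STREAM:"},
}
check_reading({"motor": {"value": 1.0, "timestamp": 0.0}}, data_keys)
print(needs_events(data_keys))
print(needs_events({"det_image": data_keys["det_image"]}))
```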

@tacaswell @danielballan shall we begin writing this?

@danielballan
Member Author

Yes, I think so. I'd like to do this in tandem with @DiamondJoseph's NeXus writers and a sketch of our proposed new storage model for Databroker to validate that everything works out the way we expect.

@coretl
Contributor

coretl commented Jan 24, 2023

Can delete stream_names from StreamResource too
