deprecate Onda.Samples in favor of interoperability with generic array packages? #77

Open
jrevels opened this issue Apr 8, 2021 · 3 comments

jrevels commented Apr 8, 2021

a secret window into the mysterious caverns of the Beacon Slack:

lol

It always feels like there's lots of overlap between Samples and more generic named-array packages like AxisKeys/AxisArrays/etc. Would it be a good idea to refactor Samples to be a wrapper around AxisKeys, or define convenience constructors for translating between them, or replace Samples entirely, or something else altogether that allows Onda (and downstream callers) to more easily reuse/interface with the more generic functionality provided by AxisKeys?

It seems like answering this question is a matter of clearly stating the "responsibility" of the Samples API layer and possibly formulating our goals around that.

To that end, it might be useful to start by roughly describing the manner in which Onda's functionality is currently layered:

  1. Tabular recording metadata utilities (Arrow.jl/Tables.jl stuff)
  2. (built on layer 1) Overloadable sample data (de)serialization mechanisms (the AbstractLPCMFormat API)
  3. (built on composition of layer 1 + 2) The Samples API.

The current core responsibilities of the Samples API layer are...

  1. ...to provide a minimal "load/store unit" for Onda-formatted sample data (in practice - associates SamplesInfo with a corresponding sample data matrix)
  2. ...to provide LPCM encode/decode functionality. "Why here instead of the AbstractLPCMFormat layer?", you might ask. The answer is that the Samples layer is the "lowermost" layer at which encode/decode can be implemented generically w/o knowing anything about the target/source serialization format.
  3. ...to provide an overload point for any other clear/obvious specialized functionality that arises from associating SamplesInfo with a data matrix, like specialized time span/channel indexing.
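
To make those three responsibilities concrete, here is a rough, language-neutral sketch in Python (not Onda's actual Julia implementation). The class names, field names, and the `channel`/`time_slice` helpers are all hypothetical, and the encode/decode step assumes a simple linear resolution/offset mapping; clamping to the encoded integer type's range is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SamplesInfo:
    """Hypothetical, simplified stand-in for per-signal metadata."""
    channels: List[str]
    sample_rate: float        # samples per second
    sample_resolution: float  # physical units per encoded step
    sample_offset: float      # physical value of encoded zero


@dataclass
class Samples:
    """Responsibility 1: the load/store unit pairs a matrix with its info."""
    data: List[List[float]]   # one row per channel
    info: SamplesInfo
    encoded: bool

    # Responsibility 2: encode/decode needs only the info, not the file format.
    def decode(self) -> "Samples":
        if not self.encoded:
            return self
        r, o = self.info.sample_resolution, self.info.sample_offset
        return Samples([[r * x + o for x in row] for row in self.data], self.info, False)

    def encode(self) -> "Samples":
        if self.encoded:
            return self
        r, o = self.info.sample_resolution, self.info.sample_offset
        # Assumption: plain rounding, no clamping to an integer type's range.
        return Samples([[round((x - o) / r) for x in row] for row in self.data], self.info, True)

    # Responsibility 3: indexing that only makes sense once info is attached.
    def channel(self, name: str) -> List[float]:
        return self.data[self.info.channels.index(name)]

    def time_slice(self, start_s: float, stop_s: float) -> "Samples":
        i = int(start_s * self.info.sample_rate)
        j = int(stop_s * self.info.sample_rate)
        return Samples([row[i:j] for row in self.data], self.info, self.encoded)
```

Nothing in `decode`/`encode` touches a serialization format, which is the argument for keeping them at this layer rather than in the format layer.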

Given all of these points in combination, it seems like we can at least say we won't be able to get rid of Samples fully, unless a) we feel like everything we'd ever want out of point 3 could be achieved already with a more generic array type and b) we'd feel fine forcing callers to pass around data and info into most API functions as separate arguments (and having callers keep track on their own of whether or not data is encoded/decoded).

So, assuming we don't get rid of Samples entirely, how could we use these other packages to make Samples better? It seems like the primary annoyance that they might help with is that Samples is not an AbstractMatrix despite point 3. It's like telling callers "yeah, Samples isn't an AbstractMatrix, but here, if you want special matrix-y indexing features, wrap your AbstractMatrix in it!" Sounds weirdly inconvenient, but the reasoning kind of makes compositional sense - if you DID implement it as a full AbstractMatrix, then callers would still probably have to assume (outside a few special cases that are already covered) that most AbstractArray operations would cause Samples inputs/outputs to be unwrapped anyway (i.e. the transform on the Samples data would not have a corresponding sensible transform on its SamplesInfo). Thus, we just keep the distinction very clear/explicit, forcing callers to unwrap/rewrap themselves, so that they don't land in weird situations where they accidentally lose SamplesInfo along the way in an unexpected manner.
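
As an illustration of the unwrap/rewrap argument, here is a small Python sketch (again not Onda code; `Info`, `Samples`, and `downsample_by_2` are made-up names). A generic matrix operation has no way to know that it halved the sample rate, so the caller updates the metadata explicitly when rewrapping.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Info:
    """Hypothetical minimal metadata: channel names and sample rate."""
    channels: List[str]
    sample_rate: float


@dataclass(frozen=True)
class Samples:
    data: List[List[float]]  # one row per channel
    info: Info


def downsample_by_2(matrix: List[List[float]]) -> List[List[float]]:
    """Generic matrix operation; it knows nothing about Info."""
    return [row[::2] for row in matrix]


samples = Samples([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]],
                  Info(["c3", "c4"], 100.0))

# Unwrap, transform, then rewrap with metadata the caller has corrected by hand.
raw = samples.data
halved = downsample_by_2(raw)
new_info = replace(samples.info, sample_rate=samples.info.sample_rate / 2)
rewrapped = Samples(halved, new_info)
```

If `Samples` behaved like a full matrix and `downsample_by_2` silently returned another `Samples` carrying the old `Info`, the stale 100 Hz rate would be exactly the kind of accidental mismatch described above.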

Does all of this imply that Onda should just provide explicit constructors to go between KeyedArray and Samples and be done with it? That seems like an easy enough thing to do. This brainstorming does make me want to play around with a Samples-less API (which would be beholden to a) and b) mentioned above), just to see how far it can go...
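
For the "explicit constructors" option, a conversion pair might look roughly like the Python sketch below. The dict-based labeled array is only a stand-in for something like a KeyedArray, and `to_labeled`/`from_labeled` are hypothetical names, not AxisKeys or Onda API.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]


def to_labeled(data: Matrix, channels: List[str], sample_rate: float) -> Dict[str, list]:
    """Attach channel keys and time keys (in seconds) to a channels-by-samples matrix."""
    n = len(data[0]) if data else 0
    return {
        "channel": list(channels),
        "time": [i / sample_rate for i in range(n)],
        "values": [list(row) for row in data],
    }


def from_labeled(labeled: Dict[str, list]) -> Tuple[Matrix, List[str], float]:
    """Recover (data, channels, sample_rate); assumes evenly spaced time keys."""
    times = labeled["time"]
    if len(times) < 2:
        raise ValueError("need at least two time keys to infer a sample rate")
    sample_rate = 1.0 / (times[1] - times[0])
    return [list(row) for row in labeled["values"]], list(labeled["channel"]), sample_rate
```

In this sketch the labeled form carries channel names and timing but nothing about resolution, offset, or encoded/decoded state, so a caller would have to track those separately, as in b) above.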

@ericphanson
Member

https://github.com/JuliaAudio/SampledSignals.jl also deals with multichannel signals and implements array type methods, so could be worth a look to see if we want to borrow any of that design.

@jrevels jrevels changed the title AxisKeys <-> Onda.Samples? deprecate Onda.Samples in favor of interoperability with generic array packages? Aug 21, 2023
@jrevels
Member Author

jrevels commented May 15, 2024

relevant to this topic - it's worth noting that by this point, the OSS ecosystem that emerged from the geoscience space (xarray + zarr + kerchunk) basically solves most of the same problems as Onda's sample data manipulation functionality, but in a drastically more generalized domain-agnostic fashion (n-dimensional labeled-dimension array storage, pluggable file formats/codecs, pluggable storage systems)

If these tools had already existed way back in early 2019, Onda.jl wouldn't have consisted of anything more than the signal/annotation schemas themselves

This is probably an indication that the direction outlined by this issue is the right eventual path in theory

2 participants