I’m not 100% sold on MHC-specific loaders being baked into the repo like they are. Possibly they should be moved into a subdirectory/module and stay in this repo, possibly they should be moved to a separate repo.
In `custom_datasets.py`, `load_mhc_libs()` relies on the existence of `prepro_channels.npy` in the data directory. It looks like `load_mars_big()` needs `ccs_channels.npy` too. Regardless of whether they stay in this repo or move, they shouldn’t rely on the existence of otherwise undocumented data files. The question is whether they belong with the data (in a documented way) or the loader.
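Whichever place the files end up, the loaders could at least fail loudly and name what they need. A minimal sketch, assuming a hypothetical helper `require_support_file` and a `support_dir` argument; neither exists in the repo today:

```python
from pathlib import Path


def require_support_file(support_dir, filename):
    """Return the path to a supporting data file, or fail with a clear message.

    Hypothetical helper: the name and the idea of a dedicated support
    directory are assumptions for illustration, not existing repo code.
    """
    path = Path(support_dir) / filename
    if not path.is_file():
        # Name the missing file and where we looked, so the dependency
        # is visible instead of surfacing as an obscure load error.
        raise FileNotFoundError(
            f"Loader needs supporting file '{filename}' in '{support_dir}', "
            "but it was not found."
        )
    return path
```

A loader would call something like `require_support_file(support_dir, "prepro_channels.npy")` before handing the path to `np.load`, so a missing file is reported by name.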
Architecturally, a plugin pattern definitely fits both public datasets and their custom loader functions (and possibly a way to bundle them together), but I think separate repos are overkill without a better use case. Focusing on the loaders:
- Let’s define a place to add loader modules and some code to slurp in everything it finds there. `backend/loaders/` makes sense to me but I could argue for other places.
- Move the MHC loaders into a module there. Be clever with `.gitignore` so other things in the loaders directory are excluded from the repo.
- Move the `.npy` files into that module (subdir for supporting data?).
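The "slurp in everything it finds" step might look roughly like the sketch below. It assumes loader functions keep the `load_` prefix that `load_mhc_libs()` and `load_mars_big()` already use; `discover_loaders` and the `loaders_` module-name prefix are hypothetical names for illustration.

```python
import importlib.util
from pathlib import Path
from typing import Callable, Dict


def discover_loaders(loader_dir: Path) -> Dict[str, Callable]:
    """Import every .py file in loader_dir and collect its load_* callables.

    Hypothetical sketch: the function name and the `load_` naming
    convention are assumptions, not existing repo behaviour.
    """
    loaders: Dict[str, Callable] = {}
    for path in sorted(loader_dir.glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helper modules
        spec = importlib.util.spec_from_file_location(f"loaders_{path.stem}", path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the plugin file's top-level code
        for name, obj in vars(module).items():
            if name.startswith("load_") and callable(obj):
                loaders[name] = obj
    return loaders
```

Because this imports whatever sits in the directory, locally added loader files that are excluded from the repo would be picked up the same way as the MHC module. Note that it executes each file on import, so the directory has to be trusted.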
There’s also room for a change in `datasets.yml` to make the `files:` key a little clearer about what’s going on, instead of baking those architectural decisions into the loaders as well (these types of data require two lines and the second is always the metadata file, these require one and assume the metadata filename, &c.). It would make sense to at least consider that before we declare this task done. This could easily end up being a major API overhaul and get split out into its own project.
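To make the idea concrete, here is one possible shape, sketched in Python on the already-parsed YAML value. It assumes `files:` is currently a one- or two-item list, and it invents explicit `data` and `metadata` keys; the function name and both keys are hypothetical, not the current schema.

```python
def normalize_files(files):
    """Turn a `files:` entry into an explicit {'data': ..., 'metadata': ...} dict.

    Hypothetical schema for illustration only: accepts either the explicit
    mapping form or the positional list form described in this issue.
    """
    if isinstance(files, dict):
        # Explicit form: 'data' is required, 'metadata' is optional.
        if "data" not in files:
            raise ValueError("files mapping needs a 'data' key")
        return {"data": files["data"], "metadata": files.get("metadata")}
    if isinstance(files, (list, tuple)) and len(files) in (1, 2):
        # Positional form: first item is the data file, an optional
        # second item is the metadata file.
        metadata = files[1] if len(files) == 2 else None
        return {"data": files[0], "metadata": metadata}
    raise ValueError(f"Unrecognized files entry: {files!r}")
```

With something like this in one place, a loader would receive `metadata` as either a filename or `None` and would no longer need to know how many lines the YAML entry had. What a loader does when `metadata` is `None` is left open here.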