Declarative programmable storage work on infrastructure (HDF5 and Ceph for now)

drin/decl-mercantile

DeclMercantile

A repository for declarative programmable storage work by Aldrin. I named this repository decl-mercantile to mark it as a project related to, but distinct from, DeclStore.

Overview

There has been some flux in the task priority for declarative programmable storage. This project (or at least, the part that this repository maintains) will first focus on how existing gene expression data can be serialized in a format that Skyhook can ingest, and then on the performance profile of Skyhook queries over that data.

Architecture

The directory layout of this repository is as follows (at the time of this writing):

.
├── code
│   ├── poetry.lock
│   ├── pyproject.toml
│   ├── README.rst
│   ├── skyhookdm_singlecell
│   ├── tests
│   └── toolbox
├── LICENSE
├── README.md
├── scripts
│   └── shell-utils
├── setup.sh
└── submodules
    └── ceph-container

8 directories, 6 files

Code

The code directory contains a Poetry project, skyhookdm_singlecell. This project currently contains simple code for:

  • Parsing some HCA data
  • Parsing another simple gene expression format
  • Serializing gene expression data in Arrow format with extra Skyhook metadata
    • This code is written in Python and leans heavily on pyarrow for serializing in Arrow format.

The code will eventually expand to include anything lower-level related to my research and declarative programmable storage. For anything higher-level, the repository XHCA will be used.

Submodules

This directory holds git submodules and is still a work in progress, especially since I have little experience with submodules. Initially, I wanted Ceph to be a submodule so that this repository could house any modifications to Ceph and still be buildable, but I have not pushed that aspect forward in a while.

Scripts

Here you may find standalone scripts that provide some independent functionality. At the moment, there is a script based on xweichu's Ceph build script for CloudLab. I have not tested my version (scripts/shell-utils/run-ceph-container.fish), but that will hopefully happen in the near-ish future.
