Overview
We'd like to implement a set of data classes (using attrs) that enable schema-based validation of all of the config fields parsed by OmegaConf.
Background
In core SLEAP, we rolled our own config system (sleap.nn.config) based on attrs classes, cattr for serialization/deserialization to/from dicts/JSON, and some protobuf-inspired validation utilities.
Because attrs classes don't provide the full feature set we wanted for config management, we ended up implementing a lot of it ourselves (bad). For example, enabling string-based dotted-key subfield access resulted in the (likely haunted) ScopedKeyDict. To ensure only a single subfield of a class is set, we implemented an attrs version of protobuf's oneof. More advanced validation is scattered all over the place.
Here, we switched to using OmegaConf as a batteries-included and more standard config library. It's based around using YAML (which makes for more readable and human-editable config files than the more punishing and markup-demanding JSON).
Right now, however, while we've documented all the fields, these are not enforced or validated at runtime.
The solution is to use a schema that specifies the field names, types, and other properties of the fields to enable validation. After investigation, we found that OmegaConf supports this through Structured Configs, which take dataclass or attrs classes as input and validate a given set of OmegaConf inputs (dicts or YAML) against them.
Plan
PR 1: Basic functionality
Migrate over the attrs config classes from sleap.nn.config, starting with TrainingJobConfig and moving down the hierarchy. These should get migrated to a submodule under sleap-nn/sleap_nn/config.
Update class definitions to the new attrs API
Replace cattr serialization with OmegaConf
Replace the functionality of the oneof decorator with OmegaConf-based routines if possible (or retain oneof if needed)
Figure out how to implement cross-field validation for linked attributes (e.g., the max stride of the backbone is constrained by the max strides of all the heads, and vice versa)
This is currently handled in ScopedKeyDict in a pretty ad hoc way [1][2]
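The cross-field stride check above could be factored out into a standalone validator instead of living in ScopedKeyDict. A sketch, with made-up class/field names and a hypothetical constraint (head strides must not exceed, and must evenly divide, the backbone's max stride):

```python
from dataclasses import dataclass

@dataclass
class BackboneConfig:
    max_stride: int = 16

@dataclass
class HeadConfig:
    output_stride: int = 4

def validate_strides(backbone: BackboneConfig, heads: list) -> None:
    # Hypothetical linked-attribute check: no head may ask for a stride
    # deeper than the backbone can produce, and strides must divide evenly.
    for head in heads:
        if head.output_stride > backbone.max_stride:
            raise ValueError(
                f"head output_stride={head.output_stride} exceeds "
                f"backbone max_stride={backbone.max_stride}"
            )
        if backbone.max_stride % head.output_stride != 0:
            raise ValueError("output_stride must evenly divide max_stride")

validate_strides(BackboneConfig(16), [HeadConfig(4), HeadConfig(8)])  # passes
```

A validator like this could be run once after the full config is merged, which keeps the per-class schemas simple.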
PR 2: Revise fields
Review old vs. new config fields and figure out which ones we should deprecate, rename, etc.
Consider reorganizing the fields for better separation of user-defined values from auto-generated ones
Currently we handle this by saving out an initial_config.json versus a training_config.json, the latter of which has auto-populated values, but is a lot less convenient to work with.
Some fields should probably not be in the config if they are better specified via the CLI or API (example: ZMQ port for training progress monitor). If we think there's a use case for keeping them in the config, then we should make the hierarchy clear (CLI > API > config?).
Some fields just store metadata that's useful later, but should not be changed by the user (example: skeleton)
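If we do keep such fields in the config, the CLI > API > config precedence floated above could be resolved with a small helper. This is a sketch; the names and signature are illustrative, not sleap-nn API:

```python
_UNSET = object()  # sentinel so that an explicit None still counts as "set"

def resolve_setting(cli=_UNSET, api=_UNSET, config=_UNSET, default=None):
    # Hypothetical precedence chain: CLI beats API, API beats config file.
    for value in (cli, api, config):
        if value is not _UNSET:
            return value
    return default

# e.g., a ZMQ port passed on the CLI wins over one baked into the config:
port = resolve_setting(cli=9001, config=9000)
```

Making the precedence explicit in one place avoids each consumer re-implementing (and disagreeing on) the hierarchy.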
PR 3: Integration
Implement a versioning system for the schema so we can better support backwards compatibility
Ideally, this should also be robust to forward compatibility, e.g., newer fields should be ignored by older versions
If the config schema changes, all existing config files need to be updated
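One way to sketch the versioning and forward-compatibility behavior described above (the version numbers, field names, and the rename are all made up for illustration):

```python
CURRENT_VERSION = 2  # hypothetical current schema version
KNOWN_FIELDS = {"schema_version", "max_stride", "filters"}  # made-up field set

def migrate_v1_to_v2(cfg: dict) -> dict:
    # Hypothetical migration: v2 renamed "stride" to "max_stride".
    cfg = dict(cfg)
    cfg["max_stride"] = cfg.pop("stride", 16)
    cfg["schema_version"] = 2
    return cfg

MIGRATIONS = {1: migrate_v1_to_v2}

def load_config(cfg: dict) -> dict:
    # Walk older configs forward one version at a time...
    while cfg.get("schema_version", 1) < CURRENT_VERSION:
        cfg = MIGRATIONS[cfg.get("schema_version", 1)](cfg)
    # ...and, for forward compatibility, drop fields added by newer versions.
    return {k: v for k, v in cfg.items() if k in KNOWN_FIELDS}

loaded = load_config({"schema_version": 1, "stride": 32, "from_the_future": True})
```

Keeping migrations as small per-version steps means old files only need updating lazily, at load time, rather than all at once when the schema changes.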
PR 4: Presets
There's no runtime validation if configs aren't Python objects -- we'd like to have some config presets defined as pure Python classes that can then be configured further (e.g., UNetMediumRFConfig --> a pre-filled UNetBackboneConfig with medium RF values)
Right now, we have a sprawl of configs because we need to create a config file for every combination of config presets (e.g., backbone X head type; see sleap/sleap/training_profiles). It would be great to have a more modular way to define these and combine them as needed.
Some fields are defined by the dataset, but these are inseparable from the rest of the config fields.
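The preset idea above could be sketched as plain dataclasses with pre-filled defaults. All field names and values here are placeholders; a real preset would pull its values from the existing training profiles:

```python
from dataclasses import dataclass

@dataclass
class UNetBackboneConfig:
    # Hypothetical fields and defaults, for illustration.
    filters: int = 64
    max_stride: int = 16

@dataclass
class UNetMediumRFConfig(UNetBackboneConfig):
    # Preset: override defaults with placeholder "medium RF" values.
    max_stride: int = 32

cfg = UNetMediumRFConfig()  # pre-filled...
cfg.filters = 32            # ...but still a plain object, configurable further
```

Because presets stay ordinary Python classes, they compose (a backbone preset plus a head preset) without multiplying config files, and they remain valid inputs to OmegaConf's structured-config validation.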
Example desired API:

```python
import sleap_nn as snn
import sleap_io as sio

labels = sio.load_file("labels.pkg.slp")
cfg = snn.make_config(data=labels, backbone="unet_medium", model="centroid")
snn.train(cfg)  # creates a Trainer from the cfg and runs it
```
Other possible APIs for composability:
```python
# Specifying splits flexibly
cfg = snn.make_config(
    data={"train": sio.load_slp("train.pkg.slp"), "val": sio.load_slp("val.pkg.slp")},
    backbone="unet_medium",
    model="centroid",
)

# Customizing presets
cfg.optimization.epochs = 5

# Composing sub-configs
cfg = snn.make_config(
    data=labels,
    backbone=snn.config.UNetMediumConfig(filters=32),
    model="centroid",
)

# Or from files
cfg = snn.make_config(
    data=labels,
    model=snn.load_config("my/previous/trained/model"),  # if backbone is not specified, pulls it from that config
)
```
```python
# For transfer learning:
cfg = snn.make_config(
    data=labels,
    model=snn.load_config("my/previous/trained/model"),
    backbone="transfer",
)

# More control:
cfg = snn.make_config(
    data=labels,
    model=snn.load_config("my/previous/trained/model"),
    backbone=snn.config.TransferConfig(freeze="encoder", encoder_feature_layers=["block0/relu", "block1/relu"]),
)
```