feat: multimodal CPT (raw image+text continued pre-training)#8

Closed
thad0ctor wants to merge 16 commits into main from multimodal-cpt

@thad0ctor

@thad0ctor thad0ctor commented Apr 24, 2026

Owner

Squashed re-submission of the work originally proposed in #6 (closed). Single commit on top of main so CodeRabbit can re-review against a clean history.

Summary

Adds a new pretraining_dataset: [{type: multimodal_pretrain}] path so users can continue pre-training a VLM directly on {text, images} JSONL rows — no chat template, no conversational scaffolding. Targets OCR/transcription corpora where every row is a tight (image, target_text) pair and any user/assistant framing would pollute the learned signal.

  • Design: deferred collation — encoder pre-tokenizes text for multipack but keeps raw text + image paths through .map(); collator re-runs processor(text=..., images=...) on the full batch. Only robust way to handle the 4+ distinct pixel_values layouts across VLM families.
  • Supported (v1): LLaVA-1.5, SmolVLM/SmolVLM2, Qwen2-VL, Qwen2.5-VL, Qwen3-VL, Gemma-3, Gemma-4 (E2B + E4B). Empirically verified via processor-contract probes + forward-pass runs.
  • Rejected with clear errors: Mllama (cross-attention, not in-stream), Pixtral (mistral_common), InternVL (no pixel_values from AutoProcessor).
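
The deferred-collation idea can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the PR's collator: `deferred_collate`, `fake_processor`, and `load_image` are hypothetical names, and the stub processor only counts its inputs, whereas a real Hugging Face processor would return `input_ids`, `pixel_values`, and so on. The `_mm_text` and `images` column names are the ones the PR mentions.

```python
def deferred_collate(rows, processor, load_image):
    """Rebuild model inputs from raw text + image paths at batch time."""
    texts = [row["_mm_text"] for row in rows]
    images = [[load_image(path) for path in row["images"]] for row in rows]
    # The processor, not the encoder, decides the pixel_values layout,
    # so family-specific layouts never have to survive dataset.map().
    return processor(text=texts, images=images)


def fake_processor(text, images):
    # Hypothetical stand-in for a real processor call; it only counts inputs.
    return {"n_texts": len(text), "n_images": sum(len(imgs) for imgs in images)}


rows = [
    {"_mm_text": "<image>first page", "images": ["a.png"]},
    {"_mm_text": "<image><image>two pages", "images": ["b.png", "c.png"]},
]
batch = deferred_collate(rows, fake_processor, load_image=lambda p: f"pixels:{p}")
```

Because the rows carry only strings until collation, the same dataset pipeline can serve every supported VLM family; only the processor passed in changes.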

Safety gates (all enforced at config-load / startup)

  • sample_packing: true → rejected (breaks placeholder/pixel alignment)
  • chat_template set → rejected (defeats CPT purpose)
  • processor_type unset → rejected
  • Incompatible processor class → rejected (isinstance + MRO walk; catches subclasses)
  • Per-row: count(placeholder_id) != len(images) → rejected (catches silent-failure on LLaVA/Qwen)
  • Placeholder autodetect failure → clear error with image_token: override hint
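
The per-row placeholder gate might look something like this. It is a sketch, not the PR's code: `check_placeholder_count` is a hypothetical helper, and it assumes the count is taken on token ids that still hold one placeholder per image.

```python
def check_placeholder_count(input_ids, placeholder_id, images):
    """Raise if the number of image placeholders differs from the image count."""
    n_placeholders = sum(1 for tok in input_ids if tok == placeholder_id)
    if n_placeholders != len(images):
        raise ValueError(
            f"row has {n_placeholders} image placeholder(s) "
            f"but {len(images)} image(s)"
        )


PLACEHOLDER_ID = 99  # hypothetical id for the image placeholder token
check_placeholder_count([1, PLACEHOLDER_ID, 7, 8], PLACEHOLDER_ID, ["a.png"])  # ok
```

Failing loudly here matters because, per the description above, some families would otherwise accept the mismatch silently.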

Security hardening

  • Path traversal containment via realpath + os.path.commonpath (root-base safe), O_NOFOLLOW fd
  • Explicit pixel-count decompression-bomb guard (no thread-unsafe warnings.catch_warnings)
  • GIF/TIFF multi-frame rejection
  • Per-row image count cap (default 32)
  • Case-insensitive scheme denylist (http/https/ftp/ftps/file/data + UNC)
  • NUL-byte rejection
  • Type guards: _mm_text and each image path must be str
  • image_token override must be a registered special token (not BPE-fallback)
  • return_tensors locked to "pt" at construction (downstream uses in-place torch ops)
  • Error messages log only basenames; full paths at DEBUG only
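
A few of these checks can be sketched with the standard library alone. The function below is an illustration under assumptions, not the PR's implementation: `resolve_image_path` and `_DENIED_PREFIXES` are hypothetical names, and the `O_NOFOLLOW` open, pixel-count guard, and multi-frame rejection are left out.

```python
import os

# Hypothetical denylist mirroring the schemes named above.
_DENIED_PREFIXES = ("http://", "https://", "ftp://", "ftps://", "file://", "data:")


def resolve_image_path(base_dir, rel_path):
    """Resolve rel_path under base_dir, rejecting anything that escapes it."""
    if not isinstance(rel_path, str):
        raise TypeError("image path must be str")
    if "\x00" in rel_path:
        raise ValueError("NUL byte in image path")
    # Case-insensitive scheme check, plus UNC-style prefixes.
    if rel_path.lower().startswith(_DENIED_PREFIXES) or rel_path.startswith(("\\\\", "//")):
        raise ValueError("remote or UNC image paths are not allowed")
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, rel_path))
    # commonpath compares whole path components, so "/data2" is not treated
    # as being inside "/data", and a base of "/" still works.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("image path escapes image_base_dir")
    return candidate
```

Comparing with `os.path.commonpath` rather than a string prefix is what avoids the sibling-directory false positive.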

Label masking

Image-family token ids (placeholder + wrappers like <|vision_start|>/<end_of_image>) are auto-masked to -100. Without this, loss is ~10× higher empirically and training diverges — the model is forced to predict visual-patch token ids that don't correspond to predictable text.
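
In list form, the masking step amounts to the following. This is a minimal sketch with made-up token ids; `mask_image_tokens` is a hypothetical name, and the real collator presumably works on tensors.

```python
IGNORE_INDEX = -100  # label value that the cross-entropy loss skips


def mask_image_tokens(input_ids, image_family_ids):
    """Copy input_ids into labels, masking every image-family token."""
    return [IGNORE_INDEX if tok in image_family_ids else tok for tok in input_ids]


# Hypothetical ids: 50 = vision start, 51 = image placeholder, 52 = vision end.
ids = [1, 50, 51, 51, 52, 7, 8, 2]
labels = mask_image_tokens(ids, {50, 51, 52})
# labels == [1, -100, -100, -100, -100, 7, 8, 2]
```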

Test plan

  • 32 unit tests pass (strategy gates, streaming encoder, collator, validation, security)
  • End-to-end axolotl trainer runs on SmolVLM-500M (train_loss: 1.929, 2 steps)
  • Processor-contract verified against 10 VLMs
  • Forward-pass verified on SmolVLM-500M, Gemma-4 E2B, Gemma-4 E4B

History

Previous incremental history (7 commits, including three rounds of CodeRabbit-driven security hardening and pretraining_dataset multi-entry gating fixes) was squashed into a single commit for review clarity.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Multimodal pretraining path enabling training with raw images and text without chat templates
    • Automatic image token detection and validation against image counts
    • Image loading with security protections (path traversal blocking, remote URL rejection, pixel limits)
  • Documentation

    • Added multimodal pretraining training guide with dataset and configuration requirements
  • Tests

    • Comprehensive test coverage for multimodal tokenization, streaming, and configuration validation

thad0ctor and others added 2 commits April 24, 2026 11:05
…daries

Fixes silent ignoring of `cfg.train_on_inputs` / `cfg.roles_to_train` /
`cfg.train_on_eos` in the multimodal training path. Before this branch,
only Gemma 3n honored these knobs; every other VLM trained on the full
sequence regardless of config. Also adds `cfg.role_boundaries` YAML
override so users can declare per-role markers without subclassing.

What changed
------------
- `ProcessingStrategy` gains a declarative boundary scanner. Each
  strategy declares per-role start/end markers via
  `_build_role_boundaries`; the shared scanner honors
  `train_on_inputs` / `roles_to_train` / `train_on_eos` (incl. "last").
- New per-template strategies: Gemma 4, Llama 3.2 Vision, Llama 4,
  Pixtral, Mistral V7 Tekken.
- Refactored: Gemma 3 (previously no role masking), Gemma 3n
  (previously ad-hoc scanner, now shared).
- Strategies whose boundary tokens couldn't be verified offline
  (Voxtral, SmolVLM2, Mistral3, InternVL, GLM4V, llava/lfm2vl
  fallback) retain legacy behavior and emit a one-shot warning. Users
  can enable masking on them via `cfg.role_boundaries`.
- Pixtral / Mistral V7 Tekken correctly handle the shared `[/INST]`
  token between user-end and assistant-start via `include_end=False`
  + scanner rewind.
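
A stripped-down picture of a declarative boundary scanner is sketched below. It is not the `ProcessingStrategy` code: `RoleBoundary`, `mask_by_role`, and the token ids are hypothetical, markers are assumed non-empty, and the sketch always masks start and end markers, so it does not model `train_on_eos`, `include_end`, or the shared-`[/INST]` rewind.

```python
from dataclasses import dataclass
from typing import Tuple

IGNORE_INDEX = -100


@dataclass(frozen=True)
class RoleBoundary:
    role: str
    start: Tuple[int, ...]  # token ids that open a turn for this role
    end: Tuple[int, ...]    # token ids that close it


def _matches(ids, pos, marker):
    return tuple(ids[pos:pos + len(marker)]) == marker


def mask_by_role(ids, boundaries, roles_to_train):
    """Keep labels only for tokens inside turns of trainable roles."""
    labels = [IGNORE_INDEX] * len(ids)
    i = 0
    while i < len(ids):
        hit = next((b for b in boundaries if _matches(ids, i, b.start)), None)
        if hit is None:
            i += 1
            continue
        i += len(hit.start)  # the start marker itself stays masked
        while i < len(ids) and not _matches(ids, i, hit.end):
            if hit.role in roles_to_train:
                labels[i] = ids[i]
            i += 1
        i += len(hit.end)  # the end marker stays masked in this sketch
    return labels


boundaries = [
    RoleBoundary("user", start=(10,), end=(11,)),
    RoleBoundary("assistant", start=(20,), end=(21,)),
]
labels = mask_by_role([10, 1, 2, 11, 20, 3, 4, 21], boundaries, {"assistant"})
```

Declaring markers as data is what lets each strategy contribute only its boundary table while the scanning loop is shared.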

See `docs/multimodal_assistant_mask.md` for the full audit table,
root-cause analysis, and design rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a new `pretraining_dataset: [{type: multimodal_pretrain}]` path so
users can continue pre-training a VLM directly on `{text, images}`
JSONL rows — no chat template, no conversational scaffolding. Targets
OCR/transcription corpora where every row is a tight `(image,
target_text)` pair and any user/assistant framing would pollute the
learned signal.

Design
------
Deferred collation — encoder pre-tokenizes text for multipack but
keeps raw text + image paths through `.map()`; collator re-runs
`processor(text=..., images=...)` on the full batch. Only robust way
to handle the 4+ distinct `pixel_values` layouts across VLM families.

Supported (v1): LLaVA-1.5, SmolVLM/SmolVLM2, Qwen2-VL, Qwen2.5-VL,
Qwen3-VL, Gemma-3, Gemma-4 (E2B + E4B). Rejected with clear errors:
Mllama (cross-attention, not in-stream), Pixtral (mistral_common),
InternVL (no pixel_values from AutoProcessor).

Safety gates (enforced at config-load / startup):
- `sample_packing: true` rejected (breaks placeholder/pixel alignment)
- `chat_template` rejected (defeats CPT purpose)
- `processor_type` unset rejected
- Incompatible processor class rejected (isinstance + MRO walk)
- Per-row `count(placeholder_id) != len(images)` rejected
- Placeholder autodetect failure: clear error with override hint

Security hardening:
- Path traversal containment via `realpath` + `os.path.commonpath`
  (root-base safe), `O_NOFOLLOW` fd
- Explicit pixel-count decompression-bomb guard
- GIF/TIFF multi-frame rejection
- Per-row image count cap (default 32)
- Case-insensitive scheme denylist (http/https/ftp/ftps/file/data
  + UNC), NUL-byte rejection
- Type guards on `_mm_text` and each image path
- `image_token` override must be a registered special token
- Error messages log only basenames; full paths at DEBUG only

Label masking: image-family token ids (placeholder + wrappers like
`<|vision_start|>`, `<end_of_image>`) auto-masked to -100. Without
this, loss is ~10× higher empirically and training diverges — the
model is forced to predict visual-patch token ids that don't
correspond to predictable text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 24, 2026


Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c2f6f63d-77da-46db-a11a-0183d0c1f278

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Introduces multimodal CPT (continued pre-training) support to Axolotl, enabling training on raw image-conditioned text without chat templates. Adds a new tokenization strategy, a data collator, streaming support, configuration schema extensions, validation gates, and test coverage across multiple subsystems.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Documentation**<br>`docs/multimodal.qmd` | Introduces multimodal CPT training path with JSONL dataset requirements, YAML config example, rejection gates for incompatible options, and per-row validation of image placeholder counts. |
| **Tokenization & Strategy**<br>`src/axolotl/prompt_strategies/multimodal_pretrain.py` | New strategy for CPT-style tokenization with dynamic processor-compatibility detection, image placeholder token autodetection/override validation, image-family token ID masking, and 1:1 chunking semantics with image-path alignment. |
| **Core Training Infrastructure**<br>`src/axolotl/core/builders/causal.py` | Adds multimodal CPT detection and configuration helpers; introduces `_build_mm_pretrain_collator` to construct multimodal collator during batch (non-eval) processing. |
| **Data Processing & Collation**<br>`src/axolotl/utils/collators/mm_pretrain.py`, `src/axolotl/utils/data/sft.py`, `src/axolotl/utils/data/streaming.py` | Multimodal collator with image file loading, path traversal/URL protections, processor-based input regeneration, and label masking; streaming dataset support forwarding processor and multimodal config fields; streaming encoder for multimodal with image-placeholder validation. |
| **Configuration & Validation**<br>`src/axolotl/utils/schemas/datasets.py`, `src/axolotl/utils/schemas/validation.py` | Extends `PretrainingDataset` schema with multimodal fields (`multimodal`, `image_column`, `image_base_dir`, `image_token`); adds validator enforcing single pretraining dataset, `processor_type` requirement, rejecting `sample_packing`/`chat_template`, and disabling `remove_unused_columns`. |
| **Test Infrastructure & Fixtures**<br>`tests/conftest.py`, `tests/prompt_strategies/test_multimodal_pretrain.py`, `tests/test_multimodal_streaming.py`, `tests/utils/schemas/validation/test_multimodal_cpt.py` | Session-scoped SmolVLM fixture for processor/tokenizer tests; comprehensive suite for tokenization strategy (autodetection, override validation, processor compatibility, placeholder/image count matching); end-to-end streaming tests (output structure, label semantics, path traversal/URL/NUL-byte protections); configuration validation tests for gates and field preservation. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.62% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: multimodal CPT (raw image+text continued pre-training)' accurately and concisely describes the main feature being added: a new multimodal continued pre-training path. It is specific, clear, and reflects the primary change across the entire changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
src/axolotl/utils/data/streaming.py (1)

229-230: Optional: Use unpacking syntax for list concatenation.

Ruff suggests using [*list(...), item] instead of list(...) + [item] for cleaner unpacking. This is a minor style preference.

✨ Proposed change
-        ids = list(enc["input_ids"]) + [tokenizer.eos_token_id]
-        mask = list(enc["attention_mask"]) + [1]
+        ids = [*list(enc["input_ids"]), tokenizer.eos_token_id]
+        mask = [*list(enc["attention_mask"]), 1]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/axolotl/utils/data/streaming.py` around lines 229 - 230, Replace the list
concatenation using + with unpacking for clarity: construct ids as
[*enc["input_ids"], tokenizer.eos_token_id] and mask as [*enc["attention_mask"],
1] instead of list(enc["input_ids"]) + [tokenizer.eos_token_id] and
list(enc["attention_mask"]) + [1]; update the assignments for the variables ids
and mask accordingly to use the unpacking syntax while keeping the same
semantics.
src/axolotl/utils/schemas/validation.py (1)

1378-1381: Consider adding an INFO log when remove_unused_columns is auto-set.

Unlike similar mutations in check_eval_packing and check_mm_prepare, this one silently forces remove_unused_columns=False. Adding a log message would help users understand why this was set.

📝 Proposed change
         # Force-disable column stripping so the `images` and `_mm_text`
         # columns survive through to the collator.
         if data.get("remove_unused_columns") is not False:
+            LOG.info(
+                "setting `remove_unused_columns: false` for multimodal CPT "
+                "(required for images and _mm_text columns)"
+            )
             data["remove_unused_columns"] = False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/axolotl/utils/schemas/validation.py` around lines 1378 - 1381, When
auto-setting data["remove_unused_columns"] to False (the block that checks if
data.get("remove_unused_columns") is not False and then sets it to False to
preserve the images and _mm_text columns), add an INFO log entry so users see
why the mutation happened; use the module's logger (or the existing logger
variable) to log a concise message like "Auto-set remove_unused_columns=False to
preserve 'images' and '_mm_text' columns" and include the previous value from
data.get("remove_unused_columns") for context to match the logging behavior used
in check_eval_packing and check_mm_prepare.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/axolotl/prompt_strategies/multimodal_pretrain.py`:
- Around line 253-259: The code in multimodal_pretrain.py currently coerces
falsy images into an empty list (images = prompt.get(self.image_column) or []),
which lets values like ""/0/False bypass the isinstance check; change this to
explicitly check for None (use prompt.get(self.image_column) with no fallback,
then if images is None set images = [] to allow text-only samples, but if images
is not None validate it's a list/tuple) so non-list/tuple falsy values are
rejected; update the validation around images (referencing self.image_column and
variable images) and add a regression test that a row with images == "" raises
the ValueError.
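
The distinction the reviewer is drawing can be sketched as follows. `normalize_images` is a hypothetical helper, not the strategy's actual method: it treats only `None` as "no images" and rejects every other non-list value, including falsy ones.

```python
def normalize_images(row, image_column="images"):
    """Return the row's image list; only a missing/None cell means text-only."""
    images = row.get(image_column)
    if images is None:
        return []  # text-only sample
    if not isinstance(images, (list, tuple)):
        # "", 0 and False land here instead of being coerced to [].
        raise ValueError(
            f"{image_column!r} must be a list or tuple, got {type(images).__name__}"
        )
    return list(images)
```

With the original `or []` fallback, a row whose image cell is `""` would have been treated as text-only rather than flagged as malformed.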

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9ca02cb1-8ad2-4c67-8b7e-15cfd0712eb6

📥 Commits

Reviewing files that changed from the base of the PR and between 798c8fb and 494fe3b.

📒 Files selected for processing (12)
  • docs/multimodal.qmd
  • src/axolotl/core/builders/causal.py
  • src/axolotl/prompt_strategies/multimodal_pretrain.py
  • src/axolotl/utils/collators/mm_pretrain.py
  • src/axolotl/utils/data/sft.py
  • src/axolotl/utils/data/streaming.py
  • src/axolotl/utils/schemas/datasets.py
  • src/axolotl/utils/schemas/validation.py
  • tests/conftest.py
  • tests/prompt_strategies/test_multimodal_pretrain.py
  • tests/test_multimodal_streaming.py
  • tests/utils/schemas/validation/test_multimodal_cpt.py

Comment thread src/axolotl/prompt_strategies/multimodal_pretrain.py
@github-actions

github-actions Bot commented Apr 24, 2026


📖 Documentation Preview:

Deployed on Netlify from commit e0f7923

thad0ctor and others added 12 commits April 24, 2026 11:21
- builders/causal.py: add inline NOTE that multi-dataset configs reuse
  the first dataset's masking knobs (roles_to_train / train_on_eos) for
  all datasets — heterogeneous per-dataset overrides are not supported
  in the MM path today.
- processing_strategies.py: annotate inner scanner helpers
  _match_prefix and _find_end with explicit types (Tensor, int,
  list[int] → bool / tuple[int, bool]) for readability.
- docs/multimodal_assistant_mask.md: renumber the "Commits on this
  branch" list to 1-7 consecutive (previously skipped 3).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. Schema rejected `train_on_eos: "none"` despite the scanner honoring it.
   `_VALID_TRAIN_ON_EOS` accepts "none" and the design doc lists it, but
   `SFTDataset.train_on_eos` was `Literal["all", "turn", "last"]`, so YAML
   users hit a pydantic ValidationError at config load. Added "none" to
   the Literal and updated the description.

2. `cfg.role_boundaries: []` had split-personality semantics: the strategy
   ctor treated it as "replace built-ins with empty" while the collator
   plumbing treated it as "unset", and both the design doc and the
   MultiModalConfig schema help text promised wholesale replacement for
   any set value. Aligned on opt-in semantics across all four surfaces —
   a non-empty list replaces built-ins wholesale; unset or `[]` falls back
   to built-ins. Rationale: honoring `[]` literally yields all-masked
   labels and zero gradient, which is almost always a typo or leftover
   rather than a deliberate user action. Users who want to disable role
   masking should unset the field or use `train_on_inputs: true`.

   Also sharpened the fallback one-shot warning for strategies without
   built-in boundaries: names the consequence ("only pad and media tokens
   are masked, every other token contributes to loss") and points users
   at `cfg.role_boundaries` + docs/multimodal_assistant_mask.md instead
   of "see axolotl/processing_strategies.py for how to declare
   boundaries."
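
The opt-in semantics described in item 2 reduce to a truthiness check. The helper below is a sketch with a hypothetical name (`resolve_role_boundaries`) and placeholder boundary values, not the strategy constructor itself.

```python
def resolve_role_boundaries(override, builtins):
    """Non-empty override replaces built-ins wholesale; None or [] falls back."""
    # Truthiness, not `is not None`: an empty list is treated as "unset".
    return list(override) if override else list(builtins)


BUILTINS = ["user-boundary", "assistant-boundary"]  # placeholder values

assert resolve_role_boundaries(None, BUILTINS) == BUILTINS
assert resolve_role_boundaries([], BUILTINS) == BUILTINS
assert resolve_role_boundaries(["custom"], BUILTINS) == ["custom"]
```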

Files:
- src/axolotl/utils/schemas/datasets.py: Literal adds "none"
- src/axolotl/processing_strategies.py: ctor truthiness check on
  role_boundaries_override; sharpened fallback warning
- src/axolotl/utils/schemas/multimodal.py: role_boundaries description
  now calls out opt-in + empty-list fallback semantics
- docs/multimodal_assistant_mask.md: same clarification in the Semantics
  block; updated the fallback-path detection paragraph to quote the new
  warning text
- tests/test_processing_strategies.py: +2 regressions
  (test_sft_dataset_schema_accepts_all_supported_train_on_eos_values,
  test_empty_role_boundaries_override_falls_back_to_builtin); 63/63 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Pre-commit failure: trailing newline missing on
docs/multimodal_assistant_mask.md (end-of-file-fixer hook).

Six CodeRabbit findings addressed:

1. Scanner: non-trainable role's end marker ignored ``include_end``.
   Under ``train_on_eos="all"``, the shared ``[/INST]`` token (user-end
   with ``include_end=False``, intentionally re-matched as assistant-start)
   leaked into loss via the user branch on Pixtral / Mistral V7 Tekken.
   Fix: gate the non-trainable branch on ``best_match.include_end`` to
   mirror the trainable branch.

2. Gemma3 ``boi_token`` lookup used ``tokenizer.special_tokens_map.get("boi_token")``,
   which never fires on real checkpoints (``special_tokens_map`` only
   holds HF's standard slots — bos/eos/pad/unk/...). Swap to direct
   attribute read ``getattr(tokenizer, "boi_token", None)``, matching
   what ``transformers.models.gemma3.processing_gemma3`` itself does.
   Updated the ``_gemma_tokenizer`` test fixture to mirror real-model
   shape so the test exercises the production code path.

3. GLM dispatcher only registered ``Glm46VProcessor`` (GLM-4.6V /
   GLM-4.7V). Real ``Glm4vProcessor`` (GLM-4V / GLM-4.1V) users fell
   through to the base fallback. Both processors ship identical
   media-token markers, so register both under the shared
   ``Glm4vProcessingStrategy`` with independent try/except import blocks.
   Updated class docstring. +2 dispatcher regressions.

4. Gemma3 ``process_labels`` hardcoded 262144 for the soft image token.
   Resolve dynamically via ``tokenizer.convert_tokens_to_ids("<image_soft_token>")``
   with unk-id guard; fall back to 262144 only if the string isn't in
   vocab. Mirrors ``Gemma4ProcessingStrategy.process_labels`` pattern.

5. ``build_collator`` was called twice per ``build()`` (eval + train
   passes), producing two identical ``MM collator: ...`` INFO banners on
   startup. Gate the log on ``is_eval=False`` so only the training pass
   emits it.

6. Removed unused ``_mistral_common_stub`` pytest fixture (13 refs → 0,
   always returned ``None``; the dispatcher already handles missing
   ``mistral_common`` via lazy import + ``try/except``). Added
   ``test_scanner_train_on_eos_all_with_non_trainable_include_end_false``
   — a focused scanner-level lock-in for finding #1, independent of any
   specific VLM strategy.
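
Finding 4's "resolve dynamically, guard against unk, then fall back" pattern might look like this. It is a sketch: `resolve_soft_image_id` and `_StubTokenizer` are hypothetical, and the stub only imitates the `convert_tokens_to_ids` / `unk_token_id` surface of a Hugging Face tokenizer.

```python
_FALLBACK_SOFT_IMAGE_ID = 262144  # the previously hardcoded value


def resolve_soft_image_id(tokenizer, token="<image_soft_token>"):
    """Look the token up in the vocab; fall back only if it is unknown."""
    token_id = tokenizer.convert_tokens_to_ids(token)
    if token_id is None or token_id == tokenizer.unk_token_id:
        return _FALLBACK_SOFT_IMAGE_ID
    return token_id


class _StubTokenizer:
    """Hypothetical stand-in: unknown tokens map to unk_token_id."""

    unk_token_id = 3

    def __init__(self, vocab):
        self._vocab = vocab

    def convert_tokens_to_ids(self, token):
        return self._vocab.get(token, self.unk_token_id)


known = _StubTokenizer({"<image_soft_token>": 1234})
unknown = _StubTokenizer({})
```

The unk guard matters because a tokenizer that does not know the string typically returns its unknown-token id rather than raising, which would otherwise be masked by mistake.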

Test count: 63 → 68 passing. Local ``pre-commit run --all-files`` green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
feat: systemic multimodal assistant-only loss masking + cfg.role_boundaries
…trings

- Scanner perf: convert labels[i] to a Python list once per row so
  _match_prefix / _find_end operate on list slices instead of
  re-materializing Tensor slices via .tolist() on every probe. Cuts
  O(n*boundaries) CPython↔C boundary crossings per batch.
- Markdown lint (MD001, MD040): promote two h3 section headings to h2
  under the h1; add `text` language to the verify-at-runtime fenced block.
- Shorten verbose comments/docstrings added in recent commits to
  bare-minimum "why" notes matching the repo's existing style.

68/68 tests, 8/8 pre-commit hooks still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bring in d76d66e chore(mm-mask): hoist .tolist() out of scanner;
shorten comments/docstrings — the single commit landed after PR #7
was merged.
Multimodal CPT eval was a known gap in the v1 commit — `test_datasets`
entries were handed to the SFT loader, which doesn't register
`multimodal_pretrain`, and with `skip_prepare_dataset: true` (auto-set
for multimodal configs) it returned raw rows to the trainer. The model
forward then hit "ValueError: You must specify exactly one of input_ids
or inputs_embeds" on the first eval step because no tokenization had
run.

The v1 commit papered over this by guarding the MM CPT collator with
`not is_eval`, which delayed the crash to a more cryptic location in
the torch_call mismatch rather than fixing the root cause.

This commit wires the eval path:

1. `utils/data/sft.py`: in `_prepare_streaming_dataset`, detect
   `type: multimodal_pretrain` (or `multimodal: true`) on
   `test_datasets[0]` and route through `_load_streaming_dataset` —
   the same iterable path used for the training pretraining_dataset —
   so eval rows carry `input_ids`/`labels`/`attention_mask`/`images`/
   `_mm_text`, exactly what MultiModalPretrainDataCollator expects.
   Non-MM test_datasets still go through `_load_and_prepare_datasets`.
   Factored the DictDefault-building out into
   `_pretraining_config_from_entry` so train and eval produce
   identically-shaped configs.

2. `core/builders/causal.py`: drop the `not is_eval` guards in
   `build_collator` (both pretraining and non-pretraining branches)
   now that eval rows carry the required columns. Updated the stale
   comment that called out the limitation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- wrap_streaming_dataset now accepts the resolved pretraining_config and
  prefers it over cfg.pretraining_dataset[0], so test_datasets eval no
  longer silently inherits the training entry's columns/image_token.
- encode_streaming_multimodal and MultimodalPretrainTokenizationStrategy
  tokenize without truncation, count placeholders against full ids, and
  raise when a row exceeds sequence_len. Removes a silent corruption
  path where truncation chopped placeholders or oversize batches were
  only flagged by a post-hoc warning.
- Trim docstrings/comments across the MM-CPT codepath.
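
To see why truncation is unsafe here, consider a two-image row cut to four tokens. The sketch below uses a hypothetical `require_fits` helper and made-up token ids; it shows the raise-instead-of-truncate behaviour, not the actual encoder.

```python
def require_fits(input_ids, sequence_len):
    """Return input_ids unchanged, or raise instead of truncating."""
    if len(input_ids) > sequence_len:
        raise ValueError(
            f"row has {len(input_ids)} tokens but sequence_len is {sequence_len}; "
            "refusing to truncate because image placeholders could be dropped"
        )
    return input_ids


PLACEHOLDER = 5  # hypothetical image placeholder id
ids = [PLACEHOLDER, 9, 9, 9, PLACEHOLDER, 9]  # a two-image row
truncated = ids[:4]  # what silent truncation to 4 tokens would keep
# The truncated row has one placeholder left for two images.
```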

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the gaps that made multimodal eval mostly unreachable through
validated configs and that crashed legitimate batch shapes:

- Add MultiModalEvalDataset and place it first in the test_datasets union
  so MM-marked eval entries preserve text_column / image_column /
  image_base_dir / image_token through validate_config (was silently
  coerced to SFTDataset).
- Iterate every MM test_datasets entry in _prepare_streaming_dataset and
  concatenate the streams; reject mixed MM/non-MM eval lists loudly.
- Validate that all MM eval entries share image_base_dir and image_token
  (or have them unset), since the collator resolves both once.
- Thread is_eval into _build_mm_pretrain_collator so eval images resolve
  against test_datasets[0] instead of pretraining_dataset[0].
- Make _create_placeholder_dataset emit the configured image column as
  [] for MM CPT so dispatch_batches=true workers stop KeyError-ing.
- Add a tokenizer-only fallback in MultiModalPretrainDataCollator for
  all-text batches; mixed-row batches continue through the processor.
- Reject falsy-but-non-None image cells (e.g. "") in
  MultimodalPretrainTokenizationStrategy instead of coercing to [].
- Log an INFO record when remove_unused_columns is auto-set to false.
- Document the eval contract (per-entry text/image columns; shared
  image_base_dir/image_token) in docs/multimodal.qmd.

Tests: 21 new regression tests across the four touched suites covering
schema preservation, multi-entry eval merge, eval collator config
source, placeholder shape, all-text + mixed batches, falsy images
rejection, eval homogeneity validation, and the auto-set log record.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thad0ctor

Owner Author

Closing to re-open with a squashed branch.

@thad0ctor thad0ctor closed this Apr 25, 2026
thad0ctor added a commit that referenced this pull request May 12, 2026
Five test-quality refinements from CodeRabbit's third-round review.

**R3-#2 — deterministic teardown in test_dora.**

Wrap the DoRA smoke's wrap → train → assert sequence in
``try/finally`` so ``wrapped.close()`` runs even when the
loss-descent assertion fails mid-test. Without this, an early
assertion failure leaves hooks, pinned-host borrows, and CPU
adapter threads alive into subsequent GPU tests on the same
pytest session.
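
A minimal sketch of the pattern, assuming a stand-in object with a `close()` method. `FakeWrapped` and `run_smoke` are illustrative names, not the real test or wrapper:

```python
class FakeWrapped:
    """Stand-in for the wrapped model; only tracks whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def train_step(self) -> float:
        return 1.0  # pretend loss value

    def close(self) -> None:
        self.closed = True


def run_smoke(wrapped: FakeWrapped, check) -> None:
    """Run train -> assert, releasing resources even if the assertion fails."""
    try:
        loss = wrapped.train_step()
        check(loss)  # may raise AssertionError
    finally:
        wrapped.close()  # always runs, on success and on failure
```

The assertion error still propagates to pytest; only the cleanup is made unconditional.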

**R3-#3 — distinguish hook edges in test_lora_offload_mode
recording stub.**

The pre-fix ``_RecordingScheduler.ensure_chunks_resident``
recorded every container callback under the same
``"ensure_chunks_resident"`` label. The per-hook tests
(pre_forward / post_forward / post_backward fires
``ensure_chunks_resident``) then asserted only call COUNT — so a
regression that deleted the pre-forward hook factory while
post-forward still fired would still pass the count gates.

Tag each call with its originating hook edge via frame
inspection on the caller's ``co_qualname`` (Python 3.11+
guarantees the qualname captures the enclosing
``_make_lora_container_<edge>_hook`` factory). The four LoRA
container hooks all funnel through the same
``ensure_chunks_resident`` entry point but their closures live
in distinct factory functions, so the qualname uniquely
identifies the edge.

Update each per-hook test to filter on the edge-tagged label so
a regression in any single edge fails the corresponding test:

* pre_forward test: asserts ``ensure_chunks_resident:pre_forward``
  fires ≥ n_blocks times.
* post_forward test: asserts BOTH ``:pre_forward`` AND
  ``:post_forward`` fire ≥ n_containers times each (the previous
  bare ≥ 2*n_containers count was satisfied by either edge alone).
* post_backward test: asserts all four edges (pre/post fwd, pre/
  post bwd) fire ≥ n_containers times each.

The production hook factory layout is unchanged — the stub
recovers the edge from the existing closure's frame, no new
arguments thread through ``install_hooks``.

**R3-#4 — narrow protrain_model_wrapper exception scope in
test_lora_offload_mode:1117.**

The broad ``except (ValueError, RuntimeError)`` treated any
wrapper failure as "offload setup unavailable" and skipped the test. A
broken ``protrain_model_wrapper`` runtime path could therefore leave this
smoke test green. Restrict the suppression to known env-failure
substrings (DeepSpeedCPUAdam JIT, CUDA version mismatch, bnb
load, ``No module named``, and capacity/searcher gates) — same
canonical tuple D8 used at the optimizer-step site below — and
re-raise anything else. Real wrapper regressions now surface.
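
A rough sketch of the narrowed handler, under stated assumptions: the substring tuple here is illustrative only (the canonical tuple lives in the test module and is not reproduced), and `skip` is an injected callable standing in for `pytest.skip`.

```python
# Illustrative subset; the real tuple is defined in the test module.
ENV_FAILURE_SUBSTRINGS = (
    "No module named",
    "CUDA version mismatch",
)


def is_env_failure(exc: BaseException) -> bool:
    """True when the message matches a known environment-failure substring."""
    message = str(exc)
    return any(token in message for token in ENV_FAILURE_SUBSTRINGS)


def wrap_or_skip(build_wrapper, skip):
    """Call build_wrapper(); skip on known env failures, re-raise the rest."""
    try:
        return build_wrapper()
    except (ValueError, RuntimeError) as exc:
        if is_env_failure(exc):
            skip(f"offload setup unavailable: {exc}")
            return None
        raise  # a genuine wrapper regression must fail the test
```

Matching on message substrings is brittle if upstream wording changes, but an unrecognised message then produces a failure rather than a silent skip.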

**R3-#5 — fail-safe CUDA teardown in
test_param_data_shape_preservation.**

Eight test functions in this module construct ``mgr / layout /
pool / host`` via ``_build_chunk_manager`` and tear them down at
the happy-path tail (``mgr.uninstall()`` / ``host.close()`` /
``del pool``). Any earlier assertion failure skipped the
teardown, leaking pinned-host borrows + CUDA buffer-pool state
into subsequent GPU tests.

Add a top-level ``_teardown_chunk_manager(mgr, host, pool)``
helper that does the best-effort 3-call teardown (each call
wrapped in its own try/except so a failure in ``uninstall``
doesn't block the ``host.close``), and wrap each test body in
``try: ... finally: _teardown_chunk_manager(...)``. Done
programmatically across all 8 tests via a one-shot Python
rewrite to keep the diff mechanical and the new structure
consistent.

**R3-#8 — replace hard-coded n_chunk_estimate=1 in
test_trace_skip_on_override.**

The trace-skip e2e test hard-coded ``n_chunk_estimate = 1`` based
on the assumption that the tiny GPT-2 fixture produces a single
chunk. If the layout heuristics (``pick_S_chunk`` default,
block-discovery rules) shift such that ``N_chunk > 1``,
``min_n_buffer_for(layout, n_persist=1)`` rejects
``n_buffer_override=0`` BEFORE the wrapper reaches the
trace-skip gate the test is supposed to validate — converting
this into a flaky non-target failure.

Compute ``n_chunk_estimate`` dynamically by running the same
``discover_blocks`` → ``flatten_block_trees`` → ``build_layout``
pipeline the wrapper itself uses (with the wrapper's default
S_chunk), and pass the resulting ``layout.N_chunk`` through.
``n_persist_override = n_chunk_estimate`` then keeps the
all-persistent invariant the test relies on regardless of any
future layout-heuristic shift.

``tests/protrain/`` default-marker sweep: 303 passed / 4 skipped
/ 0 failed. GPU-marker sweep on touched files: 40 passed /
2 skipped (single-process Mode-C downgrade for shape-preserving
placeholder paths) / 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thad0ctor added a commit that referenced this pull request May 12, 2026
…est fixes

Seven Minor items from the CodeRabbit full-diff re-scan on
commit ``55377e5d``.

**F-#2 — Clarify Mode-A guidance in ``protrain_optimizer_wrapper``
8-bit warning (``api/optim_wrapper.py:802-815``).**

The warning told users to set ``protrain_force_all_persistent: true``
to get end-to-end 8-bit AdamW on CPU-resident chunks, but didn't
mention that ``protrain_force_all_persistent`` is ignored while
``protrain_auto_mode`` is on (the auto-mode selector picks the mode
itself based on capacity). Expanded the warning to instruct users
to set ``protrain_auto_mode: false`` AND
``protrain_force_all_persistent: true`` together.

**F-#4 — Unify fragmentation-alpha docs in DESIGN.md.**

Module summaries at lines 49 (``cost/memory.py``) and 118
(``memory.py`` module spec) still described a fixed ``alpha=1.10``
while Design Decision 1 documents the per-dtype lookup
(``ALPHA_FRAGMENTATION_4BIT = 0.75`` for bnb-4-bit). Aligned both
summaries to reference the per-dtype helper
(``alpha_fragmentation_for_dtype``) and the design decision section.

**F-#5 — Resolve ``use_reentrant`` contradiction in DESIGN.md.**

Line 109 (``block/checkpoint.py`` module spec) said
``use_reentrant=False``, which matches the actual implementation
(verified via ``grep`` against ``block/checkpoint.py:99``). Line 290
(audit Block G analysis) claimed ``use_reentrant=True, the
production wrap`` — stale and incorrect. Updated the analysis text
to acknowledge ``use_reentrant=False`` is the production wrap and
re-stated the per-block-input residual mechanism in a form
compatible with the non-reentrant variant (each CKPT block's
saved-tensors-hooks recompute frame holds the block input, which
is what produces the linear-in-N_block activation footprint the
audit data exposes).

**F-#8 — Centralized CUDA-availability guard in
``tests/protrain/test_adamw8bit_adapter.py::_gpu_device``.**

The helper unconditionally returned ``torch.device("cuda:0")``,
so a custom marker filter or conftest override that lands the
module in a CPU-only context would surface as a torch error
before any test body. Added a
``pytest.skip("CUDA not available; ...")`` early-return so every
gpu-marked test in the module gets a clean skip.

**F-#9 — Replace silent ``try/except: pass`` with
``contextlib.suppress(Exception)`` in
``tests/protrain/test_lora_offload_mode.py``.**

Five sites — lines 742-746, 839-843, 906-910, 981-985, 1040-1044
— each had the same ``for h in handles: try: h.remove() except
Exception: pass`` pattern that Ruff S110 flags. Replaced with
``contextlib.suppress(Exception)`` over the loop. Semantics
unchanged (best-effort cleanup, tolerate already-removed handles
or torch shutting down mid-test); intent now documented by the
context manager.
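
For illustration, a minimal version of the cleanup with a stand-in handle class (`FakeHandle` and `remove_all` are hypothetical names). The sketch places ``contextlib.suppress`` inside the loop body: a single ``with`` around the whole loop would exit at the first exception and leave later handles unremoved, so the per-handle form is the one that matches a per-iteration ``try/except``.

```python
import contextlib


class FakeHandle:
    """Stand-in for a hook handle; optionally fails on remove()."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.removed = False

    def remove(self) -> None:
        if self.fail:
            raise RuntimeError("handle already removed")
        self.removed = True


def remove_all(handles) -> None:
    """Best-effort cleanup: one failing handle does not stop the rest."""
    for handle in handles:
        with contextlib.suppress(Exception):
            handle.remove()
```
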

**F-#10 — ASCII ``x`` in ``test_lora_offload_mode.py:1062`` docstring.**

Missed in the R5 unicode sweep — ``4×3090`` ⇒ ``4x3090``.

**F-#11 — ``try/finally`` for ``wrapped.close()`` in 3 sites of
``test_trace_skip_on_override.py``.**

``test_run_trace_skipped_on_override_full_path`` (L255-282),
``test_run_trace_invoked_without_override`` (L319-337), and
``test_partial_overrides_do_not_skip_trace`` (L381-400) each
called ``wrapped.close()`` only on the success path — assertion
failures earlier in the test body would skip the close and leak
CUDA + chunk resources into subsequent GPU tests. Wrapped each
test body in ``try/finally`` so ``wrapped.close()`` always
runs. Done programmatically via a one-shot Python rewrite
(8 lines of new indent + 2 lines of try/finally per site) to
keep the diff mechanical.

### Test gates

- ``pre-commit run --all-files`` ALL green (ruff / ruff-format /
  mypy / bandit / yaml / eol / whitespace).
- ``tests/protrain/`` default-marker: 313 passed / 4 skipped /
  162 deselected / 0 failed.
- GPU sanity on F-touched files (GPU 5): 43 passed / 2 skipped /
  0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thad0ctor added a commit that referenced this pull request May 23, 2026
Fixes pre-commit failures on CI after the ARCH #8/#9/#10 commits:
ruff-format auto-format on 8 files (line-wrap of comprehensions and
MagicMock(spec=...) calls; alphabetize one multi-import block;
strip a trailing blank line in a test header) and add the missing
`Any` symbol that `cast("Any", ...)` in test_modec_persistent_partition.py
referenced without import.
thad0ctor added a commit that referenced this pull request May 28, 2026
thad0ctor added a commit that referenced this pull request May 28, 2026
thad0ctor added a commit that referenced this pull request May 28, 2026