Feature/add smolvlm2 #203

AshAnand34 · 2025-05-10T19:46:02Z

Description

This pull request introduces support for the new SmolVLM2 model, a lightweight vision-language model (see #197). It includes updates to the documentation, CLI, core model implementation, and additional utilities for training, inference, and object detection. Below is a summary of the most important changes grouped by theme.

CLI Enhancements

Registered SmolVLM2 commands (info, predict, and train) in the CLI via maestro/cli/introspection.py and maestro/trainer/models/smolvlm2/entrypoint.py. These commands enable fine-tuning, inference, and model information retrieval directly from the command line.
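For readers unfamiliar with subcommand-style CLIs, the `info` / `predict` / `train` split described above can be sketched with the standard library's `argparse`. This is only an illustration, not the registration code in this PR: the program name `smolvlm2-demo`, every flag, and the `dispatch` helper are assumptions made up for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with three hypothetical subcommands: info, predict, train."""
    parser = argparse.ArgumentParser(prog="smolvlm2-demo")  # hypothetical name
    subparsers = parser.add_subparsers(dest="command", required=True)

    # `info` takes no options in this sketch.
    subparsers.add_parser("info", help="print model information")

    # `predict` runs inference on a single image (flags are assumptions).
    predict = subparsers.add_parser("predict", help="run inference on one image")
    predict.add_argument("--image", required=True)
    predict.add_argument("--prompt", default="Describe the image.")

    # `train` fine-tunes on a dataset (flags are assumptions).
    train = subparsers.add_parser("train", help="fine-tune on a dataset")
    train.add_argument("--dataset", required=True)
    train.add_argument("--epochs", type=int, default=1)
    train.add_argument(
        "--optimization-strategy",
        choices=["lora", "qlora", "freeze", "none"],
        default="lora",
    )
    return parser


def dispatch(argv: list[str]) -> str:
    """Parse argv and return a description of the action that would run."""
    args = build_parser().parse_args(argv)
    if args.command == "info":
        return "info"
    if args.command == "predict":
        return f"predict image={args.image} prompt={args.prompt!r}"
    return (
        f"train dataset={args.dataset} epochs={args.epochs} "
        f"strategy={args.optimization_strategy}"
    )
```

Calling `dispatch(["train", "--dataset", "data/", "--epochs", "3"])` selects the training branch with the default strategy; a real entrypoint would call into the trainer there instead of returning a string.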

Core Model Implementation

Added maestro/trainer/models/smolvlm2/core.py, which includes the SmolVLM2Core class for model initialization, input processing, text generation, and training. It supports optimization strategies like QLoRA, LoRA, and freezing the vision encoder.

Utility Functions

Introduced checkpoint utilities in maestro/trainer/models/smolvlm2/checkpoints.py for saving and loading model checkpoints, including metadata.
Added maestro/trainer/models/smolvlm2/detection.py for converting SmolVLM2 text outputs into object detection formats and vice versa, as well as formatting prompts for detection tasks.
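To make the text-to-detection round trip concrete, here is a minimal sketch. The output format handled by `detection.py` is not shown in this description, so the format below is an assumption: each object is written as `label x_min y_min x_max y_max` with coordinates normalized to [0, 1], and objects are separated by `;`. The `Box`, `text_to_boxes`, and `boxes_to_text` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A labeled bounding box in absolute pixel coordinates (xyxy)."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


def text_to_boxes(text: str, width: int, height: int) -> list[Box]:
    """Parse the ASSUMED 'label x1 y1 x2 y2; ...' format into pixel boxes."""
    boxes = []
    for chunk in text.split(";"):
        parts = chunk.split()
        if len(parts) < 5:
            continue  # empty or malformed entry: skip instead of failing
        *label_words, x1, y1, x2, y2 = parts
        try:
            nx1, ny1, nx2, ny2 = (float(v) for v in (x1, y1, x2, y2))
        except ValueError:
            continue  # non-numeric coordinates: skip this entry
        boxes.append(
            Box(
                label=" ".join(label_words),
                x_min=round(nx1 * width),
                y_min=round(ny1 * height),
                x_max=round(nx2 * width),
                y_max=round(ny2 * height),
            )
        )
    return boxes


def boxes_to_text(boxes: list[Box], width: int, height: int) -> str:
    """Serialize pixel boxes back into the same assumed text format."""
    return "; ".join(
        f"{b.label} {b.x_min / width:.2f} {b.y_min / height:.2f} "
        f"{b.x_max / width:.2f} {b.y_max / height:.2f}"
        for b in boxes
    )
```

Skipping malformed entries rather than raising is a design choice for this sketch: generated text is not guaranteed to be well-formed, so a tolerant parser keeps the valid detections from a partially garbled response.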

Inference and Entrypoint

Implemented SmolVLM2Inference and integrated it into the main entrypoint in maestro/trainer/models/smolvlm2/entrypoint.py, enabling flexible inference workflows via both CLI and Python.

List any dependencies that are required for this change.

"accelerate>=1.2.1",
"peft>=0.12",
"torch>=2.4.0",
"torchvision>=0.20.0",
"transformers>=4.49.0",
"bitsandbytes>=0.45.0"
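
A quick way to check a local environment against these minimums is sketched below, using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simplified (it keeps leading numeric components and ignores pre-release tags); `packaging.version` would be the more robust choice where that package is available.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "accelerate": "1.2.1",
    "peft": "0.12",
    "torch": "2.4.0",
    "torchvision": "0.20.0",
    "transformers": "4.49.0",
    "bitsandbytes": "0.45.0",
}


def version_tuple(version: str) -> tuple[int, ...]:
    """Keep the leading numeric components, e.g. '2.4.0+cu121' -> (2, 4, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after padding the shorter one with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    size = max(len(a), len(b))
    a += (0,) * (size - len(a))
    b += (0,) * (size - len(b))
    return a >= b


def check_environment() -> dict[str, str]:
    """Return 'ok', 'missing', or a 'too old' message for each dependency."""
    report = {}
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report
```

Running `check_environment()` before training gives a per-package report without importing the heavy libraries themselves.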

Type of change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)
This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

Testing in progress

Docs

Docs updated? What were the changes:

Added SmolVLM2-specific installation, training (CLI and Python), and inference instructions to docs/index.md.
Created a dedicated docs/models/smolvlm2.md file with an overview, installation steps, training options, inference examples, and object detection capabilities.

CLAassistant · 2025-05-10T19:46:08Z

All committers have signed the CLA.


bonninr · 2025-05-16T00:37:49Z

Voting +1 for this feature being reviewed.

SkalskiP · 2025-06-06T12:06:40Z

Hi @AshAnand34, thanks a lot for this PR! SmolVLM and SmolVLM2 have been on our radar for a while now. However, after a deeper review, we realized that your PR doesn’t fully align with the conventions we follow in the maestro repository. For this reason, I asked @AlexBodner to build on top of your code and make the necessary adjustments.

As a result, I’m closing this PR. You can find the updated version here: #207. It includes both your commits and those from @AlexBodner, so your contribution history remains intact.

AshAnand34 added 3 commits May 10, 2025 01:24

Created SmolVLM2 model in maestro

4394a90

Fixing lint errors and created trainer for training dataset in smolvlm2

0518c67

SmolVLM2 documented

727e01b

pre-commit-ci bot and others added 7 commits May 10, 2025 19:46

fix(pre_commit): 🎨 auto format pre-commit hooks

d074602

fixing errors in smolvlm2 interpretation

6c35bef

Merge branches 'feature/add-smolvlm2' and 'feature/add-smolvlm2' of https://github.com/AshAnand34/maestro into feature/add-smolvlm2

7b49423

fix(pre_commit): 🎨 auto format pre-commit hooks

8fe68f2

Fixing more errors with core.py

4ff3d63

fix(pre_commit): 🎨 auto format pre-commit hooks

0e1804e

Fixed Ruff error with too long line

3ea5544

SkalskiP closed this Jun 6, 2025
