
update model_config for granite 4 models#821

Merged
joerunde merged 2 commits intotorch-spyre:mainfrom
tjohnson31415:granite-4-dense
Mar 11, 2026

Conversation

Collaborator

@tjohnson31415 tjohnson31415 commented Mar 10, 2026

Description

Updates the model configuration for the Granite 4 dense models, including support for checkpoints that use the `granite` model type (instead of `granitemoehybrid`).
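For context, a minimal sketch of the kind of model-type-keyed override table such a change touches. The dict name and the override key are illustrative assumptions, not the torch-spyre plugin's actual API:

```python
# Hypothetical sketch of a model-type-keyed override table. The dict name
# and the override key are illustrative only, not this plugin's actual API.
MODEL_CONFIG_OVERRIDES = {
    # Granite 4 MoE-hybrid checkpoints
    "granitemoehybrid": {"attn_implementation": "sdpa"},
}

# Granite 4 dense checkpoints now report model type "granite", so they get
# their own entry carrying the same overrides as the hybrid variant.
MODEL_CONFIG_OVERRIDES["granite"] = dict(MODEL_CONFIG_OVERRIDES["granitemoehybrid"])


def overrides_for(model_type: str) -> dict:
    """Return the config overrides for a model type, or an empty dict."""
    return MODEL_CONFIG_OVERRIDES.get(model_type, {})
```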

Related Issues

Test Plan

Checklist

  • I have read the contributing guidelines
  • My code follows the project's code style (run bash format.sh)
  • I have added tests for my changes (if applicable)
  • I have updated the documentation (if applicable)
  • My commits include a Signed-off-by: line (DCO compliance)

Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
@github-actions

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: make sure that your code passes all the linting checks, otherwise your PR cannot be merged. To do so, run ./format.sh.
Now you are good to go 🚀.

We also recommend installing prek and configuring it to check your code before every local commit.

Signed-off-by: Joe Runde <joe@joerun.de>
@joerunde joerunde marked this pull request as ready for review March 10, 2026 22:36
```diff
 {
   "architectures": [
-    "GraniteMoeHybridForCausalLM"
+    "GraniteForCausalLM"
```
Collaborator


This is the diff between the old and new checkpoint configs

```diff
   "tie_word_embeddings": true,
   "torch_dtype": "bfloat16",
-  "transformers_version": "4.56.0",
+  "transformers_version": "4.53.3",
```
Collaborator


Would the different transformers versions cause any issues between the 2 variants?

Collaborator


I don't know the answer :/

It's probably not too relevant for us as we're not using transformers to load the model
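That point can be illustrated with a hypothetical helper (not a function from this repo): if the checkpoint's config.json is read directly rather than through transformers, the pinned transformers_version is informational only.

```python
import json
from pathlib import Path


def read_checkpoint_config(checkpoint_dir: str) -> dict:
    """Read a Hugging Face-style config.json directly, without transformers.

    Loading the config this way means the "transformers_version" recorded in
    the checkpoint is purely informational for the consumer. (Hypothetical
    helper for illustration; not a function from this repo.)
    """
    return json.loads((Path(checkpoint_dir) / "config.json").read_text())
```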

Comment on lines +66 to +67

```python
# This is really a dense model, but it has model type "granitemoehybrid"
# It has the same overrides as the regular dense variant
```
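As a hypothetical illustration of how a loader could guard against such mislabeled checkpoints. The field names (`num_local_experts` / `num_experts`) follow common Hugging Face MoE configs and are assumptions here, not fields confirmed for these checkpoints:

```python
def is_effectively_dense(config: dict) -> bool:
    """Heuristic check for a dense model hiding behind an MoE model type.

    Field names ("num_local_experts" / "num_experts") follow common Hugging
    Face MoE configs; they are assumptions for illustration, not fields
    confirmed by this PR's checkpoints.
    """
    experts = config.get("num_local_experts", config.get("num_experts", 1))
    return experts <= 1
```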
Collaborator


This makes me think that there might have been a mistake when creating the new checkpoint 🤔

Collaborator


yeah it's kinda unclear how things evolved here, probably part of the reason why these configs haven't quite yet landed on hf hub

Collaborator

@joerunde joerunde left a comment


Confirmed this loads new checkpoints correctly

@joerunde joerunde merged commit 906db8a into torch-spyre:main Mar 11, 2026
12 checks passed


3 participants