Add sequence classification capability to Granite models #44215
jmriosal wants to merge 8 commits into huggingface:main
Conversation
Created ForSequenceClassification classes for Granite, GraniteMoe, GraniteMoeHybrid, GraniteMoeShared using the existing GenericForSequenceClassification mixin pattern.
- Implementation in modular_*.py
- Updated __all__ exports in each model module
- Registered all new classes in auto/modeling_auto.py
working on adding new tests for the new |
…ng_*.py files, following the same pattern as other models in the library
The CI check check_repository_consistency is failing due to a pre-existing mismatch in PR #44176 between the modular and modeling files for GraniteMoeHybrid. As a consequence, the following check fails:
python utils/check_modular_conversion.py --files src/transformers/models/granitemoehybrid/modular_granitemoehybrid.py
Any advice?
Added prepare_config_and_inputs_for_sequence_classification() method to provide the correct input format for create_and_check_for_sequence_classification
…teMoeHybridModelTester available from BambaModelTester.
…ModelTester (inherited by GraniteMoeHybridModelTester)
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, granite, granitemoe, granitemoehybrid, granitemoeshared
ArthurZucker left a comment
LGTM but you can also use GenericForSequenceClassification directly!
Kind of up to you, but it's generic, so it's compatible!
Don't worry, we'll fix this one on main!
What does this PR do?
Add sequence classification capabilities to the family of Granite models (Granite, GraniteMoe, GraniteMoeHybrid, and GraniteMoeShared).
Fixes #44214, #35720
Why
The Granite models currently only have the base model and causal model heads, so this addition brings them more in line with other models in the library.
Proposed solution and description of changes
The following ForSequenceClassification classes were added:
- GraniteForSequenceClassification
- GraniteMoeForSequenceClassification
- GraniteMoeHybridForSequenceClassification
- GraniteMoeSharedForSequenceClassification

They use the existing GenericForSequenceClassification mixin, following the established pattern seen in many other models in the library. Code changes were minimal and keep the logic consistent across similar models.
- Changes were implemented in modular_*.py, and the modeling_*.py files were then generated automatically using utils/modular_model_converter.py.
- Updated __all__ exports with the new classes in each model module.
- The Auto Model Registry (MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES) has been updated so the new classes can be auto-loaded via AutoModelForSequenceClassification.

New features usage
After this PR, users should be able to load any Granite model variant for sequence classification as follows:
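The original snippet did not survive on this page, so what follows is only a minimal sketch of the kind of usage described above, not the author's exact example. It assumes a transformers build that includes this PR and an installed torch. The checkpoint id is an assumed placeholder (substitute any Granite checkpoint), and the helper names load_granite_classifier and classify are hypothetical.

```python
# Assumed example checkpoint id; replace it with any Granite checkpoint you use.
CHECKPOINT = "ibm-granite/granite-3.3-2b-instruct"


def load_granite_classifier(checkpoint=CHECKPOINT, num_labels=2):
    """Load a tokenizer and a sequence-classification model (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    # Batched inputs need a pad token; fall back to EOS if the tokenizer has none.
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    if model.config.pad_token_id is None:
        model.config.pad_token_id = tokenizer.pad_token_id
    return tokenizer, model


def classify(texts, tokenizer, model):
    """Return the index of the highest-scoring label for each input text."""
    import torch

    inputs = tokenizer(texts, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch_size, num_labels)
    return logits.argmax(dim=-1).tolist()
```

Calling `tokenizer, model = load_granite_classifier()` followed by `classify(["some text"], tokenizer, model)` would run a forward pass. When the checkpoint was not trained for classification, the classification head is newly initialized, so the predictions are not meaningful until the model is fine-tuned.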
Who can review?
Anyone in the community is free to review the PR once the tests have passed.
@ArthurZucker @Cyrilvallez