Fix EagleMistralLarge3Model initialization#37232
Conversation
Signed-off-by: juliendenize <julien.denize@mistral.ai>
There was a problem hiding this comment.
Code Review
This pull request correctly fixes an initialization error in EagleMistralLarge3Model by adding the missing aux_hidden_state_layers attribute. The error arose because the model's __init__ method does not call super().__init__(), making it brittle to changes in its base class, DeepseekV2Model. My review includes a suggestion to add a TODO comment to track this structural issue and encourage a future refactoring to improve maintainability.
| prefix=maybe_prefix(prefix, "fc"), | ||
| ) | ||
| self.norm = RMSNorm(config.hidden_size, eps=config.rms_norm_eps) | ||
| self.aux_hidden_state_layers: tuple[int, ...] = () |
There was a problem hiding this comment.
While this change correctly initializes aux_hidden_state_layers, it highlights a broader maintainability issue. The __init__ method of EagleMistralLarge3Model does not call super().__init__() from its base class DeepseekV2Model, but instead re-implements much of its logic. This is why aux_hidden_state_layers had to be added here manually after it was added to the base class.
To prevent similar issues in the future and to track this technical debt, I suggest adding a TODO comment. This will make it clear that a refactoring is needed to make the class more robust against changes in its parent.
# TODO: Refactor to call super().__init__ from DeepseekV2Model
# to avoid missing future attributes.
self.aux_hidden_state_layers: tuple[int, ...] = ()Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Purpose
This PR fixes initialization of
EagleMistralLarge3Modeldue to #36361 that addedaux_hidden_state_layersinit requirement.Test Plan
ran an inference
Test Result
on main it raises error, now it works :)
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.