-
Notifications
You must be signed in to change notification settings - Fork 31.9k
Remove reference to subclasses in modernbert #40896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove reference to subclasses in modernbert #40896
Conversation
|
cc @Cyrilvallez in case the modular converter can do anything in these cases |
|
Humm, those patterns are indeed discouraged, but it probably means that you need to overwrite the method explicitly in your modular @lematt1991, as otherwise you will still end-up with unnecessary code (those same branches won't import dependencies, but will still exist) 🤔 Could you expand a bit more on your issue before we take a decision on this PR? @Rocketknight1 modular cannot know whether the user intents to add the dependencies or not unfortunately - it simply check if it finds an object that it does not know about and is not redefined in modular, and if it's the case, it imports it as a dependency. For |
@Cyrilvallez thanks for your response! I'm trying to create a new model that has class MyModelModernBertModel(ModernBertModel):
....
class MyModel(PretrainedModel):
def __init__(self, ...):
self.text_encoder = MyModelModernBertModel(...)After running modular_model_converter without this PR, it pulls in all of the other sub-classes (
Sorry, I'm not sure I follow, with this fix we no longer end up with the ( from ..modeling_modernbert import ModernBertModel
class MyModernBertModel(ModernBertModel): ...And then running |
Yes for sure, but then the And doing so would not require any changes to the current ModernBert |
|
Ah I see what you're saying.
They are useless, but they aren't harming anything. I think the problem with the proposed solution is that I need to copy As opposed to the one liner I mentioned here |
|
@Cyrilvallez, one more argument I'd like to make in favor of this change is that it preserves the abstraction of
Fundamentally, what I want to do is import |
|
We never want to add useless code in Transformers, even if harmless hahaha. It makes the code hard to read, and increases a lot the maintenance burden! We put a lot of efforts into trying to have as minimal code as possible. In your case, you can simply do in modular: class NewPreTrainedModel(ModernBertPreTrainedModel)
def _init_weights(self, module):
....
class NewModel(ModernBertModel):
passand it will work out. So you don't have to redefine anything more than what you would have with the change in this PR, only the |
|
Ah you're right that is much simpler, thank you! I do still think this is worth fixing since anyone trying to import |
|
[For maintainers] Suggested jobs to run (before merge) run-slow: modernbert, modernbert_decoder |
|
What I don't like so much about your original change is that we loose the explicitness, i.e. we don't know which classes are impacted by those branches anymore. To circumvent the modular issue, and keep the explicitness, we can use name checks tho, then I'd be in favor of this PR 🤗 |
|
@Cyrilvallez, what do you think about the latest changes? This pushes specific weight initialization into the sub-classes |
|
It completely changes the inheritance graph in a way that we don't do 🙂 |
What does this PR do?
ModernBertPretrainedModelcurrently references it's sub-classes when initializing weights. This breaks things when you try to create a new model that inherits from this class and useutils/modular_model_converter.py, since it will pull in the sub-classes as dependencies, for example:This removes any references to sub-classes inside of
ModernBertPreTrainedModelbreaking the circular reference.Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
CC @ArthurZucker and @tomaarsen who reviewed #35158