Add SafeTensors Conversion Support for Qwen Models #2471
Summary of Changes (Gemini Code Assist)
This pull request expands the model export capabilities by introducing support for converting KerasHub QwenBackbone models to the Hugging Face SafeTensors format. It includes configuration mapping, weight conversion logic, and registration within the existing export utility. It also extends similar export functionality to GPT-2 models, broadening interoperability between the KerasHub and Hugging Face ecosystems. The changes are validated through numerical equivalence tests, confirming the fidelity of the conversions.
Code Review
This pull request introduces support for converting KerasHub Qwen and GPT-2 models to the Hugging Face SafeTensors format. The implementation is well-structured, with dedicated modules for each model's conversion logic and corresponding tests. The main exporter has been thoughtfully updated to handle various tensor types, config objects, and tied weights. My feedback focuses on improving docstring coverage to align with the repository's style guide and correcting a minor typo in a test case.
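The review mentions that the exporter handles tied weights (for example, an embedding matrix shared with the LM head, which SafeTensors should store only once). The sketch below is purely illustrative, not the PR's actual code; `dedupe_tied_weights` and the dict-of-tensors representation are hypothetical stand-ins for how such deduplication by object identity can work:

```python
def dedupe_tied_weights(weights):
    """Keep one copy of each underlying tensor; record dropped names as
    aliases of the name stored first (a stand-in for tied-weight handling)."""
    stored = {}   # id(tensor) -> canonical name
    unique = {}   # canonical name -> tensor
    aliases = {}  # dropped name -> canonical name
    for name, tensor in weights.items():
        key = id(tensor)
        if key in stored:
            aliases[name] = stored[key]  # tied: skip the duplicate copy
        else:
            stored[key] = name
            unique[name] = tensor
    return unique, aliases

# Example: embedding matrix shared with the LM head (toy nested list
# standing in for a real tensor).
emb = [[0.1, 0.2], [0.3, 0.4]]
unique, aliases = dedupe_tied_weights(
    {"model.embed_tokens.weight": emb, "lm_head.weight": emb}
)
```

On load, the alias map lets a consumer re-tie the dropped name back to the stored tensor.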
    class QwenExportTest(TestCase):
        @parameterized.named_parameters(
            # Use a small preset for testing
            ("qwen2_0.5b_en", "qwen2.5_0.5b_en"),
There appears to be a typo in the preset name used for testing. The test case is named qwen2_0.5b_en, but the preset passed is qwen2.5_0.5b_en. qwen2.5 is not a standard model version. Please correct the preset name to match the intended model, which is likely qwen2_0.5b_en.
Suggested change:
    - ("qwen2_0.5b_en", "qwen2.5_0.5b_en"),
    + ("qwen2_0.5b_en", "qwen2_0.5b_en"),
    def get_gpt2_config(keras_model):
        """Convert Keras GPT-2 config to Hugging Face GPT2Config."""
The docstrings for get_gpt2_config, get_gpt2_weights_map, and get_gpt2_tokenizer_config are minimal. Per the repository style guide (lines 367-370), all public functions should have Google-style docstrings that document all parameters and return values. Please expand them to include Args: and Returns: sections for clarity and consistency. For example:
    def get_gpt2_config(keras_model):
        """Convert Keras GPT-2 config to Hugging Face GPT2Config.

        Args:
            keras_model: A KerasHub GPT-2 model instance.

        Returns:
            A `transformers.GPT2Config` object.
        """

References
- All public classes, methods, and functions should use Google-style docstrings, with the first line being a concise summary, and including comprehensive documentation for all parameters, return values, and exceptions. (link)
    def get_qwen_config(backbone):
        """Convert Keras Qwen config to Hugging Face Qwen2Config."""
The docstrings for get_qwen_config, get_qwen_weights_map, and get_qwen_tokenizer_config are minimal. The repository style guide (lines 367-370) requires comprehensive Google-style docstrings for all public functions, including Args: and Returns: sections. Please add these details to improve documentation and adhere to the contribution guidelines. For example:
    def get_qwen_config(backbone):
        """Convert Keras Qwen config to Hugging Face Qwen2Config.

        Args:
            backbone: A KerasHub QwenBackbone instance.

        Returns:
            A `transformers.Qwen2Config` object.
        """

References
- All public classes, methods, and functions should use Google-style docstrings, with the first line being a concise summary, and including comprehensive documentation for all parameters, return values, and exceptions. (link)
This PR introduces conversion of KerasHub QwenBackbone models to the Hugging Face Transformers SafeTensors format.
Implemented:
- get_qwen_config to construct a validated transformers.Qwen2Config object, ensuring correct mapping of model-specific hyperparameters such as rope_theta and rms_norm_eps.
- Registered QwenBackbone and QwenTokenizer in the export registry.
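As a rough illustration of the kind of mapping get_qwen_config performs (not the PR's actual code: the real function returns a transformers.Qwen2Config, and the backbone attribute names used below are assumptions), a plain-dict sketch might look like:

```python
from types import SimpleNamespace

def qwen_config_dict(backbone):
    """Map KerasHub-style Qwen hyperparameters to HF Qwen2-style keys.

    Attribute names (hidden_dim, rope_max_wavelength, layer_norm_epsilon,
    ...) are hypothetical stand-ins for the KerasHub backbone config.
    """
    return {
        "vocab_size": backbone.vocabulary_size,
        "hidden_size": backbone.hidden_dim,
        "num_hidden_layers": backbone.num_layers,
        "num_attention_heads": backbone.num_query_heads,
        "num_key_value_heads": backbone.num_key_value_heads,
        "intermediate_size": backbone.intermediate_dim,
        "rope_theta": backbone.rope_max_wavelength,
        "rms_norm_eps": backbone.layer_norm_epsilon,
    }

# Toy backbone object with illustrative (not authoritative) values.
toy = SimpleNamespace(
    vocabulary_size=151936, hidden_dim=896, num_layers=24,
    num_query_heads=14, num_key_value_heads=2, intermediate_dim=4864,
    rope_max_wavelength=1000000.0, layer_norm_epsilon=1e-6,
)
cfg = qwen_config_dict(toy)
```

The point of a dedicated mapping function is that fields like rope_theta and rms_norm_eps have different names on each side, so a validated translation step prevents silent misconfiguration.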
Verified numerical equivalence of logits between KerasHub and Hugging Face implementations with a strict tolerance.
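A minimal stand-in for what such an equivalence check asserts (the real test compares full model logits, typically via something like numpy's allclose; the toy values below are hypothetical):

```python
def max_abs_diff(a, b):
    """Elementwise max |a - b| over two flat lists of logits."""
    return max(abs(x - y) for x, y in zip(a, b))

# Toy logits standing in for KerasHub vs. Hugging Face model outputs.
keras_logits = [1.2345678, -0.5012345, 3.1415926]
hf_logits = [1.2345679, -0.5012346, 3.1415925]

# "Strict tolerance": the converted model must reproduce the original
# logits to within a small absolute error.
assert max_abs_diff(keras_logits, hf_logits) < 1e-5
```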
This notebook validated the conversion from the Keras Qwen model to a Hugging Face (HF) compatible format.