
Conversation

@LakshmiKalaKadali

This PR introduces conversion of KerasHub QwenBackbone models to the Hugging Face transformers SafeTensors format.

Implemented get_qwen_config to construct a validated transformers.Qwen2Config object, ensuring correct mapping of specific hyperparameters like rope_theta and rms_norm_eps.

Registered QwenBackbone and QwenTokenizer in the export registry.

Verified numerical equivalence of logits between KerasHub and Hugging Face implementations with a strict tolerance.

This notebook validated the conversion from the Keras Qwen model to a Hugging Face (HF)-compatible format.
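To illustrate the kind of hyperparameter mapping get_qwen_config performs, here is a minimal, self-contained sketch. The KerasHub-side field names (e.g. rope_max_wavelength, layer_norm_epsilon) and the sample values are assumptions for illustration only; the real function reads attributes from a QwenBackbone instance and returns a transformers.Qwen2Config rather than a plain dict.

```python
def map_qwen_config(backbone_config):
    """Map KerasHub-style Qwen config keys to HF Qwen2Config-style keys.

    Stand-in for `get_qwen_config`: uses plain dicts so the sketch runs
    without keras-hub or transformers installed.
    """
    return {
        "vocab_size": backbone_config["vocabulary_size"],
        "hidden_size": backbone_config["hidden_dim"],
        "num_hidden_layers": backbone_config["num_layers"],
        "num_attention_heads": backbone_config["num_query_heads"],
        "num_key_value_heads": backbone_config["num_key_value_heads"],
        "intermediate_size": backbone_config["intermediate_dim"],
        # The two fields the PR calls out explicitly: RoPE base frequency
        # and the RMSNorm epsilon.
        "rope_theta": backbone_config["rope_max_wavelength"],
        "rms_norm_eps": backbone_config["layer_norm_epsilon"],
    }


# Hypothetical values in the ballpark of a small Qwen2 model.
keras_cfg = {
    "vocabulary_size": 151936,
    "hidden_dim": 896,
    "num_layers": 24,
    "num_query_heads": 14,
    "num_key_value_heads": 2,
    "intermediate_dim": 4864,
    "rope_max_wavelength": 1000000.0,
    "layer_norm_epsilon": 1e-6,
}
hf_cfg = map_qwen_config(keras_cfg)
print(hf_cfg["rope_theta"], hf_cfg["rms_norm_eps"])  # prints 1000000.0 1e-06
```

The same shape of mapping, with different key names, applies to the GPT-2 exporter's get_gpt2_config.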

@gemini-code-assist

Summary of Changes

Hello @LakshmiKalaKadali, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the model export capabilities by introducing comprehensive support for converting KerasHub QwenBackbone models to the Hugging Face SafeTensors format. It includes detailed configuration mapping, weight conversion logic, and registration within the existing export utility. Additionally, it extends similar export functionality to GPT-2 models, ensuring broader compatibility and interoperability between KerasHub and Hugging Face ecosystems. The changes are validated through rigorous numerical equivalence tests, confirming the fidelity of the conversions.

Highlights

  • Qwen Model Conversion: Added comprehensive support for converting KerasHub QwenBackbone models to the Hugging Face transformers SafeTensors format, including detailed configuration and weight mapping.
  • GPT-2 Model Conversion: Introduced similar export utilities for KerasHub GPT-2 models, enabling their conversion to Hugging Face SafeTensors format.
  • Configuration and Weight Mapping: Implemented get_qwen_config and get_qwen_weights_map to accurately map Qwen-specific hyperparameters and layer weights (e.g., embeddings, attention, MLP, normalization) to their Hugging Face equivalents. Similar functions were added for GPT-2.
  • Export Registry Integration: Registered QwenBackbone, QwenTokenizer, GPT2Backbone, and GPT2Tokenizer within the hf_exporter to streamline the export process.
  • Exporter Enhancements: Modified the hf_exporter to correctly handle transformers.Config objects by serializing them to dictionaries and to manage tied weights for GPT-2 and Qwen models, preventing safetensors crashes.
  • Tokenizer File Renaming: Updated the tokenizer export logic to rename vocabulary files (e.g., vocabulary.json to vocab.json) to align with Hugging Face's expected file structure for GPT-2 and Qwen tokenizers.
  • Numerical Equivalence Verification: Verified the numerical equivalence of logits between the KerasHub and Hugging Face implementations for both Qwen and GPT-2 models, ensuring conversion fidelity with strict tolerance.
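The logit-equivalence check named in the last highlight can be sketched as follows. This is a hedged illustration, not the PR's actual test code: two stand-in NumPy arrays play the roles of the KerasHub and Hugging Face model outputs, and the atol value is an assumed example of a "strict tolerance".

```python
import numpy as np


def logits_match(keras_logits, hf_logits, atol=1e-4):
    """Return True when the two logit tensors agree elementwise within tolerance."""
    return bool(np.allclose(keras_logits, hf_logits, atol=atol))


# Stand-ins for running the same token IDs through both implementations:
# the "HF" logits differ from the "Keras" logits by noise below the tolerance.
rng = np.random.default_rng(0)
keras_logits = rng.standard_normal((1, 8, 64)).astype("float32")
hf_logits = keras_logits + rng.uniform(
    -1e-5, 1e-5, keras_logits.shape
).astype("float32")

print(logits_match(keras_logits, hf_logits))  # prints True
```

In a real verification, both models would be run in inference mode on identical inputs, and a mismatch beyond tolerance would fail the export test.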
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist bot left a comment


Code Review

This pull request introduces support for converting KerasHub Qwen and GPT-2 models to the Hugging Face SafeTensors format. The implementation is well-structured, with dedicated modules for each model's conversion logic and corresponding tests. The main exporter has been thoughtfully updated to handle various tensor types, config objects, and tied weights. My feedback focuses on improving docstring coverage to align with the repository's style guide and correcting a minor typo in a test case.

class QwenExportTest(TestCase):
    @parameterized.named_parameters(
        # Use a small preset for testing
        ("qwen2_0.5b_en", "qwen2.5_0.5b_en"),


Severity: high

There appears to be a typo in the preset name used for testing. The test case is named qwen2_0.5b_en, but the preset passed is qwen2.5_0.5b_en. qwen2.5 is not a standard model version. Please correct the preset name to match the intended model, which is likely qwen2_0.5b_en.

Suggested change:
-        ("qwen2_0.5b_en", "qwen2.5_0.5b_en"),
+        ("qwen2_0.5b_en", "qwen2_0.5b_en"),

Comment on lines 5 to 6
def get_gpt2_config(keras_model):
    """Convert Keras GPT-2 config to Hugging Face GPT2Config."""


Severity: medium

The docstrings for get_gpt2_config, get_gpt2_weights_map, and get_gpt2_tokenizer_config are minimal. Per the repository style guide (lines 367-370), all public functions should have Google-style docstrings that document all parameters and return values. Please expand them to include Args: and Returns: sections for clarity and consistency. For example:

def get_gpt2_config(keras_model):
    """Convert Keras GPT-2 config to Hugging Face GPT2Config.

    Args:
        keras_model: A KerasHub GPT-2 model instance.

    Returns:
        A `transformers.GPT2Config` object.
    """
References
  1. All public classes, methods, and functions should use Google-style docstrings, with the first line being a concise summary, and including comprehensive documentation for all parameters, return values, and exceptions. (link)

Comment on lines +5 to +6
def get_qwen_config(backbone):
    """Convert Keras Qwen config to Hugging Face Qwen2Config."""


Severity: medium

The docstrings for get_qwen_config, get_qwen_weights_map, and get_qwen_tokenizer_config are minimal. The repository style guide (lines 367-370) requires comprehensive Google-style docstrings for all public functions, including Args: and Returns: sections. Please add these details to improve documentation and adhere to the contribution guidelines. For example:

def get_qwen_config(backbone):
    """Convert Keras Qwen config to Hugging Face Qwen2Config.

    Args:
        backbone: A KerasHub QwenBackbone instance.

    Returns:
        A `transformers.Qwen2Config` object.
    """
References
  1. All public classes, methods, and functions should use Google-style docstrings, with the first line being a concise summary, and including comprehensive documentation for all parameters, return values, and exceptions. (link)

@LakshmiKalaKadali force-pushed the qwen_safe_tensors branch 2 times, most recently from 8c982e0 to e18f944 on December 16, 2025 at 12:33.