
Conversation

@lewtun (Member) commented Feb 11, 2022

What does this PR do?

This PR addresses an edge case introduced by #13831 where the ONNX export fails if:

  • Both torch and tensorflow are installed in the same environment
  • The user tries to export a pure TensorFlow model (i.e. a model repo without PyTorch weights)

Here is an example that fails to export on the master branch:

python -m transformers.onnx --model=keras-io/transformers-qa onnx/
Traceback
404 Client Error: Entry Not Found for url: https://huggingface.co/keras-io/transformers-qa/resolve/main/pytorch_model.bin
Traceback (most recent call last):
  File "/Users/lewtun/git/transformers/src/transformers/modeling_utils.py", line 1358, in from_pretrained
    resolved_archive_file = cached_path(
  File "/Users/lewtun/git/transformers/src/transformers/file_utils.py", line 1904, in cached_path
    output_path = get_from_cache(
  File "/Users/lewtun/git/transformers/src/transformers/file_utils.py", line 2108, in get_from_cache
    _raise_for_status(r)
  File "/Users/lewtun/git/transformers/src/transformers/file_utils.py", line 2031, in _raise_for_status
    raise EntryNotFoundError(f"404 Client Error: Entry Not Found for url: {request.url}")
transformers.file_utils.EntryNotFoundError: 404 Client Error: Entry Not Found for url: https://huggingface.co/keras-io/transformers-qa/resolve/main/pytorch_model.bin

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/lewtun/miniconda3/envs/transformers/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/lewtun/miniconda3/envs/transformers/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/lewtun/git/transformers/src/transformers/onnx/__main__.py", line 77, in <module>
    main()
  File "/Users/lewtun/git/transformers/src/transformers/onnx/__main__.py", line 51, in main
    model = FeaturesManager.get_model_from_feature(args.feature, args.model)
  File "/Users/lewtun/git/transformers/src/transformers/onnx/features.py", line 307, in get_model_from_feature
    return model_class.from_pretrained(model)
  File "/Users/lewtun/git/transformers/src/transformers/models/auto/auto_factory.py", line 447, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/Users/lewtun/git/transformers/src/transformers/modeling_utils.py", line 1394, in from_pretrained
    raise EnvironmentError(
OSError: keras-io/transformers-qa does not appear to have a file named pytorch_model.bin but there is a file for TensorFlow weights. Use `from_tf=True` to load this model from those weights.

This fails because the FeaturesManager.get_model_class_for_feature() method uses the _TASKS_TO_AUTOMODELS mapping to determine which autoclass (e.g. AutoModel vs TFAutoModel) to return for a given task. This mapping relies on the following branching logic:

if is_torch_available():
    _TASKS_TO_AUTOMODELS = {
        "default": AutoModel, ...
    }
elif is_tf_available():
    _TASKS_TO_AUTOMODELS = {
        "default": TFAutoModel, ...
    }
else:
    _TASKS_TO_AUTOMODELS = {}

As a result, if a user has both torch and tensorflow installed, we return an AutoModel class instead of the desired TFAutoModel class. In particular, Colab users cannot export pure TensorFlow models because torch is installed there by default.

Proposal

To address this issue, I've introduced a new --framework argument in the ONNX CLI and extended _TASKS_TO_AUTOMODELS to be a nested dict when both frameworks are installed. With this change, one can now export pure TensorFlow models with:

python -m transformers.onnx --model=keras-io/transformers-qa --framework=tf onnx/

Similarly, pure PyTorch models can be exported as follows:

python -m transformers.onnx --model=lewtun/bert-finetuned-squad --framework=pt onnx/

And checkpoints with both sets of weights also work:

python -m transformers.onnx --model=distilbert-base-uncased onnx/

Although the implementation works, I'm not entirely happy with it because _TASKS_TO_AUTOMODELS changes shape (flat vs. nested) depending on the installation environment, which feels hacky.
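
For concreteness, here is a minimal sketch of what the nested mapping could look like when both frameworks are installed (the "pt"/"tf" keys and the exact shape are illustrative assumptions on my part, not the actual diff):

# Illustrative sketch only: the framework keys and the shape are assumptions.
if is_torch_available() and is_tf_available():
    _TASKS_TO_AUTOMODELS = {
        # one entry per supported task, keyed by framework
        "default": {"pt": AutoModel, "tf": TFAutoModel},
    }

The --framework value would then select the inner entry, e.g. _TASKS_TO_AUTOMODELS[task][framework].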

Alternative solution 1

Thanks to a tip from @stas00, one solution is to change nothing and have the user specify which framework they're using via environment variables, e.g.

USE_TORCH=0 USE_JAX=0 USE_TF=1 python -m transformers.onnx --model=keras-io/transformers-qa onnx/

If we adopt this approach, we could provide a warning when both torch and tensorflow are installed and suggest an example like the one above.
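
A rough sketch of what such a warning could look like (the wording and where it would live are assumptions, not something in this PR):

# Sketch only: emit a warning when both frameworks are present.
from transformers.file_utils import is_tf_available, is_torch_available
from transformers.utils import logging

logger = logging.get_logger(__name__)

if is_torch_available() and is_tf_available():
    logger.warning(
        "Both PyTorch and TensorFlow are installed, so the export defaults to the PyTorch weights. "
        "To force a TensorFlow export, run e.g. "
        "USE_TORCH=0 USE_JAX=0 USE_TF=1 python -m transformers.onnx --model=keras-io/transformers-qa onnx/"
    )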

Alternative solution 2

It occurred to me that we can solve this with a simple try/except in FeaturesManager.get_model_from_feature() as follows:

def get_model_from_feature(feature: str, model: str) -> Union[PreTrainedModel, TFPreTrainedModel]:
    # By default we return `AutoModel` if `torch` and `tensorflow` are installed
    model_class = FeaturesManager.get_model_class_for_feature(feature)
    try:
        model = model_class.from_pretrained(model)
    except OSError:
        # Load the TensorFlow weights in `AutoModel`
        model = model_class.from_pretrained(model, from_tf=True)
    return model

The user will still see a 404 error in the logs

python -m transformers.onnx --model=keras-io/transformers-qa onnx/
# 404 Client Error: Entry Not Found for url: https://huggingface.co/keras-io/transformers-qa/resolve/main/pytorch_model.bin

but the conversion to ONNX will work once the TensorFlow weights are loaded in the AutoModel instance. Note: this solution seems to be similar to the one adopted in the pipeline() function, e.g.

from transformers import pipeline

# Load a pure TensorFlow model => see 404 Client Error in logs, but pipeline loads fine
p = pipeline("question-answering", model="keras-io/transformers-qa")

The advantage of this approach is that the user doesn't have to manually specify a --framework arg, i.e. it "just works". The only drawback I see is that there might be differences between the torch.onnx and tf2onnx packages used for the ONNX export, and by using torch.onnx as the default we may mislead users on where to debug their exports. However, this is probably a rare case and could be revisited if users report problems.

Feedback on which approach is preferred is much appreciated!

@HuggingFaceDocBuilder

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@lewtun lewtun requested a review from LysandreJik February 11, 2022 08:51
logger = logging.get_logger(__name__) # pylint: disable=invalid-name

-if is_torch_available():
+if is_torch_available() and not is_tf_available():
Member Author

This extra condition is used to check if we're in a pure torch environment


class FeaturesManager:
-if is_torch_available():
+if is_torch_available() and not is_tf_available():
Member Author

There's a bit of duplicate logic in this module - perhaps the autoclass imports above should be moved directly within FeaturesManager?

@sgugger (Collaborator) left a comment

I personally think solution 2 would be better for the user, as it "just works". We can investigate the provenance of the error log and try to remove it if it's an issue, but it would be better than adding a new arg :-)

AutoModelForTokenClassification,
)
-elif is_tf_available():
+elif is_tf_available() and not is_torch_available():
Collaborator

I think the whole logic of having three tests can be simplified if you just change that elif to a simple if.


class FeaturesManager:
-if is_torch_available():
+if is_torch_available() and not is_tf_available():
Collaborator

Same here: instead of having three tests, why not always have _TASKS_TO_AUTOMODELS be a nested dict keyed by framework, and then fill in each framework when it is available?
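
For example, something along these lines (the "pt"/"tf" keys are just placeholders, not a concrete proposal):

# Sketch only: always nest by framework and fill in whatever is installed.
_TASKS_TO_AUTOMODELS = {"pt": {}, "tf": {}}

if is_torch_available():
    _TASKS_TO_AUTOMODELS["pt"]["default"] = AutoModel  # plus the other tasks
if is_tf_available():
    _TASKS_TO_AUTOMODELS["tf"]["default"] = TFAutoModel  # plus the other tasks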

Member Author

That's a nice idea - thanks! In the end we may not need this if we adopt solution 2 :)

@lewtun (Member Author) commented Feb 11, 2022

> I personally think solution 2 would be better for the user, as it "just works". We can investigate the provenance of the error log and try to remove it if it's an issue, but it would be better than adding a new arg :-)

Thanks for the feedback @sgugger ❤️! Having thought about it a bit more, I agree that solution 2 is the simplest and least error-prone: I've opened a PR for it here: #15625
