
[torchbench] doctr_reco_predictor fails to run inference on dynamo. #6832

Closed
ysiraichi opened this issue Mar 27, 2024 · 1 comment
ysiraichi (Collaborator) commented Mar 27, 2024

🐛 Bug

Running the upstreamed benchmarking scripts with the following command results in an unexpected error.

python xla/benchmarks/experiment_runner.py \
       --suite-name torchbench \
       --accelerator cuda \
       --xla PJRT \
       --dynamo openxla \
       --test eval \
       --repeat 8 --iterations-per-run 1 \
       --print-subprocess \
       --no-resume -k doctr_reco_predictor
Traceback (most recent call last):
  File "xla/benchmarks/experiment_runner.py", line 945, in <module>
    main()
  File "xla/benchmarks/experiment_runner.py", line 941, in main
    runner.run()
  File "xla/benchmarks/experiment_runner.py", line 61, in run
    self.run_single_config()
  File "xla/benchmarks/experiment_runner.py", line 256, in run_single_config
    metrics, last_output = self.run_once_and_gather_metrics(
  File "xla/benchmarks/experiment_runner.py", line 345, in run_once_and_gather_metrics
    output, _ = loop(iter_fn=self._default_iter_fn)
  File "xla/benchmarks/experiment_runner.py", line 302, in loop
    output, timing, trace = iter_fn(benchmark_experiment, benchmark_model,
  File "xla/benchmarks/experiment_runner.py", line 218, in _default_iter_fn
    output = benchmark_model.model_iter_fn(
  File "torch/_dynamo/eval_frame.py", line 390, in _fn
    return fn(*args, **kwargs)
  File "xla/benchmarks/benchmark_model.py", line 170, in eval
    pred = self.module(*inputs)
  File "torch/nn/modules/module.py", line 1527, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "torch/nn/modules/module.py", line 1536, in _call_impl
    return forward_call(*args, **kwargs)
  File "/lib/python3.8/site-packages/doctr/models/recognition/crnn/pytorch.py", line 224, in forward
    out["preds"] = self.postprocessor(logits)
  File "/lib/python3.8/site-packages/doctr/models/recognition/crnn/pytorch.py", line 97, in __call__
    return self.ctc_best_path(logits=logits, vocab=self.vocab, blank=len(self.vocab))
  File "/lib/python3.8/site-packages/doctr/models/recognition/crnn/pytorch.py", line 55, in ctc_best_path
    @staticmethod
  File "torch/_dynamo/eval_frame.py", line 390, in _fn
    return fn(*args, **kwargs)
  File "torch/_dynamo/external_utils.py", line 36, in inner
    return fn(*args, **kwargs)
  File "torch/_functorch/aot_autograd.py", line 917, in forward
    return compiled_fn(full_args)
  File "torch/_functorch/_aot_autograd/utils.py", line 89, in g
    return f(*args)
  File "torch/_functorch/_aot_autograd/runtime_wrappers.py", line 107, in runtime_wrapper
    all_outs = call_func_at_runtime_with_args(
  File "torch/_functorch/_aot_autograd/utils.py", line 113, in call_func_at_runtime_with_args
    out = normalize_as_list(f(args))
  File "torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py", line 181, in rng_functionalization_wrapper
    return compiled_fw(args)
  File "torch/_functorch/_aot_autograd/utils.py", line 89, in g
    return f(*args)
  File "torch/_dynamo/backends/torchxla.py", line 36, in fwd
    compiled_graph = bridge.extract_compiled_graph(model, args)
  File "xla/torch_xla/core/dynamo_bridge.py", line 617, in extract_compiled_graph
    xm.mark_step()
  File "xla/torch_xla/core/xla_model.py", line 1056, in mark_step
    torch_xla._XLAC._xla_step_marker(
RuntimeError: Bad StatusOr access: INTERNAL: during context [Unknown]: Seen floating point types of different precisions in %concatenate.10776 = f32[4,64,128]{2,1,0} concatenate(f16[1,64,128]{2,1,0} %reshape.10772, f16[1,64,128]{2,1,0} %reshape.10773, f32[1,64,128]{2,1,0} %reshape.10774, f32[1,64,128]{2,1,0} %reshape.10775), dimensions={0}, but mixed precision is disallowed.
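For context, the HLO above rejects a concatenate whose operands mix f16 and f32. In eager PyTorch the same mix is legal, because torch.cat applies type promotion and upcasts everything to float32; the XLA lowering traced here apparently kept the original per-operand dtypes instead. A plain-PyTorch sketch of that eager behavior (no XLA involved; shapes chosen to mirror the failing concatenate, purely as an illustration):

```python
import torch

# Two half-precision operands and one full-precision operand, mirroring
# the f16/f16/f32 mix seen in the failing HLO concatenate.
half = torch.ones(1, 64, 128, dtype=torch.float16)
full = torch.ones(2, 64, 128, dtype=torch.float32)

# Eager torch.cat promotes the inputs to a common dtype (float32),
# so no error is raised here -- unlike the traced HLO, where the
# operands reach XLA's concatenate with their dtypes unchanged.
out = torch.cat([half, half, full], dim=0)
print(out.dtype)   # torch.float32
print(out.shape)   # torch.Size([4, 64, 128])
```

This suggests a missing dtype-promotion step in the lowering path rather than a problem in the model itself.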

Environment

  • PyTorch Commit: a52b4e22571507abc35c2d47de138497190d2e0a
  • PyTorch/XLA Commit: 84e7feb
  • PyTorch/benchmark Commit: d6015d42d9a1834bc7595c4bd6852562fb80b30b

cc @miladm @JackCaoG @vanbasten23 @zpcore @frgossen @golechwierowicz @cota

zpcore (Collaborator) commented Apr 4, 2024

Same cause as #6831; closing for now.

zpcore closed this as completed Apr 4, 2024