Compilation of GPT model fails due to assertion error #19110

pxanthopoulos · 2024-11-12T13:06:04Z

### What happened?

I tried to compile the GPT model from Hugging Face (after exporting it to a TensorFlow SavedModel and converting that to ONNX). Compilation failed with the following assertion and stack dump:

iree-compile: /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2868: SmallVector<Value> mlir::TypeConverter::materializeTargetConversion(OpBuilder &, Location, TypeRange, ValueRange, Type) const: Assertion `TypeRange(ValueRange(result)) == resultTypes && "callback produced incorrect number of values or values with " "incorrect types"' failed.
Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
Stack dump:
0.	Program arguments: iree-compile --iree-hal-target-backends=cuda --iree-cuda-target=sm_70 --dump-compilation-phases-to=./model-phases/ --iree-vm-target-index-bits=64 --iree-stream-resource-index-bits=64 model.mlir -o model.vmfb
 #0 0x00007f3e4f03866d llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /workspace/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x00007f3e4f038b5b PrintStackTraceSignalHandler(void*) /workspace/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x00007f3e4f036bc6 llvm::sys::RunSignalHandlers() /workspace/iree/third_party/llvm-project/llvm/lib/Support/Signals.cpp:105:5
 #3 0x00007f3e4f039315 SignalHandler(int) /workspace/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007f3e42f48520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f3e42f9c9fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007f3e42f9c9fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #7 0x00007f3e42f9c9fc pthread_kill ./nptl/pthread_kill.c:89:10
 #8 0x00007f3e42f48476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007f3e42f2e7f3 abort ./stdlib/abort.c:81:7
#10 0x00007f3e42f2e71b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#11 0x00007f3e42f3fe96 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#12 0x00007f3e57ee3904 mlir::TypeConverter::materializeTargetConversion(mlir::OpBuilder&, mlir::Location, mlir::TypeRange, mlir::ValueRange, mlir::Type) const /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2869:5
#13 0x00007f3e57ee364c mlir::TypeConverter::materializeTargetConversion(mlir::OpBuilder&, mlir::Location, mlir::Type, mlir::ValueRange, mlir::Type) const /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2851:14
#14 0x00007f3e57ee2074 legalizeUnresolvedMaterialization(mlir::RewriterBase&, (anonymous namespace)::UnresolvedMaterializationRewrite*) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2443:39
#15 0x00007f3e57ee10f6 mlir::OperationConverter::convertOperations(llvm::ArrayRef<mlir::Operation*>) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2528:18
#16 0x00007f3e57ee51c9 mlir::applyPartialConversion(llvm::ArrayRef<mlir::Operation*>, mlir::ConversionTarget const&, mlir::FrozenRewritePatternSet const&, mlir::ConversionConfig) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:3258:22
#17 0x00007f3e57ee52cd mlir::applyPartialConversion(mlir::Operation*, mlir::ConversionTarget const&, mlir::FrozenRewritePatternSet const&, mlir::ConversionConfig) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:3264:10
#18 0x00007f3e527ad87c mlir::iree_compiler::IREE::VM::ConversionPass::runOnOperation() /workspace/iree/compiler/src/iree/compiler/Dialect/VM/Transforms/Conversion.cpp:168:16
#19 0x00007f3e4f51fdab mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_1::operator()() const /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:17
#20 0x00007f3e4f51fd45 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_1>(long) /workspace/iree/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:5
#21 0x00007f3e4ef4b229 llvm::function_ref<void ()>::operator()() const /workspace/iree/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:5
#22 0x00007f3e4f522bf5 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /workspace/iree/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:281:3
#23 0x00007f3e4f51b54a mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:532:17
#24 0x00007f3e4f51bad4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:592:16
#25 0x00007f3e4f51d518 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:905:10
#26 0x00007f3e4f51d442 mlir::PassManager::run(mlir::Operation*) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:885:60
#27 0x00007f3e4ee8fb5a mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /workspace/iree/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1008:27
#28 0x00007f3e4ee8f433 ireeCompilerInvocationPipeline /workspace/iree/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1447:3
#29 0x00007f3e4f40542e mlir::iree_compiler::runIreecMain(int, char**)::$_2::operator()(iree_compiler_source_t*) const /workspace/iree/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:254:11
#30 0x00007f3e4f40486e mlir::iree_compiler::runIreecMain(int, char**) /workspace/iree/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:355:9
#31 0x00007f3e4eedc7eb ireeCompilerRunMain /workspace/iree/compiler/src/iree/compiler/API/Internal/IREECompileToolEntryPoint.cpp:12:3
#32 0x0000559177a56782 main /workspace/iree/tools/iree-compile-main.cc:9:35
#33 0x00007f3e42f2fd90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#34 0x00007f3e42f2fe40 call_init ./csu/../csu/libc-start.c:128:20
#35 0x00007f3e42f2fe40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#36 0x0000559177a56695 _start (/workspace/iree/iree-build/tools/iree-compile+0x1695)
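
For triage, it can help to pick out the first IREE-owned frame in a dump like the one above, since everything above it is libc or upstream MLIR code. The following is a minimal sketch, assuming frames follow the `#N 0xADDR symbol path` layout shown here; `first_iree_frame` is a hypothetical helper, not part of IREE or LLVM tooling, and the sample holds only three frames copied from the dump.

```python
import re

# Matches lines such as "#18 0x00007f3e527ad87c <symbol and source path>".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-f]+\s+(.*)$")


def first_iree_frame(stack_dump):
    """Return (frame_number, frame_text) for the first frame whose symbol
    lives in the mlir::iree_compiler namespace, or None if there is none."""
    for line in stack_dump.splitlines():
        match = FRAME_RE.match(line)
        if match and "mlir::iree_compiler::" in match.group(2):
            return int(match.group(1)), match.group(2)
    return None


# Three frames copied verbatim from the stack dump in this report.
SAMPLE = """\
#16 0x00007f3e57ee51c9 mlir::applyPartialConversion(llvm::ArrayRef<mlir::Operation*>, mlir::ConversionTarget const&, mlir::FrozenRewritePatternSet const&, mlir::ConversionConfig) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:3258:22
#17 0x00007f3e57ee52cd mlir::applyPartialConversion(mlir::Operation*, mlir::ConversionTarget const&, mlir::FrozenRewritePatternSet const&, mlir::ConversionConfig) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:3264:10
#18 0x00007f3e527ad87c mlir::iree_compiler::IREE::VM::ConversionPass::runOnOperation() /workspace/iree/compiler/src/iree/compiler/Dialect/VM/Transforms/Conversion.cpp:168:16
"""

frame = first_iree_frame(SAMPLE)
if frame is not None:
    print(f"first IREE frame: #{frame[0]} {frame[1]}")
```

On the full dump this selects frame #18, the VM `ConversionPass`, which is where the failing `applyPartialConversion` call originates.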


### Steps to reproduce your issue

1. Export the GPT model using the script in this [gist](https://gist.github.com/pxanthopoulos/9f40ba1fdef3a57c9a1b13d75ca37db2#file-gpt-py).
2. Convert it to ONNX using the following command: `python -m tf2onnx.convert --saved-model ./gpt-tf/ --output model.onnx --opset 17`
3. Import it using the command `iree-import-onnx model.onnx -o model.mlir`
4. Compile it using the following command: `iree-compile --iree-hal-target-backends=cuda --iree-cuda-target=sm_70 --dump-compilation-phases-to=./model-phases/ --iree-vm-target-index-bits=64 --iree-stream-resource-index-bits=64 model.mlir -o model.vmfb`
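
Steps 2 to 4 above can be chained in one script. This is a minimal sketch, assuming step 1 has already written the SavedModel to `./gpt-tf/` and that `python`, `iree-import-onnx`, and `iree-compile` are on `PATH`. The helper names (`tf2onnx_cmd`, `import_cmd`, `compile_cmd`, `reproduce`) are hypothetical; the commands and flags are exactly the ones listed in the steps.

```python
import shlex
import subprocess


def tf2onnx_cmd(saved_model="./gpt-tf/", onnx_path="model.onnx", opset=17):
    # Step 2: convert the TensorFlow SavedModel to ONNX.
    return ["python", "-m", "tf2onnx.convert",
            "--saved-model", saved_model,
            "--output", onnx_path,
            "--opset", str(opset)]


def import_cmd(onnx_path="model.onnx", mlir_path="model.mlir"):
    # Step 3: import the ONNX model into MLIR.
    return ["iree-import-onnx", onnx_path, "-o", mlir_path]


def compile_cmd(mlir_path="model.mlir", vmfb_path="model.vmfb"):
    # Step 4: the compile invocation that hits the assertion.
    return ["iree-compile",
            "--iree-hal-target-backends=cuda",
            "--iree-cuda-target=sm_70",
            "--dump-compilation-phases-to=./model-phases/",
            "--iree-vm-target-index-bits=64",
            "--iree-stream-resource-index-bits=64",
            mlir_path, "-o", vmfb_path]


def reproduce():
    """Run steps 2 to 4 in order, stopping at the first failing command."""
    for cmd in (tf2onnx_cmd(), import_cmd(), compile_cmd()):
        print("+", shlex.join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `reproduce()` echoes each command before running it; with `check=True`, the crash in `iree-compile` surfaces as a `subprocess.CalledProcessError`.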


### What component(s) does this issue relate to?

Compiler

### Version information

Commit hash of IREE: 5c45591244fe7499f37329e631ddff04493295d6
Python 3.10.12 environment from pip freeze:

absl-py==2.1.0
astunparse==1.6.3
certifi==2024.8.30
charset-normalizer==3.4.0
filelock==3.16.1
flatbuffers==24.3.25
fsspec==2024.10.0
gast==0.6.0
google-pasta==0.2.0
grpcio==1.67.1
h5py==3.12.1
huggingface-hub==0.26.2
idna==3.10
keras==3.6.0
libclang==18.1.1
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
ml-dtypes==0.4.1
namex==0.0.8
numpy==1.26.4
nvidia-cublas-cu12==12.5.3.2
nvidia-cuda-cupti-cu12==12.5.82
nvidia-cuda-nvcc-cu12==12.5.82
nvidia-cuda-nvrtc-cu12==12.5.82
nvidia-cuda-runtime-cu12==12.5.82
nvidia-cudnn-cu12==9.3.0.75
nvidia-cufft-cu12==11.2.3.61
nvidia-curand-cu12==10.3.6.82
nvidia-cusolver-cu12==11.6.3.83
nvidia-cusparse-cu12==12.5.1.3
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.5.82
onnx==1.17.0
opt_einsum==3.4.0
optree==0.13.0
packaging==24.2
protobuf==3.20.3
Pygments==2.18.0
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
safetensors==0.4.5
six==1.16.0
tensorboard==2.18.0
tensorboard-data-server==0.7.2
tensorflow==2.18.0
tensorflow-io-gcs-filesystem==0.37.1
termcolor==2.5.0
tf2onnx==1.16.1
tf_keras==2.18.0
tokenizers==0.20.3
tqdm==4.67.0
transformers==4.46.2
typing_extensions==4.12.2
urllib3==2.2.3
Werkzeug==3.1.3
wrapt==1.16.0


### Additional context

For my build environment (build commands, Dockerfile, etc.), see the additional context in this [issue](https://github.com/iree-org/iree/issues/18767).

pxanthopoulos added the `bug 🐞` label on Nov 12, 2024
