Compilation of GPT model fails due to assertion error #19110

Open
pxanthopoulos opened this issue Nov 12, 2024 · 0 comments
Labels
bug 🐞 Something isn't working

### What happened?

I tried to compile the GPT model from Hugging Face (after exporting it to TensorFlow SavedModel format and converting it to ONNX), and `iree-compile` aborted with the following assertion failure and stack dump:

iree-compile: /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2868: SmallVector<Value> mlir::TypeConverter::materializeTargetConversion(OpBuilder &, Location, TypeRange, ValueRange, Type) const: Assertion `TypeRange(ValueRange(result)) == resultTypes && "callback produced incorrect number of values or values with " "incorrect types"' failed.
Please report issues to https://github.com/iree-org/iree/issues and include the crash backtrace.
Stack dump:
0.	Program arguments: iree-compile --iree-hal-target-backends=cuda --iree-cuda-target=sm_70 --dump-compilation-phases-to=./model-phases/ --iree-vm-target-index-bits=64 --iree-stream-resource-index-bits=64 model.mlir -o model.vmfb
 #0 0x00007f3e4f03866d llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /workspace/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x00007f3e4f038b5b PrintStackTraceSignalHandler(void*) /workspace/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x00007f3e4f036bc6 llvm::sys::RunSignalHandlers() /workspace/iree/third_party/llvm-project/llvm/lib/Support/Signals.cpp:105:5
 #3 0x00007f3e4f039315 SignalHandler(int) /workspace/iree/third_party/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007f3e42f48520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f3e42f9c9fc __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007f3e42f9c9fc __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #7 0x00007f3e42f9c9fc pthread_kill ./nptl/pthread_kill.c:89:10
 #8 0x00007f3e42f48476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007f3e42f2e7f3 abort ./stdlib/abort.c:81:7
#10 0x00007f3e42f2e71b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#11 0x00007f3e42f3fe96 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#12 0x00007f3e57ee3904 mlir::TypeConverter::materializeTargetConversion(mlir::OpBuilder&, mlir::Location, mlir::TypeRange, mlir::ValueRange, mlir::Type) const /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2869:5
#13 0x00007f3e57ee364c mlir::TypeConverter::materializeTargetConversion(mlir::OpBuilder&, mlir::Location, mlir::Type, mlir::ValueRange, mlir::Type) const /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2851:14
#14 0x00007f3e57ee2074 legalizeUnresolvedMaterialization(mlir::RewriterBase&, (anonymous namespace)::UnresolvedMaterializationRewrite*) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2443:39
#15 0x00007f3e57ee10f6 mlir::OperationConverter::convertOperations(llvm::ArrayRef<mlir::Operation*>) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:2528:18
#16 0x00007f3e57ee51c9 mlir::applyPartialConversion(llvm::ArrayRef<mlir::Operation*>, mlir::ConversionTarget const&, mlir::FrozenRewritePatternSet const&, mlir::ConversionConfig) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:3258:22
#17 0x00007f3e57ee52cd mlir::applyPartialConversion(mlir::Operation*, mlir::ConversionTarget const&, mlir::FrozenRewritePatternSet const&, mlir::ConversionConfig) /workspace/iree/third_party/llvm-project/mlir/lib/Transforms/Utils/DialectConversion.cpp:3264:10
#18 0x00007f3e527ad87c mlir::iree_compiler::IREE::VM::ConversionPass::runOnOperation() /workspace/iree/compiler/src/iree/compiler/Dialect/VM/Transforms/Conversion.cpp:168:16
#19 0x00007f3e4f51fdab mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_1::operator()() const /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:0:17
#20 0x00007f3e4f51fd45 void llvm::function_ref<void ()>::callback_fn<mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int)::$_1>(long) /workspace/iree/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:5
#21 0x00007f3e4ef4b229 llvm::function_ref<void ()>::operator()() const /workspace/iree/third_party/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:5
#22 0x00007f3e4f522bf5 void mlir::MLIRContext::executeAction<mlir::PassExecutionAction, mlir::Pass&>(llvm::function_ref<void ()>, llvm::ArrayRef<mlir::IRUnit>, mlir::Pass&) /workspace/iree/third_party/llvm-project/mlir/include/mlir/IR/MLIRContext.h:281:3
#23 0x00007f3e4f51b54a mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:532:17
#24 0x00007f3e4f51bad4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:592:16
#25 0x00007f3e4f51d518 mlir::PassManager::runPasses(mlir::Operation*, mlir::AnalysisManager) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:905:10
#26 0x00007f3e4f51d442 mlir::PassManager::run(mlir::Operation*) /workspace/iree/third_party/llvm-project/mlir/lib/Pass/Pass.cpp:885:60
#27 0x00007f3e4ee8fb5a mlir::iree_compiler::embed::(anonymous namespace)::Invocation::runPipeline(iree_compiler_pipeline_t) /workspace/iree/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1008:27
#28 0x00007f3e4ee8f433 ireeCompilerInvocationPipeline /workspace/iree/compiler/src/iree/compiler/API/Internal/CompilerDriver.cpp:1447:3
#29 0x00007f3e4f40542e mlir::iree_compiler::runIreecMain(int, char**)::$_2::operator()(iree_compiler_source_t*) const /workspace/iree/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:254:11
#30 0x00007f3e4f40486e mlir::iree_compiler::runIreecMain(int, char**) /workspace/iree/compiler/src/iree/compiler/Tools/iree_compile_lib.cc:355:9
#31 0x00007f3e4eedc7eb ireeCompilerRunMain /workspace/iree/compiler/src/iree/compiler/API/Internal/IREECompileToolEntryPoint.cpp:12:3
#32 0x0000559177a56782 main /workspace/iree/tools/iree-compile-main.cc:9:35
#33 0x00007f3e42f2fd90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#34 0x00007f3e42f2fe40 call_init ./csu/../csu/libc-start.c:128:20
#35 0x00007f3e42f2fe40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#36 0x0000559177a56695 _start (/workspace/iree/iree-build/tools/iree-compile+0x1695)
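
When triaging a dump like this, it helps to skip the signal-handling, libc, and generic MLIR frames and find the first frame that lies in IREE's own sources. The following is a minimal sketch in Python, assuming the frame layout shown above (`#<n> 0x<address> <symbol> <file>:<line>:<col>`); the helper name `first_iree_frame` and the path marker are illustrative choices, not part of any IREE tooling.

```python
import re

# Matches "#<n> 0x<addr> <rest>", the frame layout used in the dump above.
FRAME_RE = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+\s+(.*)$")


def first_iree_frame(dump, marker="/compiler/src/iree/compiler/"):
    """Return (frame_number, source_location) of the first frame in IREE's own code.

    `marker` is an assumed path fragment that identifies IREE compiler sources,
    as opposed to third_party/llvm-project or libc frames.
    """
    for line in dump.splitlines():
        match = FRAME_RE.match(line)
        if match and marker in match.group(2):
            # The source location is the last whitespace-separated token.
            return int(match.group(1)), match.group(2).split()[-1]
    return None


if __name__ == "__main__":
    import sys

    # Usage: python first_frame.py < stack_dump.txt
    print(first_iree_frame(sys.stdin.read()))
```

Applied to the dump above, this points at frame #18, `IREE::VM::ConversionPass::runOnOperation()` in `Conversion.cpp:168:16`, which is the pass that called `applyPartialConversion` before the assertion fired.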


### Steps to reproduce your issue

1. Export the GPT model using the script in this [gist](https://gist.github.com/pxanthopoulos/9f40ba1fdef3a57c9a1b13d75ca37db2#file-gpt-py).
2. Convert it to ONNX using the following command: `python -m tf2onnx.convert --saved-model ./gpt-tf/ --output model.onnx --opset 17`
3. Import it using the command `iree-import-onnx model.onnx -o model.mlir`
4. Compile it using the following command: `iree-compile --iree-hal-target-backends=cuda --iree-cuda-target=sm_70 --dump-compilation-phases-to=./model-phases/ --iree-vm-target-index-bits=64 --iree-stream-resource-index-bits=64 model.mlir -o model.vmfb`
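
Steps 2 to 4 can be chained so that the failure reproduces with a single command. The sketch below is a minimal Python wrapper, assuming the SavedModel from step 1 already exists in `./gpt-tf/` and that `tf2onnx`, `iree-import-onnx`, and `iree-compile` are installed and on the `PATH`. The helper names `build_commands` and `run_pipeline` are hypothetical; all flags are copied from the commands above.

```python
import subprocess
import sys


def build_commands(saved_model="./gpt-tf/", onnx_path="model.onnx",
                   mlir_path="model.mlir", vmfb_path="model.vmfb"):
    """Return the argv lists for steps 2-4, in order."""
    return [
        # Step 2: TensorFlow SavedModel -> ONNX (uses the current interpreter).
        [sys.executable, "-m", "tf2onnx.convert",
         "--saved-model", saved_model, "--output", onnx_path, "--opset", "17"],
        # Step 3: ONNX -> MLIR.
        ["iree-import-onnx", onnx_path, "-o", mlir_path],
        # Step 4: MLIR -> VM flatbuffer for CUDA; this is the step that asserts.
        ["iree-compile",
         "--iree-hal-target-backends=cuda",
         "--iree-cuda-target=sm_70",
         "--dump-compilation-phases-to=./model-phases/",
         "--iree-vm-target-index-bits=64",
         "--iree-stream-resource-index-bits=64",
         mlir_path, "-o", vmfb_path],
    ]


def run_pipeline(commands):
    """Run each command in turn; return the index of the first failure, or None."""
    for index, argv in enumerate(commands):
        print("+", " ".join(argv), flush=True)
        result = subprocess.run(argv)
        if result.returncode != 0:
            return index
    return None


if __name__ == "__main__":
    failed = run_pipeline(build_commands())
    sys.exit(0 if failed is None else 1)
```

With the setup described in this report, the expected outcome is that the first two commands succeed and the third one aborts, so `run_pipeline` would return index 2.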


### What component(s) does this issue relate to?

Compiler

### Version information

Commit hash of IREE: 5c45591244fe7499f37329e631ddff04493295d6
Python 3.10.12 environment from pip freeze:

absl-py==2.1.0
astunparse==1.6.3
certifi==2024.8.30
charset-normalizer==3.4.0
filelock==3.16.1
flatbuffers==24.3.25
fsspec==2024.10.0
gast==0.6.0
google-pasta==0.2.0
grpcio==1.67.1
h5py==3.12.1
huggingface-hub==0.26.2
idna==3.10
keras==3.6.0
libclang==18.1.1
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.2
mdurl==0.1.2
ml-dtypes==0.4.1
namex==0.0.8
numpy==1.26.4
nvidia-cublas-cu12==12.5.3.2
nvidia-cuda-cupti-cu12==12.5.82
nvidia-cuda-nvcc-cu12==12.5.82
nvidia-cuda-nvrtc-cu12==12.5.82
nvidia-cuda-runtime-cu12==12.5.82
nvidia-cudnn-cu12==9.3.0.75
nvidia-cufft-cu12==11.2.3.61
nvidia-curand-cu12==10.3.6.82
nvidia-cusolver-cu12==11.6.3.83
nvidia-cusparse-cu12==12.5.1.3
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.5.82
onnx==1.17.0
opt_einsum==3.4.0
optree==0.13.0
packaging==24.2
protobuf==3.20.3
Pygments==2.18.0
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
safetensors==0.4.5
six==1.16.0
tensorboard==2.18.0
tensorboard-data-server==0.7.2
tensorflow==2.18.0
tensorflow-io-gcs-filesystem==0.37.1
termcolor==2.5.0
tf2onnx==1.16.1
tf_keras==2.18.0
tokenizers==0.20.3
tqdm==4.67.0
transformers==4.46.2
typing_extensions==4.12.2
urllib3==2.2.3
Werkzeug==3.1.3
wrapt==1.16.0
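
For anyone comparing environments, only a few of these packages take part in the export and conversion steps. The snippet below is a small sketch that reports just those versions using the standard library's `importlib.metadata`; the selection in `KEY_PACKAGES` and the helper name `collect_versions` are assumptions made for illustration.

```python
from importlib import metadata

# Assumed shortlist of packages involved in the export and conversion steps.
KEY_PACKAGES = ("tensorflow", "tf2onnx", "onnx", "transformers", "keras", "numpy")


def collect_versions(packages=KEY_PACKAGES):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```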


### Additional context

For my build environment (build commands, Dockerfile, etc.), see the additional context in this [issue](https://github.com/iree-org/iree/issues/18767).
@pxanthopoulos pxanthopoulos added the bug 🐞 Something isn't working label Nov 12, 2024