iree-opt does not properly include backend (amdgpu) dependent dialect for standalone (iree-gpu-lower-ops) pass #19081

Open
jerryyin opened this issue Nov 8, 2024 · 0 comments
Labels
bug 🐞 Something isn't working

What happened?

For end-to-end compilation of a simple batch GEMM test case, I can confirm that compilation runs to completion. To observe what iree-gpu-lower-ops does (specifically, how it lowers a multi_mma op to an mfma op), I intercepted the IR and compiled it manually with iree-opt.

iree-opt crashed with the following error:

LLVM ERROR: Building op amdgpu.mfma but it isn't known in this MLIRContext: the dialect may not be loaded or this operation hasn't been added by the dialect.

On investigation, several pieces of evidence suggest that this shouldn't happen:

  1. iree-opt --show-dialects does print amdgpu among the available dialects.
  2. Passing --allow-unregistered-dialect to iree-opt does not prevent the pass from crashing.
  3. registerAllDialects(registry) is already invoked from iree-opt, per:
    mlir::iree_compiler::registerAllDialects(registry);
    • Manually adding registry.insert<amdgpu::AMDGPUDialect>(); still results in a crash from iree-opt.
  4. What did fix the crash was adding the amdgpu dialect to the dependentDialects of the pass defined at:
    def LowerIREEGPUOpsPass :
        InterfacePass<"iree-gpu-lower-ops", "mlir::FunctionOpInterface"> {
      let summary = "Post bufferization lowerings of iree_gpu ops before late lowerings";
      let dependentDialects = [
        "::mlir::gpu::GPUDialect",
      ];
    }
    • However, this is undesirable, because every transform pass would then need to list every backend dialect as a dependency.
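
A possible reading of items 1 to 3 above, offered as an assumption rather than a confirmed diagnosis: in MLIR, inserting a dialect into a DialectRegistry only makes it available for lazy loading, whereas building an op requires the dialect to be loaded into the MLIRContext, which usually happens when the input IR already uses the dialect or when a pass lists it in dependentDialects. The toy model below is not MLIR code. ToyRegistry, ToyContext, and tryToyPass are hypothetical names invented for illustration, and the sketch only mimics the registered-versus-loaded distinction.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a dialect registry: names that MAY be loaded later.
struct ToyRegistry {
  std::set<std::string> available;
  void insert(const std::string &dialect) { available.insert(dialect); }
};

// Hypothetical stand-in for a context: ops can only be built for LOADED dialects.
class ToyContext {
public:
  explicit ToyContext(const ToyRegistry &registry) : registry_(registry) {}

  bool isRegistered(const std::string &dialect) const {
    return registry_.available.count(dialect) != 0;
  }
  bool isLoaded(const std::string &dialect) const {
    return loaded_.count(dialect) != 0;
  }

  // Loading succeeds only for dialects the registry knows about.
  void load(const std::string &dialect) {
    if (!isRegistered(dialect))
      throw std::runtime_error("dialect not registered: " + dialect);
    loaded_.insert(dialect);
  }

  // Building an op needs its dialect loaded; being registered is not enough.
  std::string buildOp(const std::string &opName) const {
    const std::string dialect = opName.substr(0, opName.find('.'));
    if (!isLoaded(dialect))
      throw std::runtime_error("Building op " + opName +
                               " but its dialect is not loaded");
    return opName;
  }

private:
  ToyRegistry registry_;          // copy of the registry contents
  std::set<std::string> loaded_;  // dialects actually loaded so far
};

// Hypothetical stand-in for running a pass: load the declared dependent
// dialects first, then try to build the op. Returns false if building fails.
bool tryToyPass(ToyContext &ctx,
                const std::vector<std::string> &dependentDialects,
                const std::string &opToBuild) {
  for (const std::string &dialect : dependentDialects)
    ctx.load(dialect);
  try {
    ctx.buildOp(opToBuild);
    return true;
  } catch (const std::runtime_error &) {
    return false;
  }
}
```

In this model, with "gpu" and "amdgpu" both inserted into the registry, tryToyPass(ctx, {"gpu"}, "amdgpu.mfma") fails even though amdgpu is registered, while tryToyPass(ctx, {"gpu", "amdgpu"}, "amdgpu.mfma") succeeds. That mirrors the observation that --show-dialects lists amdgpu yet only the dependentDialects change avoided the crash.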

Tasks

After discussion on Discord, the conclusion is that the following need to be addressed:

  1. Figure out why registerAllDialects() doesn't include the amdgpu dialect when iree-opt invokes a standalone pass.
  2. Figure out why amdgpu isn't already included via ROCMTarget, which registers the rocdl dialect.
  3. Fix dialect registration so that an arbitrary backend-dependent pass can be invoked without manually adding pass-level dependencies.
    • An alternative is to provide a --register-dialect option as plumbing that lets users register dialects for passes, especially those in the Transforms folder.
    • Note: the workaround described in item 4 above is undesirable and is not considered a proper fix for this ticket.

Steps to reproduce your issue

  1. Download before_iree_gpu_lower_ops.txt,
    the IR dumped by print-after-all right before this pass runs (the command below refers to it as before_iree_gpu_lower_ops.mlir).
  2. Use iree-opt to apply just this pass standalone:
    iree-opt --iree-hal-target-backends=rocm --pass-pipeline="builtin.module(func.func(iree-gpu-lower-ops))" --iree-hip-target=gfx942 before_iree_gpu_lower_ops.mlir
  3. Observe that iree-opt crashes when iree-gpu-lower-ops attempts to convert an iree_gpu.multi_mma op to an amdgpu.mfma op.