Support performance warning by manman-ren · Pull Request #3922 · triton-lang/triton

manman-ren · 2024-05-15T16:18:26Z

This commit adds a performance warning for not selecting MMA v3 for tl.dot on Hopper.
For the added test case, we will get:

test-warning.py:24:18: remark: Warning: can't use MMA V3 for the dot op
    c = tl.dot(a, b)
                 ^
test-warning.py:24:18: note: see current operation: %39 = tt.dot %37, %38, %cst, inputPrecision = tf32 :
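
Since the thread later asks how to verify that diagnostics land on the right source line, here is a minimal sketch of parsing the output format shown above. The names `DIAG_RE` and `parse_diagnostics` are hypothetical helpers for illustration, not part of Triton or this PR.

```python
import re

# Pattern for "<file>:<line>:<col>: <severity>: <message>", the layout shown above.
DIAG_RE = re.compile(
    r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<severity>remark|note|warning|error): (?P<message>.*)$"
)


def parse_diagnostics(text):
    """Return one dict per diagnostic header line found in `text`."""
    found = []
    for raw in text.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match:
            entry = match.groupdict()
            entry["line"] = int(entry["line"])
            entry["col"] = int(entry["col"])
            found.append(entry)
    return found


# Sample copied from the PR description above.
sample = """test-warning.py:24:18: remark: Warning: can't use MMA V3 for the dot op
    c = tl.dot(a, b)
                 ^
test-warning.py:24:18: note: see current operation: %39 = tt.dot %37, %38, %cst, inputPrecision = tf32 :
"""
diags = parse_diagnostics(sample)
```

A test could then assert on `diags[0]["line"]` and `diags[0]["severity"]` rather than matching the whole message string.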

Jokeren · 2024-05-15T17:20:42Z

Where is the flag "-Wno-perf-warnings" coming from?

Jokeren · 2024-05-15T17:21:47Z

Adds additional warnings obtained when compiling from LLVM to assembly (e.g., register spillage, ptx performance warnings, etc.).

Can you use remark to emit ptx warnings? Maybe not?

ptillet · 2024-05-16T14:34:08Z

python/tutorials/test-warning.py

We shouldn't have tests as tutorials :) Could you create a unit test that exercises the new codepaths and checks for the output on std::cerr?

manman-ren · 2024-05-16T15:49:40Z

Thanks for the comments! As described in the summary:

Where should we add the test case to detect the diagnostics? I can add an MLIR test case with -verify-diagnostics and expected-remark. Right now, I am adding a test case under python/tutorial temporarily; we will need a way to verify that diagnostics are emitted at the right source line.
@ptillet Yeah, I added it to the tutorials temporarily for discussion. I will move it to the unit directory, where it will be tested via pytest, and I will try to figure out how to check against stdout there. If you have a test case that does a similar thing, that would be great!
Do we want Remarks or Warnings? For the LLVM backend, remarks may work better.
Support warning flags or use env variables? It is not clear to me how to support warning flags when building the .py source code.
@Jokeren If we want to go with warning flags, I need to figure out how to support -Wno-perf-warnings when building Python code. Again, if anyone has any pointers, that would be great.

Can you use remark to emit ptx warnings? Maybe not?

You mean warnings from ptxas, not sure how to get them. But we can emit remarks when lowering from llvm to ptx.

CC @joker-eph: if you have any suggestions on some of these questions, that would be great. Thanks!

Jokeren · 2024-05-16T16:56:05Z

You mean warnings from ptxas, not sure how to get them. But we can emit remarks when lowering from llvm to ptx.

My worry was that some warnings might be emitted when going from ptx to sass. I'm OK with either remarks or warnings. I think remarks actually seem more informative.

joker-eph · 2024-05-16T21:51:13Z

You mean warnings from ptxas, not sure how to get them.

You could capture the output of ptxas and regex/pattern match the output to catch these and report them as remarks?
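
A minimal sketch of that idea, assuming the captured `ptxas -v` text reports spills in lines of the form "N bytes spill stores, M bytes spill loads". The helper name `spill_remarks`, the remark wording, and the sample text are all illustrative assumptions, not output or code from this PR.

```python
import re

# Assumed shape of a `ptxas -v` spill report line; adjust if your toolkit prints differently.
SPILL_RE = re.compile(
    r"(?P<stores>\d+) bytes spill stores, (?P<loads>\d+) bytes spill loads"
)


def spill_remarks(ptxas_output):
    """Return remark strings for every line reporting nonzero register spills."""
    remarks = []
    for line in ptxas_output.splitlines():
        match = SPILL_RE.search(line)
        if match is None:
            continue
        stores, loads = int(match["stores"]), int(match["loads"])
        if stores or loads:
            remarks.append(
                f"remark: register spilling detected "
                f"({stores} bytes stored, {loads} bytes loaded)"
            )
    return remarks


# Illustrative text only, not output captured from this PR.
sample_output = """ptxas info    : Compiling entry function 'kernel' for 'sm_90'
ptxas info    : Function properties for kernel
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Function properties for other_kernel
    64 bytes stack frame, 48 bytes spill stores, 52 bytes spill loads
"""
remarks = spill_remarks(sample_output)
```

Only the second function in the sample produces a remark, because the first reports zero spill bytes.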

ptillet · 2024-05-17T02:36:52Z

Where should we add the test case to detect the diagnostics

test/unit/warnings.py sounds like a good place

I will try to figure out how to check against stdout there

Just to be clear, this should check against stderr

manman-ren · 2024-05-17T15:40:50Z

+    out_messages = out_capture.getvalue()
+    sys.stderr = sys.__stderr__
+    sys.stdout = sys.__stdout__
+    print(error_messages)


This test case is not working currently. I haven't figured out how to capture the stderr in a string yet. Tried to redirect stderr, but it may be captured by pytest? The message shows:
--------------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------------
python/test/unit/test_warning.py:21:18: remark: Warning: can't use MMA V3 for the dot op
    c = tl.dot(a, b)
                 ^
python/test/unit/test_warning.py:21:18: note: see current operation: %46 = tt.dot ...

I will clean this up once it is solved. @htyu Any suggestions here? Thanks!
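
One likely reason reassigning `sys.stderr` does not help: the remark is printed by native code straight to file descriptor 2, which bypasses the Python-level `sys.stderr` object. Below is a minimal sketch of capturing at the descriptor level. The helper `capture_stderr_fd` is hypothetical; inside pytest, the built-in `capfd` fixture provides descriptor-level capture and is probably the simpler choice.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_stderr_fd():
    """Redirect file descriptor 2 to a temp file; the yielded list receives the text."""
    captured = []
    sys.stderr.flush()          # push out anything Python has buffered first
    saved_fd = os.dup(2)        # remember where fd 2 pointed
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)
        try:
            yield captured
        finally:
            os.dup2(saved_fd, 2)    # restore the original stderr
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode())


# os.write(2, ...) stands in for a native library writing directly to stderr.
with capture_stderr_fd() as out:
    os.write(2, b"test_warning.py:21:18: remark: Warning: can't use MMA V3 for the dot op\n")

captured_text = out[0]
```

With `capfd`, the equivalent check would read `capfd.readouterr().err` after triggering compilation.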

manman-ren · 2024-05-17T15:44:47Z

@ptillet About using warning flags such as -Wno-perf-warnings or -Werror: I feel they are not supported in the Python workflow ("python test.py"), right? So is the suggestion to supply the flags to triton-opt and still use env variables for the Python workflow?
CC @joker-eph in case Mehdi has some comments.

htyu · 2024-05-24T17:53:58Z

+            srcMgr = llvm.source_mgr()
+            diag = ir.source_mgr_diag(srcMgr, mod.context)
+            mod.context.printOpOnDiagnostic(True)
+            #mod.context.printStackTraceOnDiagnostic(True)


nit: remove the comment?

Actually I think instead we may want to add an enable_remarks API similar to enable_debug:

triton/python/src/ir.cc

Line 1552 in 74ad278

.def("enable_debug",

@htyu Thanks for the review! Sorry for the delay. I tried to wrap things in enable_remarks:
as part of Context

 py::class_<MLIRContext>(m, "context", py::module_local())
     .def(py::init<>())
+    .def("enable_remark",
+         [](MLIRContext &self) {
+           auto srcMgr = llvm::SourceMgr();
+           auto diag = SourceMgrDiagnosticHandler(srcMgr, &self);
+           self.printOpOnDiagnostic(true);
+         })

or as part of PassManager

 py::class_<PassManager>(m, "pass_manager", py::module_local())
     .def(py::init<MLIRContext *>())
+    .def("enable_remark",
+         [](PassManager &self) {
+           auto *context = self.getContext();
+           auto srcMgr = llvm::SourceMgr();
+           auto diag = SourceMgrDiagnosticHandler(srcMgr, context);
+           context->printOpOnDiagnostic(true);
+         })

Then call from make_ttgir:
with either
mod.context.enable_remark()
or
pm.enable_remark()

Neither of these worked. I think it is due to the lifetime of SourceMgr and SourceMgrDiagnosticHandler, which are locals destroyed when the lambda returns.

Thanks for giving it a shot. Currently changes look good to me.

manman-ren · 2024-06-13T23:21:36Z

The tests are failing. I need to only enable the test for H100 with the env variable.

Jokeren · 2024-06-19T01:12:48Z

            cluster_info.clusterDimY = opt.cluster_dims[1]
            cluster_info.clusterDimZ = opt.cluster_dims[2]
+        # Set up Diagnostic
+        if os.environ.get("MLIR_ENABLE_REMARK", "0") == "1":


Can you document in README?
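
For reference, the check in the hunk above enables remarks only when the variable is exactly the string "1". A minimal sketch of that behavior follows; the helper `remarks_enabled` is hypothetical and is written to take a mapping so it can be exercised without touching the real environment.

```python
import os


def remarks_enabled(environ=os.environ):
    """Mirror the check in the hunk above: only the exact string "1" enables remarks."""
    return environ.get("MLIR_ENABLE_REMARK", "0") == "1"


# Plain dicts stand in for os.environ here.
enabled = remarks_enabled({"MLIR_ENABLE_REMARK": "1"})
unset = remarks_enabled({})
other_value = remarks_enabled({"MLIR_ENABLE_REMARK": "true"})
```

In a shell this corresponds to running something like `MLIR_ENABLE_REMARK=1 python test.py`; values such as "true" or "yes" would not enable the remarks under this check.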

manman-ren requested a review from ptillet as a code owner May 15, 2024 16:18

manman-ren marked this pull request as draft May 15, 2024 16:18

manman-ren requested review from Jokeren, bertmaher and htyu May 15, 2024 16:22

manman-ren commented May 17, 2024

manman-ren mentioned this pull request May 21, 2024

Enable remarks for ttgir lowering with SourceMgrDiagnosticHandler #3835

Closed

htyu approved these changes May 24, 2024

manman-ren force-pushed the perf-warning branch from 88d4b32 to c9949f5 Compare June 13, 2024 21:13

manman-ren marked this pull request as ready for review June 13, 2024 21:26

mren2 added 9 commits June 18, 2024 16:17

Support performance warning (de63d79)
update test case (721d249)
fix test case (6ec90f0)
precommit hook (02d9551)
rebase + address review comments (fa4599e)
fix (b7df280)
fix test case (a6e2ec3)
fix test case (743b82c)
fix test case (a31db2f)

manman-ren force-pushed the perf-warning branch from 1e24ab9 to a31db2f Compare June 18, 2024 23:17

Jokeren reviewed Jun 19, 2024

add comment in README (4f0a28e)

Jokeren approved these changes Jun 19, 2024

manman-ren merged commit 75b0321 into triton-lang:main Jun 19, 2024

jlebar mentioned this pull request Jun 25, 2024

Pass repr=key when calling JITFunction.cache_hook #4207

Closed

sfzhu93 mentioned this pull request Nov 25, 2024

[WIP][Performance Remarks] Support MLIR_ENABLE_REMARK for command line tools #5250

Closed
