-
Notifications
You must be signed in to change notification settings - Fork 14
Dumping IR after every optimization pass #528
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is nice in general to have the ability to dump LLVM IR. The existing IR channel wouldn't be very convenient if we dumped IR after each pass into the same file. On the other hand, it has a nice feature of creating a new file on each test run and it has a static counter so that I can easily find a proper module in logs.
How would these new dumps work in this situation? A simple scenario I'd like to have is to run the same test with different flags and then compare generated IRs.
omniscidb/Shared/Config.h
Outdated
@@ -83,6 +83,7 @@ struct CodegenConfig { | |||
bool null_mod_by_zero = false; | |||
bool hoist_literals = true; | |||
bool enable_filter_function = true; | |||
short dump_after_all_severity{0}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd propose to move it to debug config structure. Also, rename it to dump_llvm_ir_after_each_pass
or similar, simple dump
is too generic since we have different IRs on various levels.
llvm::Any IR, | ||
const llvm::PreservedAnalyses&) -> void { | ||
std::error_code ec; | ||
llvm::raw_fd_ostream os("IR_AFTER_" + std::to_string(pass_counter++) + "_" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it rewrite or append?
@@ -1001,6 +1001,13 @@ int main(int argc, char* argv[]) { | |||
"Dump IR and PTX for all executed queries to file." | |||
" Currently only supports single node tests."); | |||
|
|||
desc.add_options()("dump-after-all", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have it in ConfigBuilder
so that it would be more widely available? It was supposed to replace all these options in each separate test binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, two minor comments
opt_desc.add_options()("dump-after-all", | ||
po::value<short>(&config_->debug.dump_llvm_ir_after_each_pass) | ||
->default_value(config_->debug.dump_llvm_ir_after_each_pass) | ||
->implicit_value(0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expect the implicit value to be a non-zero value because if you specify the option it means you want the dumps to be generated.
size_t pass_counter{1}; | ||
std::string ir_dump_dir{}; | ||
if (co.dump_llvm_ir_after_each_pass) { | ||
work_unit_meta.w_unit_counter++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going to happen in the heterogeneous mode when you potentially have two compilations running simultaneously? Should the work_unit_meta
live in the work unit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the timestamp can be saved in the module for debug purposes and then you can ask for it in optimize_ir()
. But there are lots of ways to get into optimize_ir()
, so where do you embed the timestamp into the module?
E.g., you can embed the timestamp in Executor::compileWorkUnit()
, but CPU codegen path might go through Executor::reduceMultiDeviceResultSets() -> ResultSetReductionJIT::codegen() -> ...
, omitting Executor::compileWorkUnit()
, so how many other places should have code that embeds the timestamp? Right now it seems to me that this is the least amount of code that does the job, or am I missing something?
1a4d539
to
037f2ef
Compare
037f2ef
to
78a2a99
Compare
This PR introduces a new config flag
dump-after-all
, which when set to1
will dump module-level IR before and after optimizations (IR_UNOPT
/IR_OPT
). When set to2
will dump IR before optimizations, after each transformation pass (IR_AFTER_#passCounter_#passName
) and after all transformations.The intent behind this feature was to make the debugging of problematic queries easier.
With the introduction of the new pass builder and pass managers (e.g., FunctionPassManager), the transformation passes can now run more often, but on a smaller granularity scale (e.g., earlier it was one pass for all functions, now it is one pass per function). This way we cannot "summarize" passes (e.g., after GVN, after SROAP), since now the entire functions transformation pipeline runs for each function fully, so expect many (really many) files for even seemingly simple queries when the severity is set to
2
.The example usage is in
IntelGPUEnablingTest
.You can call it like e.g.,
./IntelGPUEnablingTest --gtest_filter=JoinTest.SimpleJoin --dump-after-all=1