-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[llvm] Add API to construct a benchmark from clang invocation. #577
[llvm] Add API to construct a benchmark from clang invocation. #577
Conversation
Codecov Report
@@ Coverage Diff @@
## development #577 +/- ##
===============================================
+ Coverage 88.36% 88.54% +0.17%
===============================================
Files 125 127 +2
Lines 7565 7761 +196
===============================================
+ Hits 6685 6872 +187
- Misses 880 889 +9
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super Cool!
I am wondering is that environment compatible with LlvmEnv
or GccEnv
? I mean can we directly call the existing observation or action functions on that environment?
uri = ( | ||
f"benchmark://clang-v0/{urllib.parse.quote_plus(shlex.join(command_line))}" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dataset or benchmark needs to be named clang-v0
?
For example, the command line: | ||
|
||
>>> benchmark = env.make_benchmark_from_command_line( | ||
... ["gcc", "-DNDEBUG", "a.c", "b.c", "-o", "foo", "-lm"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering if "gcc" works here... as down below we add -Xclang
to the arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this function takes an argument replace_driver
(default=True) that swaps out whatever the first argument is for clang. So this will work:
>>> env.make_benchmark_from_command_line(["gcc", "in.c"])
and will get translated to something like:
/path/to/compiler_gym/llvm/bin/clang in.c -c -emit-llvm -o - -Xclang disable-llvm-optmzns
but this will not:
>>> env.make_benchmark_from_command_line(["gcc", "in.c"], replace_driver=False)
The idea with replace_driver=False
is that it allows you to use your own compiler to produce the input bitcode, providing your compiler produces compatible bitcode.
The downside is that it won't work with command line options that aren't the same between whatever compiler the user uses and our version of clang.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll extend the docstring to explain this better
DRIVER_NAMES = ( | ||
"c89", | ||
"c99", | ||
"cc", | ||
"gcc", | ||
"c++", | ||
"g++", | ||
"xgcc", | ||
"clang", | ||
"clang++", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so is this library literary gccinvokcation
or is it also a generic C compiler invocator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've hacked on it to add support for clang by adding support for a few clang-specific flags, though I'm sure I've missed some
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want to change the name of the third-party library, though if my edits work well then I could submit them upstream and consider renaming it to "ccinvocation" or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Quick question though. How does it deal with mixed .c and .o files? E.g.:
gcc foo.c bar.o -o foo
My guess is that bar.o will be considered a source and that clang will barf trying to bitcode-ise it. Should those be filtered out for linking?
1018a74
to
e7c6f44
Compare
Good question! I added support so that The current implementation for this will not work if the CompilerGym backend and frontend are on different machines. To fix that, we need #325. Cheers, |
This adds a new API for constructing a benchmark from a command line invocation of a compiler. For example: >>> benchmark = env.make_benchmark_from_command_line( ... ["gcc", "in.c", "-DNDEBUG"] ... ) This provides control over the build, equivalent to: >>> benchmark = env.make_benchmark(["in.c"], copts=["-NDEBUG"]) The idea is to make it easier to integrate CompilerGym into an existing build system, as it is normally possible to dump a verbose log of all of the compiler invocations run during a build.
This enables commandlines with a combination of source and objectfile inputs to be supported.
eb8754f
to
9318df7
Compare
9318df7
to
8573d81
Compare
This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727! Highlights of this release include: - [mlir] Began work on a new environment for matrix multiplication using MLIR ([#652](#652), thanks @KyleHerndon and @sogartar!). Note this environment is not yet included in the pypi package and must be [compiled from source](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md#building-from-source-with-cmake). - [llvm] Added a new `env.benchmark_from_clang_invocation()` method ([#577](#577)) that can be used for constructing LLVM environment automatically from C/C++ compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts. - Added three new wrapper classes: `Counter`, that provides op counts for analysis ([#683](#683)); `SynchronousSqliteLogger`, that provides logging of environment interactions to a relational database ([#679](#679)), and `ForkOnStep` that provides an `undo()` operation ([#682](#682)). - Added `reward_space` and `observation_space` parameters to `env.reset()` ([#659](#659), thanks @SoumyajitKarmakar!) This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments: - Refactored the backend to make `CompilerEnv` an abstract interface, and `ClientServiceCompilerEnv` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC ([#633](#633), thanks @sogartar!). - Extended the support for different types of action and observation spaces ([#641](#641), [#643](#643), thanks @sogartar!), including new `Permutation` and `SpaceSequence` spaces ([#645](#645), thanks @sogartar!).. - Added a new `disk/` subdirectory to compiler service's working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM ([#672](#672)). This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations ([#656](#656), thanks @uduse!), and a missing flag definition in example code ([#684](#684), thanks @anthony0727!). **Full Changelog**: v0.2.3...v0.2.4 This release brings in deprecating changes to the core `env.step()` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey! Highlights of this release include: - Added a new `TextSizeInBytes` observation space for LLVM ([#575](#575)). * Added a new PPO leaderboard entry ([#580](#580). Thanks @xtremey! - Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up ([#592](#592)). - **[Backend]** The function `createAndRunCompilerGymService` now returns an int, which is the exit return code ([#592](#592)). - Improvements to the examples documentation ([#548](#548)) and FAQ ([#586](#586)) Deprecations and breaking changes: - `CompilerEnv.step` no longer accepts a list of actions ([#627](#627)). A new method, `CompilerEnv.multistep` provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to `env.step()` which take a list of actions to use `env.multistep()`. Thanks @sogartar! - The arguments `observations` and `rewards` to `step()` have been renamed `observation_spaces` and `reward_spaces`, respectively ([#627](#627)). - `Reward.id` has been renamed `Reward.name` ([#565](#565), [#612](#612)). Thanks @parthchadha! * The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces ([#531](#531)). Thanks @sogartar!
This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727! Highlights of this release include: - [mlir] Began work on a new environment for matrix multiplication using MLIR ([#652](#652), thanks @KyleHerndon and @sogartar!). Note this environment is not yet included in the pypi package and must be [compiled from source](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md#building-from-source-with-cmake). - [llvm] Added a new `env.benchmark_from_clang_invocation()` method ([#577](#577)) that can be used for constructing LLVM environment automatically from C/C++ compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts. - Added three new wrapper classes: `Counter`, that provides op counts for analysis ([#683](#683)); `SynchronousSqliteLogger`, that provides logging of environment interactions to a relational database ([#679](#679)), and `ForkOnStep` that provides an `undo()` operation ([#682](#682)). - Added `reward_space` and `observation_space` parameters to `env.reset()` ([#659](#659), thanks @SoumyajitKarmakar!) This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments: - Refactored the backend to make `CompilerEnv` an abstract interface, and `ClientServiceCompilerEnv` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC ([#633](#633), thanks @sogartar!). - Extended the support for different types of action and observation spaces ([#641](#641), [#643](#643), thanks @sogartar!), including new `Permutation` and `SpaceSequence` spaces ([#645](#645), thanks @sogartar!).. - Added a new `disk/` subdirectory to compiler service's working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM ([#672](#672)). This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations ([#656](#656), thanks @uduse!), and a missing flag definition in example code ([#684](#684), thanks @anthony0727!). **Full Changelog**: v0.2.3...v0.2.4 This release brings in deprecating changes to the core `env.step()` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey! Highlights of this release include: - Added a new `TextSizeInBytes` observation space for LLVM ([#575](#575)). * Added a new PPO leaderboard entry ([#580](#580). Thanks @xtremey! - Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up ([#592](#592)). - **[Backend]** The function `createAndRunCompilerGymService` now returns an int, which is the exit return code ([#592](#592)). - Improvements to the examples documentation ([#548](#548)) and FAQ ([#586](#586)) Deprecations and breaking changes: - `CompilerEnv.step` no longer accepts a list of actions ([#627](#627)). A new method, `CompilerEnv.multistep` provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to `env.step()` which take a list of actions to use `env.multistep()`. Thanks @sogartar! - The arguments `observations` and `rewards` to `step()` have been renamed `observation_spaces` and `reward_spaces`, respectively ([#627](#627)). - `Reward.id` has been renamed `Reward.name` ([#565](#565), [#612](#612)). Thanks @parthchadha! * The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces ([#531](#531)). Thanks @sogartar!
This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727! Highlights of this release include: - [mlir] Began work on a new environment for matrix multiplication using MLIR ([#652](#652), thanks @KyleHerndon and @sogartar!). Note this environment is not yet included in the pypi package and must be [compiled from source](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md#building-from-source-with-cmake). - [llvm] Added a new `env.benchmark_from_clang_invocation()` method ([#577](#577)) that can be used for constructing LLVM environment automatically from C/C++ compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts. - Added three new wrapper classes: `Counter`, that provides op counts for analysis ([#683](#683)); `SynchronousSqliteLogger`, that provides logging of environment interactions to a relational database ([#679](#679)), and `ForkOnStep` that provides an `undo()` operation ([#682](#682)). - Added `reward_space` and `observation_space` parameters to `env.reset()` ([#659](#659), thanks @SoumyajitKarmakar!) This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments: - Refactored the backend to make `CompilerEnv` an abstract interface, and `ClientServiceCompilerEnv` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC ([#633](#633), thanks @sogartar!). - Extended the support for different types of action and observation spaces ([#641](#641), [#643](#643), thanks @sogartar!), including new `Permutation` and `SpaceSequence` spaces ([#645](#645), thanks @sogartar!).. - Added a new `disk/` subdirectory to compiler service's working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM ([#672](#672)). This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations ([#656](#656), thanks @uduse!), and a missing flag definition in example code ([#684](#684), thanks @anthony0727!). **Full Changelog**: v0.2.3...v0.2.4 This release brings in deprecating changes to the core `env.step()` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey! Highlights of this release include: - Added a new `TextSizeInBytes` observation space for LLVM ([#575](#575)). * Added a new PPO leaderboard entry ([#580](#580). Thanks @xtremey! - Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up ([#592](#592)). - **[Backend]** The function `createAndRunCompilerGymService` now returns an int, which is the exit return code ([#592](#592)). - Improvements to the examples documentation ([#548](#548)) and FAQ ([#586](#586)) Deprecations and breaking changes: - `CompilerEnv.step` no longer accepts a list of actions ([#627](#627)). A new method, `CompilerEnv.multistep` provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to `env.step()` which take a list of actions to use `env.multistep()`. Thanks @sogartar! - The arguments `observations` and `rewards` to `step()` have been renamed `observation_spaces` and `reward_spaces`, respectively ([#627](#627)). - `Reward.id` has been renamed `Reward.name` ([#565](#565), [#612](#612)). Thanks @parthchadha! * The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces ([#531](#531)). Thanks @sogartar!
This adds a new API for constructing a benchmark from a command line invocation of a compiler. For example, given the command line:
The
make_benchmark_from_command_line()
method interprets this command line and rewrites it to one that emits an LLVM-IR file:It then loads the resulting bitcode for use as a benchmark. When the user is done playing around with the benchmark, they call a special
compiler()
method on the benchmark which re-invokes the original command line, but this time using the bitcode as input:Putting this all together, here is an end-to-end example using CompilerGym as a substitute for LLVM's pipeline:
The idea is to make it easier to integrate CompilerGym into an existing build system, as it is normally possible to dump a verbose log of all of the compiler invocations run during a build.