[leaderboard] Move leaderboard utility modules into compiler_gym namespace #161

Merged · 6 commits · Mar 31, 2021
34 changes: 20 additions & 14 deletions CONTRIBUTING.md
@@ -37,6 +37,26 @@ We actively welcome your pull requests.
("CLA").


## Leaderboard Submissions

To add a new result to the leaderboard, add a new entry to the leaderboard table
and file a [Pull Request](#pull-requests). Please include:

1. A list of all authors.
2. A CSV file of your results. The
[compiler_gym.leaderboard](https://facebookresearch.github.io/CompilerGym/compiler_gym/leaderboard.html)
package provides utilities to help generate results using your agent.
3. A write-up of your approach. You may use the
[submission template](/leaderboard/SUBMISSION_TEMPLATE.md) as a guide.

We do not require that you submit the source code for your approach. Once you
submit your pull request, we will validate your results CSV files and may ask
clarifying questions where that would help reproducibility. See
[this pull request](https://github.com/facebookresearch/CompilerGym/pull/127)
for an example of a well-formed submission.
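As a sketch of what a submission script's policy looks like (hypothetical: `my_policy` is a placeholder for your agent, and the type annotation is omitted so the sketch stands alone without the `compiler_gym` package installed):

```python
# A trivial policy in the shape the leaderboard helper expects: a callable
# that takes an environment and mutates it in place. In a real submission
# the argument is a compiler_gym.envs.LlvmEnv.

def my_policy(env) -> None:
    """Take up to 10 random steps (a trivial baseline, not a real agent)."""
    for _ in range(10):
        # Apply a random action; stop early if the episode ends.
        _, _, done, _ = env.step(env.action_space.sample())
        if done:
            break
```

Passing `my_policy` to `eval_llvm_instcount_policy` from the `compiler_gym.leaderboard.llvm_instcount` module then evaluates it over the benchmark suite and writes the results CSV to attach to your pull request.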


## Code Style

We want to ease the burden of code formatting using tools. Our code style
@@ -62,20 +82,6 @@ Other common sense rules we encourage are:
easy-to-write.


## Leaderboard Submissions

To add a new result to the leaderboard, add a new entry to the leaderboard table
and file a [Pull Request](#pull-requests). Please include:

1. A list of all authors.
2. A CSV file of your results.
3. A write-up of your approach. You may use the
[submission template](/leaderboard/SUBMISSION_TEMPLATE.md) as a guide.

Please [take a look
here](https://github.com/facebookresearch/CompilerGym/pull/127) for an example
of a well-formed pull request submission.

## Contributor License Agreement ("CLA")

In order to accept your pull request, we need you to submit a CLA. You
25 changes: 13 additions & 12 deletions README.md
@@ -36,9 +36,11 @@ developers to expose new optimization problems for AI.
- [Installation](#installation)
- [Trying it out](#trying-it-out)
- [Leaderboards](#leaderboards)
- [llvm-ic-v0](#llvm-ic-v0)
- [LLVM Instruction Count](#llvm-instruction-count)
- [Contributing](#contributing)
- [Citation](#citation)


# Getting Started

Starting with CompilerGym is simple. If you are not already familiar with the gym
@@ -158,24 +160,23 @@ CompilerGym tasks. To submit a result please see
[this document](https://github.com/facebookresearch/CompilerGym/blob/development/CONTRIBUTING.md#leaderboard-submissions).


## llvm-ic-v0

LLVM is a popular open source compiler used widely in industry and research.
This environment exposes the optimization pipeline as a set of actions that can
be applied to a particular program. The goal of the agent is to select the
sequence of optimizations that lead to the greatest reduction in instruction
count in the program being compiled. Reward is the reduction in codesize
achieved scaled to the reduction achieved by LLVM's builtin `-Oz` pipeline.
## LLVM Instruction Count

### cBench-v1 <!-- omit in toc -->
LLVM is a popular open source compiler used widely in industry and research. The
`llvm-ic-v0` environment exposes LLVM's optimizing passes as a set of actions
that can be applied to a particular program. The goal of the agent is to select
the sequence of optimizations that lead to the greatest reduction in instruction
count in the program being compiled. Reward is the reduction in instruction
count achieved scaled to the reduction achieved by LLVM's builtin `-Oz`
pipeline.
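The reward definition above can be sketched as a small function (an illustrative sketch only; the names and the exact normalization are assumptions, not CompilerGym API):

```python
def instcount_reward(uncompiled: int, after_oz: int, after_agent: int) -> float:
    """Fraction of -Oz's instruction-count reduction achieved by the agent.

    A value of 1.0 matches -Oz; values above 1.0 beat it. Hypothetical
    helper for illustration, not part of the CompilerGym API.
    """
    oz_reduction = uncompiled - after_oz
    if oz_reduction <= 0:
        return 0.0  # -Oz made no improvement; avoid division by zero
    return (uncompiled - after_agent) / oz_reduction
```

For example, if a program has 1000 instructions, `-Oz` reduces it to 800, and the agent reaches 790, the agent scores 1.05× relative to `-Oz`.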

This leaderboard tracks the results achieved by algorithms on the `llvm-ic-v0`
environment on the 23 benchmarks in the `cBench-v1` dataset.

| Author | Algorithm | Links | Date | Walltime (mean) | Codesize Reduction (geomean) |
| --- | --- | --- | --- | --- | --- |
| Facebook | Greedy search | [write-up](leaderboard/llvm_codesize/e_greedy/README.md), [results](leaderboard/llvm_codesize/e_greedy/results_e0.csv) | 2021-03 | 169.237s | 1.055× |
| Facebook | e-Greedy search (e=0.1) | [write-up](leaderboard/llvm_codesize/e_greedy/README.md), [results](leaderboard/llvm_codesize/e_greedy/results_e10.csv) | 2021-03 | 152.579s | 1.041× |
| Facebook | Greedy search | [write-up](leaderboard/llvm_instcount/e_greedy/README.md), [results](leaderboard/llvm_instcount/e_greedy/results_e0.csv) | 2021-03 | 169.237s | 1.055× |
| Facebook | e-Greedy search (e=0.1) | [write-up](leaderboard/llvm_instcount/e_greedy/README.md), [results](leaderboard/llvm_instcount/e_greedy/results_e10.csv) | 2021-03 | 152.579s | 1.041× |
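The "geomean" column in the table above aggregates per-benchmark reduction ratios using the geometric mean, which is the standard way to average ratios. A minimal sketch (illustrative, not CompilerGym code):

```python
import math

def geomean(ratios):
    """Geometric mean of per-benchmark codesize reduction ratios."""
    assert ratios, "need at least one ratio"
    # Averaging in log space is numerically stable for products of ratios.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))
```

Unlike the arithmetic mean, the geometric mean is not skewed by a single benchmark with an unusually large ratio.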

# Contributing

3 changes: 2 additions & 1 deletion compiler_gym/BUILD
@@ -15,14 +15,15 @@ py_library(
":random_search",
":validate",
"//compiler_gym/envs",
"//compiler_gym/leaderboard",
"//compiler_gym/util",
],
)

py_library(
name = "compiler_env_state",
srcs = ["compiler_env_state.py"],
visibility = ["//compiler_gym/envs:__subpackages__"],
visibility = ["//compiler_gym:__subpackages__"],
)

py_library(
26 changes: 26 additions & 0 deletions compiler_gym/leaderboard/BUILD
@@ -0,0 +1,26 @@
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
load("@rules_python//python:defs.bzl", "py_library")

py_library(
name = "leaderboard",
srcs = ["__init__.py"],
visibility = ["//visibility:public"],
deps = [
":llvm_instcount",
],
)

py_library(
name = "llvm_instcount",
srcs = ["llvm_instcount.py"],
visibility = ["//visibility:public"],
deps = [
"//compiler_gym:compiler_env_state",
"//compiler_gym/bin:validate",
"//compiler_gym/envs",
"//compiler_gym/util",
],
)
17 changes: 17 additions & 0 deletions compiler_gym/leaderboard/__init__.py
@@ -0,0 +1,17 @@
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
"""This package contains modules that can be used for preparing leaderboard
submissions.

We provide `leaderboards
<https://github.com/facebookresearch/CompilerGym#leaderboards>`_ to track the
performance of user-submitted algorithms on compiler optimization tasks. The
goal of the leaderboards is to provide a venue for researchers to promote their
work, and to provide a common framework for evaluating and comparing different
approaches. We accept submissions to the leaderboards through pull requests; see
`here
<https://facebookresearch.github.io/CompilerGym/contributing.html#leaderboard-submissions>`_
for instructions.
"""
@@ -2,20 +2,30 @@
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
"""This module defines a helper function for evaluating LLVM codesize reduction
policies.

Usage:

from compiler_gym.envs import LlvmEnv
from eval_policy import eval_policy

class MyLlvmCodesizePolicy:
def __call__(env: LlvmEnv) -> None:
pass # ...

if __name__ == "__main__":
eval_policy(MyLlvmCodesizePolicy())
"""LLVM is a popular open source compiler used widely in industry and research.
The :code:`llvm-ic-v0` environment exposes LLVM's optimizing passes as a set of
actions that can be applied to a particular program. The goal of the agent is to
select the sequence of optimizations that lead to the greatest reduction in
instruction count in the program being compiled. Reward is the reduction in
instruction count achieved scaled to the reduction achieved by LLVM's builtin
:code:`-Oz` pipeline.

+--------------------+------------------------------------------------------+
| Property | Value |
+====================+======================================================+
| Environment | :class:`LlvmEnv <compiler_gym.envs.LlvmEnv>`. |
+--------------------+------------------------------------------------------+
| Observation Space | Any. |
+--------------------+------------------------------------------------------+
| Reward Space | Instruction count reduction relative to :code:`-Oz`. |
+--------------------+------------------------------------------------------+
| Test Dataset | The 23 cBench benchmarks. |
+--------------------+------------------------------------------------------+

Users who wish to create a submission for this leaderboard may use
:func:`eval_llvm_instcount_policy()
<compiler_gym.leaderboard.llvm_instcount.eval_llvm_instcount_policy>` to
automatically evaluate their agent on the test set.
"""
import platform
import sys
@@ -33,15 +43,15 @@ def __call__(env: LlvmEnv) -> None:
from cpuinfo import get_cpu_info
from tqdm import tqdm

import compiler_gym # noqa Register environments.
from compiler_gym import CompilerEnvState
import compiler_gym.envs # noqa Register environments.
from compiler_gym.bin.validate import main as validate
from compiler_gym.compiler_env_state import CompilerEnvState
from compiler_gym.envs import LlvmEnv
from compiler_gym.util.tabulate import tabulate
from compiler_gym.util.timer import Timer

flags.DEFINE_string(
"logfile", "results.csv", "The path of the file to write results to."
"results_logfile", "results.csv", "The path of the file to write results to."
)
flags.DEFINE_string(
"hardware_info",
@@ -63,8 +73,8 @@ def __call__(env: LlvmEnv) -> None:
flags.DEFINE_boolean(
"resume",
False,
"If true, read the --logfile first and run only the policy evaluations not "
"already in the logfile.",
"If true, read the --results_logfile first and run only the policy "
"evaluations not already in the logfile.",
)
FLAGS = flags.FLAGS

@@ -132,7 +142,7 @@ def __init__(self, env, benchmarks, policy, print_header):
self.n = 0

def run(self):
with open(FLAGS.logfile, "a") as logfile:
with open(FLAGS.results_logfile, "a") as logfile:
for benchmark in self.benchmarks:
self.env.reset(benchmark=benchmark)
with Timer() as timer:
@@ -163,19 +173,95 @@ def run(self):
self.n += 1


def eval_policy(policy: Policy) -> None:
"""Evaluate a policy on a target dataset.
def eval_llvm_instcount_policy(policy: Policy) -> None:
"""Evaluate an LLVM codesize policy and generate results for a leaderboard
submission.

To use it, you define your policy as a function that takes an
:class:`LlvmEnv <compiler_gym.envs.LlvmEnv>` instance as input and modifies
it in place. For example, for a trivial random policy:

>>> from compiler_gym.envs import LlvmEnv
>>> def my_policy(env: LlvmEnv) -> None:
...     # Defines a policy that takes 10 random steps.
... for _ in range(10):
... _, _, done, _ = env.step(env.action_space.sample())
... if done: break

If your policy is stateful, you can use a class and override the
:code:`__call__()` method:

>>> class MyPolicy:
... def __init__(self):
... self.my_stateful_vars = {} # or similar
... def __call__(self, env: LlvmEnv) -> None:
... pass # ... do fun stuff!
>>> my_policy = MyPolicy()

The role of your policy is to perform a sequence of actions on the supplied
environment so as to maximize cumulative reward. By default, no observation
space is set on the environment, so :meth:`env.step()
<compiler_gym.envs.CompilerEnv.step>` will return :code:`None` for the
observation. You may set a new observation space:

>>> env.observation_space = "InstCount" # Set a new space for env.step()
>>> env.observation["InstCount"] # Calculate a one-off observation.

However, the policy may not change the reward space of the environment, or
the benchmark.

Once you have defined your policy, call the
:func:`eval_llvm_instcount_policy()
<compiler_gym.leaderboard.llvm_instcount.eval_llvm_instcount_policy>` helper
function, passing it your policy as its only argument:

>>> eval_llvm_instcount_policy(my_policy)

Put together as a complete example, a leaderboard submission script may look
like:

.. code-block:: python

# my_policy.py
from compiler_gym.leaderboard.llvm_instcount import eval_llvm_instcount_policy
from compiler_gym.envs import LlvmEnv

def my_policy(env: LlvmEnv) -> None:
env.observation_space = "InstCount" # we're going to use instcount space
pass # ... do fun stuff!

if __name__ == "__main__":
eval_llvm_instcount_policy(my_policy)

The :func:`eval_llvm_instcount_policy()
<compiler_gym.leaderboard.llvm_instcount.eval_llvm_instcount_policy>` helper
defines a number of command-line flags that can be overridden to control the
behavior of the evaluation. For example, the flag :code:`--n` determines the
number of times the policy is run on each benchmark (default is 10), and
:code:`--results_logfile` determines the path of the generated results file:

.. code-block::

$ python my_policy.py --n=5 --results_logfile=my_policy_results.csv

You can use the :code:`--helpfull` flag to list all of the flags that are
defined:

.. code-block::

$ python my_policy.py --helpfull

A policy is a function that takes as input an LlvmEnv environment and
performs a set of actions on it.
Once you are happy with your approach, see the `contributing guide
<https://github.com/facebookresearch/CompilerGym/blob/development/CONTRIBUTING.md#leaderboard-submissions>`_
for instructions on preparing a submission to the leaderboard.
"""

def main(argv):
assert len(argv) == 1, f"Unknown args: {argv[1:]}"
assert FLAGS.n > 0, "n must be > 0"

print(
f"Writing inference results to '{FLAGS.logfile}' and "
f"Writing inference results to '{FLAGS.results_logfile}' and "
f"hardware summary to '{FLAGS.hardware_info}'"
)

@@ -201,16 +287,16 @@ def main(argv):
# of benchmarks to evaluate.
print_header = True
init = 0
if Path(FLAGS.logfile).is_file():
if Path(FLAGS.results_logfile).is_file():
if FLAGS.resume:
with open(FLAGS.logfile, "r") as f:
with open(FLAGS.results_logfile, "r") as f:
for state in CompilerEnvState.read_csv_file(f):
if state.benchmark in benchmarks:
init += 1
benchmarks.remove(state.benchmark)
print_header = False
else:
Path(FLAGS.logfile).unlink()
Path(FLAGS.results_logfile).unlink()

# Run the benchmark loop in background so that we can asynchronously
# log progress.
@@ -226,6 +312,6 @@

if FLAGS.validate:
FLAGS.env = "llvm-ic-v0"
validate(["argv0", FLAGS.logfile])
validate(["argv0", FLAGS.results_logfile])

app.run(main)
10 changes: 10 additions & 0 deletions docs/source/compiler_gym/leaderboard.rst
@@ -0,0 +1,10 @@
compiler_gym.leaderboard
========================

.. automodule:: compiler_gym.leaderboard

LLVM Instruction Count
----------------------

.. automodule:: compiler_gym.leaderboard.llvm_instcount
:members:
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -31,6 +31,7 @@ for applying reinforcement learning to compiler optimizations.
compiler_gym/datasets
compiler_gym/envs
llvm/api
compiler_gym/leaderboard
compiler_gym/service
compiler_gym/spaces
compiler_gym/views