[llvm_instcount] Leaderboard Submission: DQN trained on test set #292
…csv file along with writeup to leaderboard/llvm_instcount/dqn
Hi @phesse001, nice submission!
Could you add a little bit of an explanation to the "What range of values were considered for the above parameters?" section of the writeup to describe how you arrived at those values? For example, was it by trial and error, and what was the experimental setup for each "trial"? Just a sentence or two would be enough :)
Other than that, looks great. A few minor nitpick comments left inline.
I have reproduced your results locally, so this is me signing off on your results! Once you've had a chance to take a look at those small things I'd say this is looking good to merge.
$ python -m compiler_gym.bin.validate --env=llvm-ic-v0 CompilerGym/results.csv
✅ cbench-v1/rijndael 1.1027
✅ cbench-v1/adpcm 1.0084
✅ cbench-v1/adpcm 1.0084
✅ cbench-v1/adpcm 1.0084
✅ cbench-v1/ispell 0.9887
✅ cbench-v1/rijndael 1.1027
✅ cbench-v1/rijndael 1.1027
...
✅ cbench-v1/tiffmedian 1.0324
----------------------------------------------------
Number of validated results: 230 of 230
Mean walltime per benchmark: 0.367s (std: 0.781s)
Geometric mean IrInstructionCountOz: 1.029 (std: 0.112)
Cheers,
Chris
Co-authored-by: Chris Cummins <[email protected]>
…t commit rather than branch Co-authored-by: Chris Cummins <[email protected]>
LGTM, thanks @phesse001!
This adds the results of training a DQN on the benchmarks from the test set (cbench-v1) for 4000 episodes using the llvm-ic-v0 environment, with InstCountNorm as the observation passed to the DQN.
The trained model was then saved and loaded to evaluate the policy. Since this is not a general policy, I timed the training process with the time command and got a total of 2085.99s. I averaged this over the 23 benchmarks, getting ~90.700s per benchmark, and added the average walltime of 0.318s from eval_llvm_instcount_policy() to get a total mean walltime of 91.018s. My code for the DQN can be found here
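The walltime accounting described above can be sketched as a short calculation. This is only an illustration of the arithmetic, using the three figures quoted in the description; the function and variable names are hypothetical and are not taken from the submission's code. The unrounded per-benchmark training share comes out slightly below the ~90.700s quoted, so the totals differ in the last digits.

```python
# Figures quoted in the PR description.
TOTAL_TRAINING_SECONDS = 2085.99    # measured with the `time` command
NUM_BENCHMARKS = 23                 # benchmarks in cbench-v1
MEAN_EVAL_WALLTIME_SECONDS = 0.318  # reported by eval_llvm_instcount_policy()


def amortized_walltime(total_train_s, num_benchmarks, mean_eval_s):
    """Spread the training time evenly over the benchmarks, then add the
    mean evaluation walltime (hypothetical helper, for illustration only)."""
    return total_train_s / num_benchmarks + mean_eval_s


train_share = TOTAL_TRAINING_SECONDS / NUM_BENCHMARKS
total = amortized_walltime(
    TOTAL_TRAINING_SECONDS, NUM_BENCHMARKS, MEAN_EVAL_WALLTIME_SECONDS
)

print(f"training share per benchmark: {train_share:.3f}s")  # 90.695s
print(f"mean walltime incl. training: {total:.3f}s")        # 91.013s
```

Amortizing the training time this way reflects that the policy was trained on the test set itself, so the training cost is counted as part of each benchmark's walltime.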
What is in this PR?