Implement optimality tightening #60
Comments
I gave it a shot. However, I am not sure how the discounted reward R is supposed to be used, and I still need to check whether the future and past k-transitions are valid: https://github.com/petrosgk/Atari/tree/opt-tightening
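For anyone following along, here is a minimal sketch of the two points in question, under my reading of the paper. It is not the authors' code and is not taken from the linked branch. All function and argument names (`discounted_returns`, `lower_bounds`, `upper_bounds`, `max_q_next`, `q_taken`) are hypothetical, and it assumes the replay memory can hand back one episode at a time, so a k-step window is treated as valid only if it stays inside that episode. Treating R as one extra lower bound on Q is also an assumption on my part.

```python
def discounted_returns(rewards, gamma):
    """R_t = r_t + gamma * r_{t+1} + ... up to the end of the episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def lower_bounds(rewards, max_q_next, j, K, gamma):
    """Lower bounds on Q(s_j, a_j) from up to K future transitions.

    rewards[t]    : reward of transition t within a single episode
    max_q_next[t] : max_a Q_target(s_{t+1}, a), with 0.0 if s_{t+1} is terminal
    """
    bounds = []
    partial = 0.0
    for k in range(K):
        t = j + k
        if t >= len(rewards):
            break  # window would cross the episode end, so it is not valid
        partial += gamma ** k * rewards[t]
        bounds.append(partial + gamma ** (k + 1) * max_q_next[t])
    return bounds


def upper_bounds(rewards, q_taken, j, K, gamma):
    """Upper bounds on Q(s_j, a_j) from up to K past transitions.

    q_taken[t] : Q_target(s_t, a_t) for the action actually taken at step t
    """
    bounds = []
    for k in range(K):
        start = j - k - 1
        if start < 0:
            break  # window would start before the episode, so it is not valid
        reward_sum = sum(
            gamma ** (i - k - 1) * rewards[start + i] for i in range(k + 1)
        )
        bounds.append(gamma ** (-k - 1) * q_taken[start] - reward_sum)
    return bounds


# Toy episode with three transitions and made-up Q-values.
rewards = [1.0, 1.0, 1.0]
print(discounted_returns(rewards, 0.5))
print(lower_bounds(rewards, [2.0, 2.0, 0.0], j=0, K=4, gamma=0.5))
print(upper_bounds(rewards, [2.0, 2.0, 1.0], j=2, K=4, gamma=0.5))
```

In this sketch the loss would penalise Q(s_j, a_j) for falling below the largest lower bound or rising above the smallest upper bound. Note that when the window reaches a terminal state, the last lower bound equals the discounted return R, which is one way R could enter the picture.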
Awesome - I'll try and have a look soon or next week! Would you be able to test it to try and replicate one of the results from the paper? I started on this myself as well, so will see how our implementations compare.
Hi, have you reproduced the optimality tightening results? I have tried some games using TensorFlow and OpenAI Gym, but my results are much worse than the paper's. I am not sure whether I have misunderstood something or missed some tricks in the paper. It seems that the paper doesn't include every detail of their work.
Does anyone know whether the authors have published the source code for optimality tightening from the paper?
No, they haven't published their code as far as I know. The tricks they use are not hard to implement, but I still cannot match their performance.
I have tried implementing optimality tightening (see earlier post) but the results I get are also much worse than the paper's.
In my experience the smallest details in a paper can be key to reproducing results, and these may be missing or ambiguous. If anyone is reasonably confident in their implementation, you should try contacting one of the authors with specific questions.
Hi guys, Best,
"Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening" potentially speeds up Q-learning by an order of magnitude! Apparently not too hard to implement either.