From a65e80b57180f5bf92cb4a0c5a064eeda79174c9 Mon Sep 17 00:00:00 2001
From: David <119470903+PubliusAu@users.noreply.github.com>
Date: Thu, 25 May 2023 17:25:39 -0500
Subject: [PATCH] Add talk on RLHF with InstructGTP researchers from OpenAI

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 7ca9722..0e97856 100644
--- a/README.md
+++ b/README.md
@@ -94,6 +94,7 @@ Please feel free to [pull requests](https://github.com/aikorea/awesome-rl/pulls)
  - [The Bellman Equations, Dynamic Programming, and Generalized Policy Iteration](https://youtu.be/_j6pvGEchWU)
  - [Monte Carlo And Off-Policy Methods](https://youtu.be/bpUszPiWM7o)
  - [TD Learning, Sarsa, and Q-Learning](https://youtu.be/AJiG3ykOxmY)
+ - [OpenAI Researchers on Reinforcement Learning with Human Feedback](https://www.youtube.com/watch?v=RkFS6-GwCxE)
 
 ### Books
 - Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction (1st Edition, 1998) [[Book]](http://incompleteideas.net/book/ebook/the-book.html) [[Code]](http://incompleteideas.net/book/code/code.html)