epsilon is set incorrectly in gridworld/q-learning_ex (p. 146) #5
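In the gridworld/q-learning_ex (p. 146) code,
epsilon is set to 0.9, and a random action is returned whenever numpy.random.rand() is less than epsilon.
In other words, "action chosen from the Q-function" versus "random action" occurs at a ratio of 1:9, so the agent explores far too much.
The earlier examples all set epsilon to 0.1. I have changed this value to match and am submitting it as a pull request.
I am enjoying studying with the book. Thank you ^_^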
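
For reference, a minimal sketch of the epsilon-greedy selection described above. This is not the repository's code: `get_action`, `exploration_rate`, and the use of `numpy.random.default_rng` (in place of `numpy.random.rand()`) are assumptions made for illustration only.

```python
import numpy as np


def get_action(q_values, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return int(rng.integers(len(q_values)))
    # Exploit: pick the action with the largest Q-value.
    return int(np.argmax(q_values))


def exploration_rate(epsilon, trials=10_000, seed=0):
    """Empirical fraction of draws that fall into the exploration branch."""
    rng = np.random.default_rng(seed)
    return float(np.mean(rng.random(trials) < epsilon))
```

Calling `exploration_rate(0.9)` should come out close to 0.9 and `exploration_rate(0.1)` close to 0.1, which is the 1:9 versus 9:1 split between greedy and random actions discussed here.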