I used DDPG+HER to train an agent on FetchReach-v1. After 5 hours of training, its accuracy is around 90% to 95%.
column = [" Episode_no ", " reward for the current episode ", " average reward ", " Max_reward "]
The average is taken over the last 10 observations (episode rewards).
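As a rough illustration of how such rows could be produced, here is a minimal sketch. The function name `log_rows` is hypothetical and not taken from the training code. It assumes that episode numbering starts at 1, that Max_reward means the best episode reward seen so far, and that the average covers fewer than 10 values during the first few episodes.

```python
from collections import deque


def log_rows(episode_rewards, window=10):
    """Build [episode_no, reward, average_reward, max_reward] rows.

    Hypothetical helper: the window size of 10 comes from the text above,
    everything else is an assumption made for illustration.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` rewards
    max_reward = float("-inf")     # assumed: running maximum over all episodes
    rows = []
    for episode_no, reward in enumerate(episode_rewards, start=1):
        recent.append(reward)
        max_reward = max(max_reward, reward)
        average = sum(recent) / len(recent)  # mean over at most `window` rewards
        rows.append([episode_no, reward, average, max_reward])
    return rows


if __name__ == "__main__":
    for row in log_rows([-200, -100, 0]):
        print(row)
```

A `deque` with `maxlen` drops the oldest reward automatically, so the average always reflects only the most recent episodes.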
Every episode consists of 200 actions to reach the goal. At each step you receive a reward of 0 if you have reached the goal, and -1 otherwise. So, the maximum and minimum rewards for an episode are 0 and -200.
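The episode reward range can be checked with a small sketch of a sparse reward, assuming a per-step reward of 0 at the goal and -1 elsewhere, which is what the stated range of 0 to -200 implies. The helper names and the 0.05 distance threshold are assumptions for illustration, not values taken from the training code.

```python
def sparse_reward(distance_to_goal, threshold=0.05):
    """Return 0.0 when within `threshold` of the goal, otherwise -1.0.

    The threshold value is an assumption, not read from the environment.
    """
    return 0.0 if distance_to_goal <= threshold else -1.0


def episode_return(distances, threshold=0.05):
    """Sum the per-step sparse rewards over one episode."""
    return sum(sparse_reward(d, threshold) for d in distances)


if __name__ == "__main__":
    steps = 200
    print(episode_return([1.0] * steps))               # never reaches the goal
    print(episode_return([0.0] * steps))               # at the goal on every step
    print(episode_return([1.0] * 50 + [0.0] * 150))    # reaches the goal after 50 steps
```

Under these assumptions, an episode reward closer to 0 means the gripper reached the goal earlier in the episode.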