I also have a question about the original car rental problem: what if location A has only 2 cars, but a random policy asks to move 5 cars from location A to B?
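One possible way to handle this, offered only as a sketch and not as what the repository does, is to clip the requested move to what the source location actually holds. The function name clip_action and the sign convention (positive moves A to B) are assumptions for illustration; the limits of 5 cars moved per night and 20 cars per location come from the book's problem statement.

```python
MAX_MOVE = 5   # at most five cars moved overnight (book's problem statement)
MAX_CARS = 20  # at most twenty cars per location (book's problem statement)


def clip_action(cars_a, cars_b, action):
    """Return a feasible version of `action` for the state (cars_a, cars_b).

    Assumed convention: action > 0 moves cars from A to B,
    action < 0 moves cars from B to A.
    """
    # Enforce the overnight limit on the number of cars moved.
    action = max(-MAX_MOVE, min(MAX_MOVE, action))
    # A location cannot send more cars than it currently holds.
    action = max(-cars_b, min(cars_a, action))
    return action


# Location A has 2 cars and the policy asks to move 5 from A to B.
feasible = clip_action(cars_a=2, cars_b=10, action=5)
```

An alternative with the same effect is to exclude infeasible actions from the action set of each state, so that the policy never proposes them in the first place.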
In your Python file Ex4.7-A.py, line 51, I think it should read
temp[((value_A_Changed, value_B_Changed),reward)] = temp.get( ((value_A_Changed, value_B_Changed),reward), 0 )
instead of
temp[((value_A_Changed, value_B_Changed),reward)] = temp.get( (value_A_Changed, value_B_Changed), 0 )
The get call in the second line above will always return 0 because the key (value_A_Changed, value_B_Changed) does not exist in temp; the keys of temp have the form ((value_A_Changed, value_B_Changed), reward).
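The difference can be seen in a small standalone sketch. The concrete values and the accumulated term prob are assumptions for illustration (the line quoted above does not show what is added to the looked-up value); only the two key shapes are taken from the issue.

```python
# Hypothetical stand-ins for the variables used on line 51.
value_A_Changed, value_B_Changed, reward = 3, 4, 10
prob = 0.25  # assumed probability mass accumulated on each visit

key = ((value_A_Changed, value_B_Changed), reward)

# Mismatched lookup: the 2-tuple is never a key of temp,
# so get() falls back to 0 and the entry is overwritten each time.
temp = {}
for _ in range(3):
    temp[key] = temp.get((value_A_Changed, value_B_Changed), 0) + prob
mismatched_total = temp[key]

# Matching lookup: the same key is read and written,
# so the entry accumulates across visits.
temp = {}
for _ in range(3):
    temp[key] = temp.get(key, 0) + prob
matched_total = temp[key]

print(mismatched_total, matched_total)
```

With the mismatched key the stored value stays at a single increment, while the matching key sums all three.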
I tried rerunning it with this change and could not reproduce the book's answer. I am attaching the optimal policy map that I got.