A different implementation for car rental problem #166

kevinnewgame · 2024-09-13T05:22:23Z

Storing the probability models in NumPy arrays makes the DP algorithm very efficient, with no need for manual parallelization.
The running time is short: it takes about 2 seconds to generate the plots in the book.
The idea is to separate the reward and value terms in the Bellman equation and compute their expectations separately. Since the model is known in a DP problem, the expected rewards and state-transition probabilities can be computed and stored before the policy updates begin.
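As a rough illustration of the idea (not the code in this PR), here is a minimal sketch. It assumes the usual parameters of the car rental example (20 cars max per location, up to 5 cars moved, $10 per rental, $2 per moved car, Poisson rates 3/3 and 4/2, discount 0.9) and that cars above the cap are discarded. The names `location_model`, `action_values`, and `policy_iteration` are hypothetical. Because the two locations are independent, each gets its own expected-reward vector and transition matrix, and the expected next-state value reduces to two matrix products.

```python
import numpy as np
from math import exp, factorial

MAX_CARS, MAX_MOVE = 20, 5          # assumed problem constants
RENT_REWARD, MOVE_COST, GAMMA = 10.0, 2.0, 0.9

def poisson_pmf(lam, n_max):
    """Poisson pmf on 0..n_max with the tail mass folded into n_max."""
    p = np.array([exp(-lam) * lam**k / factorial(k) for k in range(n_max + 1)])
    p[-1] += 1.0 - p.sum()
    return p

def location_model(lam_rent, lam_return):
    """Expected rental reward r[n] and transition matrix P[n, n_next]
    for one location that starts the day with n cars."""
    p_rent = poisson_pmf(lam_rent, MAX_CARS)
    p_ret = poisson_pmf(lam_return, MAX_CARS)
    r = np.zeros(MAX_CARS + 1)
    P = np.zeros((MAX_CARS + 1, MAX_CARS + 1))
    for n in range(MAX_CARS + 1):
        for req, pq in enumerate(p_rent):
            rented = min(req, n)
            r[n] += pq * RENT_REWARD * rented
            for ret, pr in enumerate(p_ret):
                P[n, min(n - rented + ret, MAX_CARS)] += pq * pr
    return r, P

def action_values(V, models):
    """q[k, i, j] for every action; infeasible moves get -inf."""
    (r1, P1), (r2, P2) = models
    # G[m1, m2]: expected reward plus discounted expected next value,
    # given m1 and m2 cars after the overnight move.
    G = r1[:, None] + r2[None, :] + GAMMA * (P1 @ V @ P2.T)
    i, j = np.indices(V.shape)
    actions = np.arange(-MAX_MOVE, MAX_MOVE + 1)  # a > 0: first -> second
    q = np.empty((len(actions),) + V.shape)
    for k, a in enumerate(actions):
        valid = (i - a >= 0) & (j + a >= 0)       # enough cars to move
        m1 = np.clip(i - a, 0, MAX_CARS)
        m2 = np.clip(j + a, 0, MAX_CARS)
        q[k] = np.where(valid, G[m1, m2] - MOVE_COST * abs(a), -np.inf)
    return actions, q

def policy_iteration(theta=1e-4, max_rounds=50):
    # The models are built once, before any policy update.
    models = (location_model(3, 3), location_model(4, 2))
    V = np.zeros((MAX_CARS + 1, MAX_CARS + 1))
    policy = np.zeros(V.shape, dtype=int)
    for _ in range(max_rounds):
        while True:  # policy evaluation
            _, q = action_values(V, models)
            V_new = np.take_along_axis(q, (policy + MAX_MOVE)[None], axis=0)[0]
            delta = np.abs(V_new - V).max()
            V = V_new
            if delta < theta:
                break
        actions, q = action_values(V, models)     # policy improvement
        new_policy = actions[q.argmax(axis=0)]
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

if __name__ == "__main__":
    policy, V = policy_iteration()
    print(policy)
```

The triple loop in `location_model` runs only once per location, so the per-sweep cost is dominated by two 21×21 matrix products and some array indexing. This sketch evaluates all actions in every sweep for simplicity; a tuned version would evaluate only the current policy's action during policy evaluation.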

A different implementation for car rental problem

c0b1628
