
Overspecific types in built-in policies and algorithms #1066

Open
dharux opened this issue Apr 10, 2024 · 2 comments

Comments

dharux (Contributor) commented Apr 10, 2024

The types in built-in policies and algorithms like `QBasedPolicy` and `TDLearner` are overly specific, which prevents users from extending the existing code to new algorithms and instead forces them to rewrite large chunks of it.

For example, `QBasedPolicy` is defined as `struct QBasedPolicy{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy`, and all its methods are typed the same way. As a result, I cannot write a new learner and use it in a `QBasedPolicy`, even though all of its methods appear to be fully general.

Another example is `TDLearner`, which is defined as `Base.@kwdef mutable struct TDLearner{M,A} <: AbstractLearner where {A<:TabularApproximator,M<:Symbol}`. However, its constructor only allows `M = :SARS`. This forces me to rewrite the whole struct if I want to implement a new TD learning algorithm or use a different kind of approximation (e.g. linear).
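To illustrate the restriction, here is a minimal self-contained sketch of the pattern described above. The names mirror the package's (`TDLearner`, `TabularApproximator`, `AbstractLearner`), but the definitions are simplified stand-ins, not the package's actual code:

```julia
abstract type AbstractLearner end

# Simplified stand-in for the package's tabular Q-value approximator.
struct TabularApproximator
    table::Matrix{Float64}
end

# The method is carried as a value type parameter M (a Symbol).
mutable struct TDLearner{M,A} <: AbstractLearner
    approximator::A
    γ::Float64
end

# The constructor hard-codes the supported method to :SARS,
# so any new TD variant requires rewriting the whole struct.
function TDLearner(approximator::A; method::Symbol=:SARS, γ::Float64=1.0) where {A<:TabularApproximator}
    method == :SARS || throw(ArgumentError("only :SARS is currently supported"))
    return TDLearner{method,A}(approximator, γ)
end

learner = TDLearner(TabularApproximator(zeros(2, 3)))
# TDLearner(TabularApproximator(zeros(2, 3)); method=:ExpectedSARSA)  # throws ArgumentError
```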

In my opinion, these restrictions should be removed and replaced with general types such as `AbstractLearner`.
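Concretely, loosening the learner parameter would look something like the following sketch. The abstract types are re-declared here as stand-ins so the snippet is self-contained; `MyCustomLearner` and `GreedyExplorer` are hypothetical user-defined types, not part of the package:

```julia
abstract type AbstractPolicy end
abstract type AbstractLearner end
abstract type AbstractExplorer end

# Proposed loosening: L<:AbstractLearner instead of L<:TDLearner.
struct QBasedPolicy{L<:AbstractLearner,E<:AbstractExplorer} <: AbstractPolicy
    learner::L
    explorer::E
end

# A user-defined learner now plugs in without rewriting the policy code.
struct MyCustomLearner <: AbstractLearner end
struct GreedyExplorer <: AbstractExplorer end

policy = QBasedPolicy(MyCustomLearner(), GreedyExplorer())
```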

jeremiahpslewis (Member) commented Apr 10, 2024

Hi @dharux,

Thanks for your feedback and clear explanation of your concerns.

Two points:

  1. Looking at the code for `q_based_policy.jl`, there is no reason it can only support `TDLearner`, so please feel free to submit an appropriate pull request and I’ll review it. :)

  2. `TDLearner` is not an abstract type, and ensuring it supports arbitrary approximators is beyond scope. That said, earlier versions of `TDLearner` supported algorithms beyond `:SARS` as well as `LinearApproximator`s here and here, and it would be great to have these re-included in this package ecosystem. The best place at present would be the ReinforcementLearningFarm package (part of this repo). If you want to pursue adding them back, let me know and I can give you some pointers.

dharux (Contributor, Author) commented Apr 17, 2024

Thanks for the reply.

  1. I have submitted a pull request for this issue.

  2. I would be interested in adding the algorithms back into the package. Some guidelines would definitely be useful.
