The type constraints on built-in policies and algorithms such as `QBasedPolicy` and `TDLearner` are overly specific and prevent users from extending the existing code to new algorithms; instead, they force users to rewrite large chunks of code.
For example, `QBasedPolicy` is defined as `struct QBasedPolicy{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy`, and its methods carry the same constraint. As a result, I cannot write a new learner and use it in a `QBasedPolicy`, even though all of its methods appear to be completely general.
Another example is `TDLearner`, which is defined as `Base.@kwdef mutable struct TDLearner{M,A} <: AbstractLearner where {A<:TabularApproximator,M<:Symbol}`. However, its constructor only allows `M = :SARS`. This forces me to rewrite the whole struct if I want to implement a new TD learning algorithm or use a different kind of approximation (e.g. linear).
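To make this concrete, here is a self-contained mirror of the pattern described above (simplified stand-in types, not the package's actual code):

```julia
# Simplified stand-ins for the package's types, just to illustrate the pattern.
abstract type AbstractLearner end
struct TabularApproximator end

mutable struct TDLearner{M,A} <: AbstractLearner
    approximator::A
end

# The struct itself is generic, but the constructor hard-codes the method,
# so e.g. a :SARSA or :ExpectedSARSA learner cannot reuse any of this code.
function TDLearner(approximator::TabularApproximator, method::Symbol)
    method === :SARS || throw(ArgumentError("only :SARS is supported"))
    return TDLearner{method,typeof(approximator)}(approximator)
end

TDLearner(TabularApproximator(), :SARS)    # works
# TDLearner(TabularApproximator(), :SARSA) # throws ArgumentError
```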
In my opinion, these restrictions should be removed and replaced with general bounds such as `AbstractLearner`.
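As a rough sketch of what I mean (again with simplified stand-in types; the field names are only my assumption about the current layout), relaxing the learner bound is enough to let user-defined learners plug in:

```julia
abstract type AbstractPolicy end
abstract type AbstractLearner end
abstract type AbstractExplorer end

# Today:    struct QBasedPolicy{L<:TDLearner,E<:AbstractExplorer} <: AbstractPolicy
# Proposed: only require the AbstractLearner interface.
struct QBasedPolicy{L<:AbstractLearner,E<:AbstractExplorer} <: AbstractPolicy
    learner::L
    explorer::E
end

# A hypothetical user-defined learner and explorer now plug in directly,
# without rewriting QBasedPolicy or any of its methods.
struct MyLearner <: AbstractLearner end
struct MyExplorer <: AbstractExplorer end

policy = QBasedPolicy(MyLearner(), MyExplorer())
```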
Thanks for your feedback and clear explanation of your concerns.
Two points:
1. Looking at the code for `q_based_policy.jl`, there is no reason it can only support `TDLearner`, so please feel free to submit an appropriate pull request and I'll review it. :)
2. `TDLearner` is not an abstract type, and ensuring it supports arbitrary approximations is beyond its scope. That said, earlier versions of `TDLearner` supported algorithms beyond `:SARS` and supported `LinearApproximators` here and here; it would be great to have these re-included in this package ecosystem, and at present the best place would be the ReinforcementLearningFarm package (part of this repo). If you want to pursue adding them back, let me know and I can give you some pointers; a rough sketch of the general idea is below.
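For context, a minimal sketch of what a linear Q-value approximator along those lines could look like (all names here are hypothetical and only illustrate the idea, not the old package API):

```julia
using LinearAlgebra

# Hypothetical linear approximator: Q(s, a) ≈ wₐ ⋅ φ(s),
# with one weight vector per action stored as a row of `weights`.
struct LinearQApproximator
    weights::Matrix{Float64}
end

q_value(app::LinearQApproximator, features::AbstractVector, action::Int) =
    dot(view(app.weights, action, :), features)

# Semi-gradient TD update of the chosen action's weight vector.
function update!(app::LinearQApproximator, features::AbstractVector, action::Int, td_error::Real; η = 0.1)
    app.weights[action, :] .+= η .* td_error .* features
    return app
end

app = LinearQApproximator(zeros(2, 3))   # 2 actions, 3 features
update!(app, [1.0, 0.0, 0.5], 1, 0.8)    # apply a TD error of 0.8
q_value(app, [1.0, 0.0, 0.5], 1)         # ≈ 0.1 (with η = 0.1)
```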