
Feature Request: Provide a confidence measure associated with action #60

Open
aidancc opened this issue Aug 3, 2016 · 1 comment
aidancc commented Aug 3, 2016

For either policy or ranking usage, it would be helpful to know the confidence associated with any particular suggested action. For example, it would be helpful to be able to distinguish between the following two cases:
a) the probability of achieving maximal reward from the first-ranked action is much greater than (>>) that from the second-ranked action;
b) the probability of achieving maximal reward from the first-ranked action is approximately equal to (=~) that from the second-ranked action.
Furthermore, it would be useful to know whether a particular suggested action is skewed toward exploration, and/or to have the ability to prevent this on a per-request basis.
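To make the two cases concrete, here is a minimal sketch (not part of any existing API; the function name and inputs are hypothetical) of one possible confidence proxy: the gap between the top-two estimated per-action rewards. A large gap corresponds to case (a), a small gap to case (b).

```python
def reward_margin(estimated_rewards):
    """Gap between the top-two estimated rewards, as a rough
    confidence proxy for the first-ranked action.

    `estimated_rewards` is a hypothetical list of per-action
    reward estimates; a large gap suggests case (a), a small
    gap suggests case (b).
    """
    ranked = sorted(estimated_rewards, reverse=True)
    return ranked[0] - ranked[1]

# Case (a): the first-ranked action clearly dominates.
print(round(reward_margin([0.9, 0.2, 0.1]), 2))   # 0.7
# Case (b): the top two actions are nearly tied.
print(round(reward_margin([0.51, 0.50, 0.1]), 2))  # 0.01
```

This only illustrates the distinction being requested; how reliable such a margin is in practice depends on the exploration strategy, as discussed below.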

@JohnLangford
Contributor

This is tricky, because you are asking for high precision at the frontier of what can be estimated. There are some exploration strategies (e.g., epsilon-greedy) where we could estimate this reasonably well. But if you care about this, then you should already be using one of the advanced exploration strategies (e.g., bagging or cover). When you are using these advanced strategies, there generally isn't any significant probability of exploring obviously suboptimal actions (as determined by the algorithm), so everything is already in case (b).

-John
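For intuition on why an ensemble strategy like bagging gives a handle on uncertainty, here is a hedged sketch (the function and its inputs are illustrative, not an actual Vowpal Wabbit interface): with bagging, each bagged policy votes for an action, and the fraction of policies agreeing on the plurality action can serve as a crude confidence signal.

```python
from collections import Counter

def vote_confidence(policy_choices):
    """Plurality action and the fraction of bagged policies
    that chose it.

    `policy_choices` is a hypothetical list of action indices,
    one per bagged policy. Near-unanimous agreement suggests
    low uncertainty (case (a)); a split vote suggests the
    top actions are nearly tied (case (b)).
    """
    counts = Counter(policy_choices)
    action, votes = counts.most_common(1)[0]
    return action, votes / len(policy_choices)

# Five bagged policies, four of which pick action 2.
print(vote_confidence([2, 2, 2, 2, 0]))  # (2, 0.8)
```

Note this matches the point above: under such strategies the ensemble rarely spreads significant probability onto clearly suboptimal actions, so any residual disagreement already reflects near-tied alternatives.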

