Eigen Memory Trees (EMT)

An Eigen Memory Tree (EMT) is a memory-based learning reduction that works on examples with simple label types.

An EMT remembers previous training examples and uses these memories to predict labels for future examples.

The original EMT paper can be found here.

Experiments

Python-based contextual bandit experiments using EMT can be found at: https://github.com/mrucker/emt_experiments. These experiments build a contextual bandit learner from an EMT by remembering the reward received when taking an action in a given context. At prediction time the EMT is queried for every available context-action pair, and the context-action pair that the EMT predicts will receive the highest reward becomes the greedy action.
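
As a minimal sketch of this wrapper (not the emt_experiments code), the following Python assumes a VW build that includes EMT and the vowpalwabbit Python bindings; the namespaces, feature encodings, and tree size are illustrative:

from vowpalwabbit import Workspace

vw = Workspace("--eigen_memory_tree --emt_tree 2000 --quiet")

def to_example(context, action):
    # One simple-label example per context-action pair.
    return f"|c {context} |a {action}"

def greedy_action(context, actions):
    # Query the EMT for every available context-action pair and take
    # the pair with the highest predicted reward.
    return max(actions, key=lambda a: vw.predict(to_example(context, a)))

def learn(context, action, reward):
    # Remember the reward received for taking this action in this context.
    vw.learn(f"{reward} {to_example(context, action)}")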

Usage

vw --eigen_memory_tree --emt_tree <max memories> --emt_leaf <max leaf size> --emt_scorer <scorer type> --emt_router <router type>

The first argument, <max memories>, sets the number of memories the tree may hold before pruning begins. It is optional, and its default value of 0 means the EMT will never forget memories. The second argument, <max leaf size>, sets how many memories a leaf may hold before it is split into two smaller leaves. Leaf size is a trade-off parameter: larger leaves generally give more accurate predictions at the cost of more compute time. The last two flags, <scorer type> and <router type>, control how the tree is searched.
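
For example, the following illustrative invocation caps the tree at 10,000 memories and splits leaves once they hold 100 memories, leaving the scorer and router at their defaults (train.dat is a placeholder data file):

vw --eigen_memory_tree --emt_tree 10000 --emt_leaf 100 -d train.dat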

Label Type

EMT works with simple label types:

[Label] [Importance] [Base]

  • [Label] should be a scalar value.
  • [Importance] determines how large an update step to take.
  • [Base] is unused by EMT. EMT calculates its own base internally.
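
For example, the following two training examples pair scalar labels with an importance of 1.0 (the namespace and feature names are illustrative):

1 1.0 |user age_30 likes_sports
0 1.0 |user age_55 likes_news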

Small Performance Suggestions

Because EMT reduces to a regression learner, it also works with several existing VW flags. We have found that the following flags can improve performance.

--min_prediction 0 --max_prediction 3 --interactions <polynomial namespace interactions>

Because of the mechanism EMT uses to compare memories, bounding the min and max prediction of the underlying reduction learner can greatly improve performance. Additionally, adding a few simple non-linear namespace interactions can improve the performance of the scorer reduction learner. None of these flags is necessary, but if you are not getting the desired results from EMT they are easy to test.
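
Putting these together, an illustrative run combining the suggestions above might look like the following, where cc requests quadratic interactions of the placeholder namespace c with itself:

vw --eigen_memory_tree --emt_tree 10000 --emt_leaf 100 --min_prediction 0 --max_prediction 3 --interactions cc -d train.dat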