
Request for a feature. Please introduce the TrialStatus.PAUSED #862

Closed
alxfed opened this issue Mar 22, 2022 · 5 comments

Assignees: lena-kashtelyan
Labels: wishlist (Long-term wishlist feature requests)

Comments


alxfed commented Mar 22, 2022

In many cases, when trial data are collected from an online log (or a sequence of events arriving in real time) and metrics are computed over periods of different duration (like Retention D1 and Conversion D2, which need 2 and 3 days respectively to calculate), it is advantageous to collect data for the same daily cohorts only. This can be done by starting a trial for a day, pausing it until the metric with the longest duration can be calculated, and then, after the intermediate evaluation, restarting the same trial.
It could also be used for periodically collecting small samples from a large stream of data, say once a day for 5 minutes, and then assessing the resulting data together, as a single trial.

Subclassing works, of course, but it seems logical to have this in the framework itself.
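For illustration, a minimal sketch of that subclassing approach; the class name and the paused flag are hypothetical conventions, not part of Ax:

```python
from ax.core.trial import Trial


class PausableTrial(Trial):
    """Hypothetical Trial subclass approximating a PAUSED state.

    Ax has no such status; the flag lives in the trial's free-form
    _properties dict and is only honored by the user's own loop.
    """

    def pause(self) -> None:
        self._properties["paused"] = True

    def resume(self) -> None:
        self._properties["paused"] = False

    @property
    def is_paused(self) -> bool:
        return self._properties.get("paused", False)
```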

danielcohenlive commented:

Hi @alxfed, thanks for the feedback! I think it sounds reasonable to add TrialStatus.PAUSED, but just to be clear, this is only a means to the end of having a row of data per day (or some other time period) per arm, right? Depending on your infrastructure, pausing may not be necessary for this. It might be possible to accomplish it by creating a custom metric.
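A sketch of what such a custom metric could look like, assuming the Metric API of Ax releases from around this time (fetch_trial_data returning a Data object); query_daily_results is a hypothetical stand-in for your own event-log query:

```python
import pandas as pd
from ax.core.data import Data
from ax.core.metric import Metric


def query_daily_results(arm_name):
    """Hypothetical helper: returns {pd.Timestamp: (mean, sem)} per day
    for the given arm from the user's own event log."""
    raise NotImplementedError


class DailyCohortMetric(Metric):
    """Sketch of a metric emitting one row per day per arm."""

    def fetch_trial_data(self, trial, **kwargs):
        records = []
        for arm_name in trial.arms_by_name:
            for day, (mean, sem) in query_daily_results(arm_name).items():
                records.append({
                    "arm_name": arm_name,
                    "metric_name": self.name,
                    "mean": mean,
                    "sem": sem,
                    "trial_index": trial.index,
                    # Data supports optional start_time/end_time columns,
                    # giving one row per (arm, day) window.
                    "start_time": day,
                    "end_time": day,
                })
        return Data(df=pd.DataFrame.from_records(records))
```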

Or, if it's not about having multiple rows but just about being able to query specific time windows: fetch_trial_data() can accept kwargs, and if those kwargs are passed to fetch_data(), they will be plumbed down to fetch_trial_data().
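For example (the kwarg names here are made up; the point is only that fetch_data() forwards them):

```python
# Hypothetical time-window kwargs: fetch_data() passes extra kwargs through
# to each metric's fetch_trial_data(), which can use them to bound its query.
data = experiment.fetch_data(window_start="2022-03-21", window_end="2022-03-22")
```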

Can you explain your use case a little more, and why pausing is what allows you to collect data for a specific time window? This sounds like a field experiment where data is collected based on some sort of user interactions or other non-deterministic events? If so, I would think the individual results could be recorded with a timestamp and separated by time window that way.


alxfed commented Mar 22, 2022

The simplest way right now is to assign trial._properties = {'state': 'PAUSED'}, if you want my take on this subject. It saves and restores beautifully too.
...but I was just saying that you equipped the preparatory stage of the experiment with very useful states, CANDIDATE and STAGED (and multiple kinds of trial completion), but didn't do a similar job for the actual run, in anticipation that somebody would be using your framework for real-time online experiments and would need this. That's all. I'm not seeking advice. Thank you.
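For reference, a minimal sketch of that workaround, assuming Ax's JSON storage helpers; the 'state' key is a user convention that Ax itself ignores:

```python
from ax.core.experiment import Experiment
from ax.core.parameter import ParameterType, RangeParameter
from ax.core.search_space import SearchSpace
from ax.storage.json_store.load import load_experiment
from ax.storage.json_store.save import save_experiment

experiment = Experiment(
    name="pausing_demo",
    search_space=SearchSpace(parameters=[
        RangeParameter(name="x", parameter_type=ParameterType.FLOAT,
                       lower=0.0, upper=1.0),
    ]),
)
trial = experiment.new_trial()

trial._properties["state"] = "PAUSED"  # user convention; no Ax semantics
save_experiment(experiment, "experiment.json")

# Later, possibly in another process:
experiment = load_experiment("experiment.json")
if experiment.trials[0]._properties.get("state") == "PAUSED":
    pass  # skip data collection / evaluation until "resumed"
```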

Balandat (Contributor) commented:

> This can be done by starting a trial for a day, pausing it until the metric with the longest duration can be calculated, and then, after the intermediate evaluation, restarting the same trial.

Is there a reason you want to restart the same trial rather than run another trial of the same arm? In Ax, the concept of a particular configuration to evaluate is linked to an arm, rather than a trial. In particular, the same arm can be evaluated in different trials. Doing this could simplify your setup. Do you think this would work for you?
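A minimal sketch of that pattern, assuming an experiment whose first trial holds the arm of interest and which has a runner attached:

```python
# Evaluate the same arm again in a fresh trial, one per time window,
# instead of pausing and restarting the original trial.
arm = experiment.trials[0].arm        # the configuration under test
new_trial = experiment.new_trial()    # a new trial for the next window
new_trial.add_arm(arm)                # same arm, new trial
new_trial.run()                       # deploys via the experiment's runner
```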


alxfed commented Mar 23, 2022

Max, yes, there is a reason. My optimization config / scalarized objective has multiple metrics (with different durations and individual weights). I'm optimizing this aggregate as a whole, and I assess the current state of the trial by the value of this scalar (in those intermediate evaluations). It takes some time for a longer metric to become available, but if the metrics cover different numbers of cohorts, there will be a bias; on top of that, the stream of data is not stationary, so the later cohorts counted into the shorter metrics spoil the covariations that exist (and are visible) in the first cohort alone.
Yes, I understand your vision of tying everything to an arm/parameterization. What you are suggesting makes sense, but the non-stationarity (forgive my 'French') will again spoil the resulting distribution for this 'same' arm, and will surely kill the (multiple, not just pairwise) covariations.
Sorry for bothering you with this, but I'm sure other people struggling with real-time experiments will run into these problems too.

@lena-kashtelyan self-assigned this Apr 1, 2022
@lena-kashtelyan added the wishlist (Long-term wishlist feature requests) label Apr 1, 2022
lena-kashtelyan (Contributor) commented:

Hi @alxfed, thank you for the useful suggestion! We'll put it on our wishlist for now, but in the meantime it seems you have a simple workaround of writing to trial properties, so that's great.
