Use PyMC3 3.9 `dims`, `Model(coords)` arguments from the `Model` context manager instead of manually adding `dims` for `az.from_pymc3`? #1250

StanczakDominik · 2020-06-19T15:04:25Z

Short Description

I'm tinkering with the awesome new dims functionality in PyMC3, but when I try to carry those dims over into InferenceData, I end up having to write very verbose code...

Code example

I have this example of Starcraft data analysis (shameless self-plug :D):

Setup, for reproducibility's sake

import pymc3 as pm
import arviz as az
import pandas as pd

def MMR_winrate(diff):
    return 1 / (1 + 10**(-diff/880))

df = pd.read_csv("https://raw.githubusercontent.com/StanczakDominik/stanczakdominik.github.io/src/files/replays.csv", index_col=0)
df['time_played_at'] = pd.to_datetime(df.time_played_at)
df = df.sort_values('time_played_at')
for column in ['race', 'enemy_race', 'map_name']:
    df[column] = pd.Categorical(df[column])
df['enemy_mmr'] = df['mmr'] - df['mmr_diff']
df['expected_winrate'] = MMR_winrate(df.mmr_diff)
data = df[(df.mmr > 0) & (df.enemy_mmr > 0) & (df.race == "Protoss") & (df.duration > 10)]

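# Integer codes for the enemy race; the "race" coord below follows the same order
race_encoding = {"Terran": 0, "Protoss": 1, "Zerg": 2}

coords = {
    "replay": data.index,
    # list(race_encoding) keeps the labels aligned with the codes used for indexing,
    # which unique() (order of appearance) does not guarantee
    "race": list(race_encoding),
}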

with pm.Model(coords=coords) as data_model:
    wins_losses = pm.Data("win_loss", data.win, dims='replay')
    enemy_races = pm.Data("enemy_race", data.enemy_race.map(race_encoding).astype(int), dims='replay')
    enemy_mmr = pm.Data("enemy_mmr", data.enemy_mmr, dims='replay')
    
    mmr_μ_matchup = pm.Normal('μ', 4000, 300, dims='race')
    mmr_σ_matchup = pm.HalfNormal('σ', 300, dims='race')
    mmr_σ_norm = pm.Normal('helper', 0, 1, dims='replay')
    
    mmr = pm.Deterministic('MMR', mmr_μ_matchup[enemy_races] + mmr_σ_matchup[enemy_races] * mmr_σ_norm)
    
    diffs = pm.Deterministic('MMR_diff', mmr - enemy_mmr)
    p = pm.Deterministic('winrate', MMR_winrate(diffs))
    wl = pm.Bernoulli('win', p=p, observed=wins_losses)

    trace = pm.sample()
    output = az.from_pymc3(
        trace=trace,
        coords=coords,
        # CONCERNING PART STARTS HERE
        dims={
            "win": ["replay"],
            "μ": ["race"],
            "σ": ["race"],
            "MMR": ["replay"],
            "MMR_diff": ["replay"],
            "helper": ["replay"],
            "winrate": ["replay"],
            "win_loss": ["replay"],
            "enemy_race": ["replay"],
            "enemy_mmr": ["replay"],
        },
    )

If I don't add the dims = {CONCERNINGLY LONG AND REPETITIVE DICTIONARY} part, each variable ends up with its own separate auto-generated dimension, say, μ_dim_0.
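As a stopgap while the dims have to be passed by hand, the repetition can at least be reduced by listing the variables once per dimension and inverting that mapping. This is only a sketch: `dims_from_groups` is a hypothetical helper, not part of ArviZ or PyMC3, and it assumes each variable name is listed under the dimensions it actually has, in axis order.

```python
def dims_from_groups(vars_by_dim):
    """Invert {dim: [variable names]} into {variable name: [dims]}.

    Hypothetical helper for illustration; not an ArviZ or PyMC3 function.
    """
    dims = {}
    for dim, var_names in vars_by_dim.items():
        for var_name in var_names:
            # A variable listed under several dims collects them in listing order
            dims.setdefault(var_name, []).append(dim)
    return dims


dims = dims_from_groups({
    "replay": ["win", "MMR", "MMR_diff", "helper", "winrate",
               "win_loss", "enemy_race", "enemy_mmr"],
    "race": ["μ", "σ"],
})
```

The resulting dictionary has the same shape as the hand-written one, so it can be passed as `dims=dims` next to `coords=coords` in the `az.from_pymc3` call.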

Since az.from_pymc3 already pulls the model from the context manager, might it not also be able to pull dims from there? I see data_model has a coords dict, but not a dims one - if one were to be added in PyMC3, would it make sense to grab it and use it here?

I'd be interested in contributing that (haven't done a two-repo PR before! 😆), but would appreciate some initial pointers on whether all of that is reasonable :)

I'm running ArviZ 0.8.3, with PyMC3 3.9.1.

I'd expect this would semi-solve the corresponding pm.sample(return_inferencedata=True) as well.

StanczakDominik · 2020-06-19T15:45:00Z

I think this might be related to pymc-devs/pymc#3953?

OriolAbril · 2020-06-19T16:40:21Z

Hi, thanks for the report!

This issue should have been fixed in #1240; it's definitely a bug. Can you confirm that everything works on the latest ArviZ dev version? I have been testing it on a couple of models, but I have probably still missed some edge cases :)

StanczakDominik · 2020-06-19T17:06:01Z

Always nice to see a negative bugfix ETA 😆 That completely fixed it. Thanks a ton!

I think it might be a good idea to get 0.8.4 out, in that case, for compatibility with the new sample functionality - that works too! :)

StanczakDominik changed the title ~~Use PyMC3 3.9 dims, Model(coords) arguments from the Model context manager?~~ Use PyMC3 3.9 dims, Model(coords) arguments from the Model context manager instead of manually adding dims for az.from_pymc3? Jun 19, 2020

StanczakDominik closed this as completed Jun 19, 2020
