Use PyMC3 3.9 dims, Model(coords) arguments from the Model context manager instead of manually adding dims for az.from_pymc3? #1250

Closed
StanczakDominik opened this issue Jun 19, 2020 · 3 comments

Comments

@StanczakDominik
Contributor

StanczakDominik commented Jun 19, 2020

Short Description

I'm tinkering with the awesome new dims functionality in PyMC3, but when trying to use it in InferenceData, I run into having to write very verbose code...

Code example

I have this example of Starcraft data analysis (shameless self-plug :D):

Setup, for reproducibility's sake
import pymc3 as pm
import arviz as az
import pandas as pd

def MMR_winrate(diff):
    return 1 / (1 + 10**(-diff/880))

df = pd.read_csv("https://raw.githubusercontent.com/StanczakDominik/stanczakdominik.github.io/src/files/replays.csv", index_col=0)
df['time_played_at'] = pd.to_datetime(df.time_played_at)
df = df.sort_values('time_played_at')
for column in ['race', 'enemy_race', 'map_name']:
    df[column] = pd.Categorical(df[column])
df['enemy_mmr'] = df['mmr'] - df['mmr_diff']
df['expected_winrate'] = MMR_winrate(df.mmr_diff)
data = df[(df.mmr > 0) & (df.enemy_mmr > 0) & (df.race == "Protoss") & (df.duration > 10)]
coords = {
    "replay": data.index,
    "race": data.enemy_race.unique().astype("str"),
}

race_encoding = {"Terran": 0,
                 "Protoss": 1,
                 "Zerg": 2}

with pm.Model(coords=coords) as data_model:
    wins_losses = pm.Data("win_loss", data.win, dims='replay')
    enemy_races = pm.Data("enemy_race", data.enemy_race.map(race_encoding).astype(int), dims='replay')
    enemy_mmr = pm.Data("enemy_mmr", data.enemy_mmr, dims='replay')
    
    mmr_μ_matchup = pm.Normal('μ', 4000, 300, dims='race')
    mmr_σ_matchup = pm.HalfNormal('σ', 300, dims='race')
    mmr_σ_norm = pm.Normal('helper', 0, 1, dims='replay')
    
    mmr = pm.Deterministic('MMR', mmr_μ_matchup[enemy_races] + mmr_σ_matchup[enemy_races] * mmr_σ_norm)
    
    diffs = pm.Deterministic('MMR_diff', mmr - enemy_mmr)
    p = pm.Deterministic('winrate', MMR_winrate(diffs))
    wl = pm.Bernoulli('win', p=p, observed=wins_losses)

    trace = pm.sample()
    output = az.from_pymc3(trace=trace,
                           coords=coords,
                           # CONCERNING PART STARTS HERE
                           dims={
                               "win": ["replay"],
                               "μ": ["race"],
                               "σ": ["race"],
                               "MMR": ["replay"],
                               "MMR_diff": ["replay"],
                               "helper": ["replay"],
                               "winrate": ["replay"],
                               "win_loss": ["replay"],
                               "enemy_race": ["replay"],
                               "enemy_mmr": ["replay"],
                           }
                           )

If I don't add the dims = {CONCERNINGLY LONG AND REPETITIVE DICTIONARY} part, each variable gets its own separate auto-generated dimension, say, μ_dim_0, instead of sharing replay or race.
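
On versions where the dims are not picked up from the model, the repetition in that dictionary could be reduced by grouping variable names per dimension and inverting the mapping. This is only a minimal sketch: `dims_by_variable` is a hypothetical helper, not part of ArviZ or PyMC3, and it assumes every variable listed here has exactly the dimensions it is grouped under.

```python
def dims_by_variable(names_by_dim):
    """Invert {dim: [variable, ...]} into {variable: [dim, ...]}.

    Hypothetical helper (not an ArviZ or PyMC3 function). The result has
    the shape expected by the `dims` argument shown above.
    """
    dims = {}
    for dim, names in names_by_dim.items():
        for name in names:
            # A variable listed under several dims collects them in order.
            dims.setdefault(name, []).append(dim)
    return dims


# Variable names taken from the model in this issue.
dims = dims_by_variable({
    "replay": ["win", "MMR", "MMR_diff", "helper", "winrate",
               "win_loss", "enemy_race", "enemy_mmr"],
    "race": ["μ", "σ"],
})
```

The resulting `dims` could then be passed as `az.from_pymc3(trace=trace, coords=coords, dims=dims)` in place of the hand-written dictionary.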

Since az.from_pymc3 already pulls the model from the context manager, might it not also be able to pull dims from there? I see data_model has a coords dict, but not a dims one - if one were to be added in PyMC3, would it make sense to grab it and use it here?

I'd be interested in contributing that (haven't done a two-repo PR before! 😆), but would appreciate some initial pointers on whether all of that is reasonable :)

I'm running ArviZ 0.8.3, with PyMC3 3.9.1.

I'd expect this would semi-solve the corresponding pm.sample(return_inferencedata=True) as well.

@StanczakDominik StanczakDominik changed the title Use PyMC3 3.9 dims, Model(coords) arguments from the Model context manager? Use PyMC3 3.9 dims, Model(coords) arguments from the Model context manager instead of manually adding dims for az.from_pymc3? Jun 19, 2020
@StanczakDominik
Contributor Author

This might also be somewhat related to pymc-devs/pymc#3953, though I'm not sure.

@OriolAbril
Member

Hi, thanks for the report!

This issue should have been fixed in #1240; it's definitely a bug. Can you confirm that everything works on the latest ArviZ development version? I have been testing it on a couple of models, but I have probably still missed some edge cases :)

@StanczakDominik
Contributor Author

Always nice to see a negative bugfix ETA 😆 That completely fixed it. Thanks a ton!

In that case, I think it might be a good idea to get 0.8.4 out for compatibility with the new pm.sample(return_inferencedata=True) functionality, which works too! :)
