
fixed mixup between conf and modelMCMCConf #1298

Merged
merged 2 commits into from
Apr 20, 2023
Conversation

danielturek
Member

Fixes a mixup between the conf and the modelMCMCConf variables, which was introduced in PR #1068.

@paciorek
Contributor

How about adding a numerical test for the default MSE loss? I think the lack of that is why this slipped through.

@danielturek
Member Author

Updated the existing test using the dyes model (which was added in the faulty bug fix #1068) to match the patched results.

Also added a test (again using the dyes model) using the default "MSE" loss function.

@danielturek
Member Author

@paciorek Second set of eyes welcome on this.

@paciorek
Contributor

It seems weird the loss for dyes would be larger (97.9) without the bug than with it (63.9). My (perhaps incomplete) understanding of the effect of the bug was that it would compute the loss sampling only the held-out data nodes and not the model parameters hence conditioning on the inits for parameters, and I would think that should lead to a higher loss than when actually doing MCMC on the parameters.

Any thoughts?

@danielturek
Member Author

@paciorek This took me a few minutes of investigation, but it does make sense. Your high-level instinct is right: by fixing the model parameters at their initial values (the buggy behaviour), we would perhaps expect a higher CV value. However, the initial values use 1 for the precision and standard deviation parameters, which, when held fixed, constrains the sampled values of the predictive nodes (originally data nodes, set to non-data via a CV fold) to a very tight region. So although that region is centered around a less-than-optimal (initial) value, it's centered there very tightly, leading to an overall lower CV value. In contrast, the correct (fixed) case centers the sampled values of the predictive nodes (the CV fold nodes) around a better value, but it uses the posterior-sampled value of the standard deviation term, so the spread of the predictive-node samples is much larger. That larger spread is what leads to the larger CV value in the fixed case.
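The bias-versus-spread trade-off described above can be illustrated with a small Python sketch (hypothetical numbers, not the dyes model itself): a tight predictive distribution around a biased value can yield a lower MSE than a wide distribution centered on the truth, since expected squared error is roughly bias² + variance.

```python
import numpy as np

rng = np.random.default_rng(42)
y_held_out = 10.0  # a held-out data value (hypothetical)

# "Buggy" case: parameters fixed at inits (sd = 1), so predictive
# samples are tightly concentrated, but around a biased center.
buggy_preds = rng.normal(loc=8.0, scale=1.0, size=100_000)

# "Fixed" case: posterior sd is larger (say 6), so predictive samples
# are centered on the held-out value but widely spread.
fixed_preds = rng.normal(loc=10.0, scale=6.0, size=100_000)

mse_buggy = np.mean((buggy_preds - y_held_out) ** 2)  # ~ bias^2 + var = 4 + 1 = 5
mse_fixed = np.mean((fixed_preds - y_held_out) ** 2)  # ~ 0 + 36 = 36

# Tight-but-biased beats wide-but-centered under MSE here,
# mirroring the 63.9 (buggy) vs 97.9 (fixed) CV values in this PR.
print(mse_buggy < mse_fixed)
```

This is only a caricature of the effect, of course; the actual CV loss in the PR comes from MCMC samples of the dyes model's predictive nodes, not independent normals.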

@paciorek
Contributor

Ok, thanks for investigating. I'm still surprised that the fixed values are as good as they are (perhaps they are reasonable by accident), but as long as you're satisfied then ok to proceed by me.

@danielturek danielturek merged commit b1dd02e into devel Apr 20, 2023
@danielturek danielturek deleted the CV_conf_mixup branch April 20, 2023 12:06