
setting uncertainties #7

Open
rcrehuet opened this issue Oct 15, 2014 · 2 comments

Comments

@rcrehuet
Contributor

I am a bit confused about how the uncertainties work.
I would expect the values obtained from p = belt_model.accumulate_populations() to have uncertainties close to the ones I have set up. However, the RMS difference with the measurements is much smaller.
I guess the reason is that the error of each individual realization is close to the uncertainties. If I do:

p_vals = np.asarray(list(belt_model.iterate_populations()))

then each of the p_vals has an RMS difference close to the uncertainties. But since p is an average of these values, the standard error of that mean is much smaller.
Although I understand this behaviour, I still think the uncertainties should affect p, as otherwise we are overfitting. In other words, gentler reweightings could result in p's that are still within the experimental error. With that in mind, I tried playing with regularization_strength, without success.
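A minimal sketch of the comparison I mean (here `f_frames` is a hypothetical array of per-frame predicted observables, `measurements` the experimental values, and `belt_model` an already-sampled model):

```python
import numpy as np

# One population weight vector per MCMC sample.
p_vals = np.asarray(list(belt_model.iterate_populations()))

# Predicted observables for each sample: (n_samples, n_observables).
pred_per_sample = p_vals.dot(f_frames)
rms_per_sample = np.sqrt(((pred_per_sample - measurements) ** 2).mean(axis=1))

# Predictions from the averaged populations (what accumulate_populations() returns).
p_mean = p_vals.mean(axis=0)
rms_of_mean = np.sqrt(((p_mean.dot(f_frames) - measurements) ** 2).mean())

print(rms_per_sample.mean(), rms_of_mean)  # the second number comes out much smaller
```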
To clarify this, here is an example.

[figure: blue shows several realizations of p_vals, red the measurements, and green the average of p_vals]

@kyleabeauchamp
Owner

So accumulate_populations will give you the conformational populations averaged over the MCMC run, with no uncertainties attached.

I don't think it's overfitting so much as just the fact that you are averaging over the noise in accumulate_populations. If you want uncertainties, you should definitely look at the full range of populations sampled.

Finally, in many cases it's not the populations themselves that are of interest, but some observable that is a weighted average over the populations. In that case, the most relevant thing to do is:

  1. Calculate observables for each frame
  2. Iterate populations and calculate observable for each population MCMC sample
  3. Look at the resulting posterior distribution of your observable of interest
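A sketch of those three steps, where `compute_observable` and `trajectory` are placeholders for however you evaluate the observable on each frame:

```python
import numpy as np

# Step 1: observable for each frame (compute_observable/trajectory are placeholders).
f_frames = np.array([compute_observable(frame) for frame in trajectory])

# Step 2: observable for each population MCMC sample.
obs_samples = np.array([p.dot(f_frames) for p in belt_model.iterate_populations()])

# Step 3: the posterior distribution of the observable of interest.
print(obs_samples.mean(), obs_samples.std())
print(np.percentile(obs_samples, [2.5, 97.5]))  # e.g. a 95% credible interval
```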

@rcrehuet
Contributor Author

Thanks for your answer. In trying to shorten my question I oversimplified. The attached figure is in fact exactly what you suggest: those values were the observables for different MCMC samples, together with the experimental values.
There is indeed a distribution of values for the weights, and therefore for the observables, but the average value is still too good. By that I mean that I would prefer a smaller reweighting at the cost of a larger discrepancy (of the order of the experimental error) with the experimental values.
I guess this should be achievable with regularization_strength, but I don't see a way to predict how strong it should be in relation to the experimental error.
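One rough way to put a number on "too good" (a sketch only, with `uncertainties` being a hypothetical array of per-measurement experimental errors and the other arrays as in the sketches above):

```python
import numpy as np

# Predictions from the posterior-mean populations, compared against the data
# in units of the experimental uncertainties.
pred_mean = p_vals.mean(axis=0).dot(f_frames)
chi2 = (((pred_mean - measurements) / uncertainties) ** 2).mean()
print("reduced chi-square:", chi2)  # values well below 1 suggest the reweighting is tighter than the error bars justify
```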
