
setting uncertainties #7

Open
rcrehuet opened this issue Oct 15, 2014 · 2 comments

Comments

@rcrehuet
Contributor

I am a bit confused about how the uncertainties work.
I would expect the values obtained from p = belt_model.accumulate_populations() to have uncertainties close to the ones I have set up. However, the RMS difference with the measurements is much smaller.
I guess the reason is that the error of each individual realization is close to the uncertainties. If I do:

p_vals = np.asarray(list(belt_model.iterate_populations()))

then each of the p_vals has an RMS difference close to the uncertainties. But since p is an average of these values, the standard error of that mean is much smaller.
Although I understand this behaviour, I still think the uncertainties should affect p, as otherwise we are overfitting. In other words, gentler reweightings could result in p's that are still within the experimental error. With that in mind, I tried playing with regularization_strength, without success.
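A minimal sketch of the comparison I mean (here `f_frames` is a hypothetical array of per-frame predicted observables, `measurements` the experimental values, and `belt_model` an already-sampled model):

```python
import numpy as np

# One population weight vector per MCMC sample.
p_vals = np.asarray(list(belt_model.iterate_populations()))

# Predicted observables for each sample: (n_samples, n_observables).
pred_per_sample = p_vals.dot(f_frames)
rms_per_sample = np.sqrt(((pred_per_sample - measurements) ** 2).mean(axis=1))

# Predictions from the averaged populations (what accumulate_populations() returns).
p_mean = p_vals.mean(axis=0)
rms_of_mean = np.sqrt(((p_mean.dot(f_frames) - measurements) ** 2).mean())

print(rms_per_sample.mean(), rms_of_mean)  # the second number comes out much smaller
```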
To clarify this, here is an example.

[figure: blue shows several realizations of p_vals, red the measurements, and green the average of p_vals]

@kyleabeauchamp
Owner

So accumulate_populations will give you the conformational populations averaged over the MCMC run, with no uncertainties attached.

I don't think it's overfitting so much as just the fact that you are averaging over the noise in accumulate_populations. If you want uncertainties, you should definitely look at the full range of populations sampled.

Finally, in many cases it's not the populations themselves that are of interest, but some observable that is a weighted average over the populations. In that case, the most relevant thing to do is:

  1. Calculate observables for each frame
  2. Iterate populations and calculate observable for each population MCMC sample
  3. Look at the resulting posterior distribution of your observable of interest
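A sketch of those three steps, where `compute_observable` and `trajectory` are placeholders for however you evaluate the observable on each frame:

```python
import numpy as np

# Step 1: observable for each frame (compute_observable/trajectory are placeholders).
f_frames = np.array([compute_observable(frame) for frame in trajectory])

# Step 2: observable for each population MCMC sample.
obs_samples = np.array([p.dot(f_frames) for p in belt_model.iterate_populations()])

# Step 3: the posterior distribution of the observable of interest.
print(obs_samples.mean(), obs_samples.std())
print(np.percentile(obs_samples, [2.5, 97.5]))  # e.g. a 95% credible interval
```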

@rcrehuet
Contributor Author

Thanks for your answer. In trying to shorten my question I oversimplified. The attached figure is in fact exactly what you suggest: those values were the observables for different MCMC samples, together with the experimental values.
There is indeed a distribution of values for the weights, and therefore for the observables, but the average value is still too good. By that I mean that I would prefer a smaller reweighting at the cost of a larger discrepancy (of the order of the experimental error) with the experimental values.
I guess this should be achievable with regularization_strength, but I don't see a way to predict how strong it should be in relation to the experimental error.
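One rough way to put a number on "too good" (a sketch only, with `uncertainties` being a hypothetical array of per-measurement experimental errors and the other arrays as in the sketches above):

```python
import numpy as np

# Predictions from the posterior-mean populations, compared against the data
# in units of the experimental uncertainties.
pred_mean = p_vals.mean(axis=0).dot(f_frames)
chi2 = (((pred_mean - measurements) / uncertainties) ** 2).mean()
print("reduced chi-square:", chi2)  # values well below 1 suggest the reweighting is tighter than the error bars justify
```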
