[Template Fitting] Is it ok to have large amount of empty bins in the template? #825
-
Aloha iminuit community and experts, I am a student working on R(D) and R(D*) measurements at Belle II. The current plan of signal extraction is through a 2d template fitting ( These 2d histograms are flattened and then used as templates. However, as seen in the picture, the 2 fitting quantities ( So I tested this idea by fitting templates to an independently produced MC sample (test sample). My code snippet and fitting results with and without empty bins are attached below. The empty bins are defined in the test sample.
Above is the result with empty bins and below is without. The reduced (ps. I don't know why the |
Replies: 3 comments 19 replies
-
tl;dr: You should keep the empty bins, and it makes sense that the fit with empty bins included gives you smaller uncertainties.
Empty bins in the data are completely fine; there is no need to cut them. What is problematic are empty bins in the templates, or more precisely, bins in which all templates have zero entries but the data bin is non-zero. Such bins have to be discarded from the likelihood function, because no information can be drawn from them: all predictions are empty, so there is no way to estimate the amplitudes.
If at least one template is filled for a given bin and the data bin is empty, the Poisson-based template fit can draw information from that bin. For the Poisson distribution, we can compute the probability of observing nothing for a given expected value, so an empty data bin still constrains the yields. This is why your fit with empty bins included gives you more precise estimates for the yields.
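To make the argument concrete, here is a minimal sketch in plain Python. It uses a simplified Poisson likelihood in which the templates are treated as exact, which is an assumption for illustration and not the likelihood that iminuit's template cost function implements. The names `bin_nll` and `template_nll` and all numbers are hypothetical. For an empty data bin with expected value mu, the Poisson probability is exp(-mu), so the bin adds mu to the negative log-likelihood and disfavors large yields in that region.

```python
import math


def bin_nll(n, mu):
    """-log Poisson(n | mu) for a single bin; requires mu > 0."""
    log_term = n * math.log(mu) if n > 0 else 0.0
    return mu - log_term + math.lgamma(n + 1)


def template_nll(data, templates, yields):
    """Sum bin_nll over all usable bins; return (nll, number of bins used)."""
    # Normalise each template to unit area so that the yields are event counts.
    shapes = [[c / sum(tpl) for c in tpl] for tpl in templates]
    nll, used = 0.0, 0
    for i, n in enumerate(data):
        if all(tpl[i] == 0 for tpl in templates):
            # Every template is empty here, so the bin says nothing
            # about the yields and is skipped.
            continue
        mu = sum(y * s[i] for y, s in zip(yields, shapes))
        nll += bin_nll(n, mu)
        used += 1
    return nll, used


# Toy inputs (made up): the first and last bins are empty in both templates.
signal = [0, 5, 10, 5, 0, 0]
background = [0, 0, 4, 8, 8, 0]
data = [0, 3, 9, 7, 0, 1]

nll, used = template_nll(data, [signal, background], yields=[12.0, 8.0])

# An empty data bin with expectation mu contributes exactly mu.
empty_bin_cost = bin_nll(0, 2.5)
```

In this toy setup the fifth data bin is empty while the background template is filled there, so that bin stays in the sum and penalizes a large background yield. The first and last bins are dropped because no template predicts anything in them.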
-
Yes, you need to set limits. If you don't, the fit may choose a combination of amplitudes for which one or more bins get a negative expected count, which is not a valid expectation for a Poisson distribution. It is your responsibility to set the parameter limits so that this never happens. This tutorial explains the problem in more detail. Unless you are in the rare case where you need to fit an interference pattern, the amplitudes of the components are always positive. In a future version of iminuit, limits for the amplitudes may be set automatically by default, so that you have to explicitly unset them if you really want negative amplitudes.
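The following plain-Python sketch illustrates why unbounded amplitudes are a problem. The helper names `expected_counts` and `negative_bins` and the numbers are hypothetical, chosen only for illustration. With one negative amplitude, any bin where that template dominates ends up with a negative expected count.

```python
def expected_counts(templates, amplitudes):
    """Expected count per bin: sum over components of amplitude * template."""
    n_bins = len(templates[0])
    return [
        sum(a * tpl[i] for a, tpl in zip(amplitudes, templates))
        for i in range(n_bins)
    ]


def negative_bins(templates, amplitudes):
    """Indices of bins whose expected count is negative."""
    counts = expected_counts(templates, amplitudes)
    return [i for i, mu in enumerate(counts) if mu < 0]


# Toy template shapes (made up); the signal is empty in the outer bins.
signal = [0.0, 0.25, 0.5, 0.25, 0.0]
background = [0.1, 0.2, 0.3, 0.2, 0.2]

# Both amplitudes positive: every expected count is non-negative.
ok = negative_bins([signal, background], [20.0, 10.0])

# Negative background amplitude: the outer bins, where only the
# background template is filled, get a negative expectation.
bad = negative_bins([signal, background], [30.0, -5.0])
```

In iminuit, a lower limit of zero can be set per parameter through the `limits` property of the `Minuit` object, for example `m.limits["name"] = (0, None)`, where `m` and `"name"` stand for your own fit object and parameter name.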