
Naming of statistical outputs #125

Closed
tyarkoni opened this issue Mar 21, 2019 · 24 comments · Fixed by #131
Comments

@tyarkoni (Collaborator)

Currently we're using the nistats names for the output images produced by model fitting: ['z_score', 'stat', 'p_value', 'effect_size', 'effect_variance']. Since fitlins is theoretically capable of supporting multiple backends, there's no particular reason we need to adhere to these; we should probably standardize on a set of names that we can reuse everywhere. My take on the names:

  • z_score and p_value are clear and self-explanatory.
  • stat: From a quick glance, it looks like this is either t or F, depending on the analysis. I think this is problematic, because you can't tell just from the filename what it represents. I suggest renaming maps to t_stat or F_stat as appropriate.
  • effect_size: this is technically accurate, but I dislike it because people tend to use the term "effect size" when talking about standardized effects, so there's a risk of confusion. I suggest beta, par_est, parameter_estimate, or just effect.
  • effect_variance: I'm ambivalent about this one; I think just variance would probably be fine. It's possible that there will be other things we want to call variance, but I think the surrounding context should always make the semantics relatively clear (e.g., an estimate of between-subject variance at a higher level wouldn't have any subject or run id attached).

Tagging @nicholst, @bthirion, @cmaumet, @jbpoline for potential input.

@tyarkoni (Collaborator, Author)

I guess a related question is what happens to the underscores when files are saved. I assume we want these to be BIDS-compliant, so are we just turning p_value into, e.g., pvalue? If so, maybe we should just drop the underscores everywhere.
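If the underscores do get dropped for BIDS compliance, the renaming could be as simple as the following sketch (the mapping table and helper name are hypothetical, not fitlins code; names are taken from the list above):

```python
# Hypothetical mapping from nistats output names to underscore-free,
# BIDS-friendly suffixes (sketch only; the final names are under discussion).
NISTATS_TO_BIDS = {
    "z_score": "zscore",
    "p_value": "pvalue",
    "stat": "stat",            # would become tstat/Fstat per the proposal above
    "effect_size": "effect",
    "effect_variance": "variance",
}

def bids_suffix(nistats_name: str) -> str:
    """Return the underscore-free suffix for a nistats output name.

    Falls back to stripping underscores for names not in the table.
    """
    return NISTATS_TO_BIDS.get(nistats_name, nistats_name.replace("_", ""))

print(bids_suffix("p_value"))  # pvalue
```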

@effigies (Collaborator)

I'd suggest:

  • effect
  • variance
  • zscore
  • tstat/fstat (I would prefer to keep capitals out)
  • pval or pvalue

I'm disinclined toward beta simply because we don't really distinguish between parameter estimates and contrasts of parameter estimates anymore. Betas are just identity contrasts.

@tyarkoni (Collaborator, Author)

Agree with all of the above. We could probably further simplify and go with z, t, and F (I would also prefer not to use caps, but the [strong] convention is to use F for F-tests, and I think this has the potential to create confusion if we use f).

@effigies (Collaborator)

I could go that way, but it might be good to run this one by the larger BIDS community before staking claim to 3 single-character suffixes.

@bthirion

Sorry, but I'd rather avoid 'z', 't', 'F': in my experience, explicit naming should be preferred.
I also find beta a bit of statistician jargon.

The reason to keep 'effect_variance' rather than 'variance' is that people may confuse the residual's variance with that of the contrast; this being said, I'm fine with changing, especially if this becomes a community standard.

@jbpoline commented Mar 21, 2019

We could take those from STATO:

for "white spaces" or "underscore" I am not too sure of what are the possible options, if we cannot use '-' or '_' then Camel case ?

@nicholst

I'm with @bthirion for avoiding beta for the sake of clarity.

Also, I'm not familiar with fitlins... do you need to separately store regression coefficients and contrast estimates?

"effect size" is fraught because (in my experience) psychologists read "effect size" as "standardised effect size". Whatever we do we should be prepared to define:

  • R2 - coefficient of determination
  • d - Cohen's d - standardised effect size

"variance" to me is also ambiguous... Is it residual error variance or variance of a parameter estimate?

@jbpoline commented Mar 21, 2019

The advantage of taking the STATO terminology is that you get the definition for free:

I could not find something for "standardised effect size" - I suppose it could be added. There are things like

but that would need to be generalized ...

@tyarkoni (Collaborator, Author)

Okay so it sounds like we have convergence on pval, zscore, tstat, and Fstat. We have near-consensus on effect (JB, for the reason both @nicholst and I identified above, I don't think "effect size" is ideal). Regarding variance, I don't think I see much potential for confusion between model residual error and variance of a PE. The latter will presumably have the effect name it corresponds to, whereas the former will either have nothing, or something like "model" or "residual".

In general, I don't think we need to be very concerned about cases where people in a rush accidentally pick out the wrong map based on the filename and then catch themselves later (lord knows this already happens enough with other packages); the concern is more about someone who carefully reads the filename being able to understand roughly or exactly what the map represents.

@nicholst I believe (but @effigies can correct me) that we don't maintain any distinction between PE maps and contrast maps. The only outputs at each level of analysis are those defined as contrasts. If you don't define contrasts, you get nothing.

@tyarkoni (Collaborator, Author)

@jbpoline I'm not sure the STATO terminology works for us, if only because it seems very verbose. Do we really want things like ses-movie_task-movie_run-1_bold_contrast-faceConfidence_StandardizedMeanDifference.nii.gz? I'm not saying we don't, necessarily, it just feels quite user- (and developer) unfriendly.

@jbpoline commented Mar 22, 2019

I actually did not suggest StandardizedMeanDifference :)
but tstat or tstatistic and the others should be OK; I'm not sure a few characters make a big difference here.
I agree also that effect size is not ideal, but at least if we have a place where it is defined, that alleviates some of the issue. I like parameter estimate (again, a bit long? but I'm not sure we should care); STATO even has contrast estimate.
I think whatever we choose, giving it a link to a proper definition would be good!

@adelavega (Collaborator)

I think a 1:1 shorthand for STATO terms might work. I have a strong preference for shorter filenames.

@jbpoline

I also prefer shorter names, but we already have quite a bit of information coded in the filename (sub-XX_ses-XX_...), so I'm not sure this will make a big difference. But yes, shorter is better if we have the 1:1 mapping.

@effigies (Collaborator)

How do people feel about adding a test-<t|F|chi2|...> type of entity? A generic stat suffix would replace fstat and tstat (or F and t, depending on what our final decision was...) and this might also make the zscore and other suffixes more informative as individual filenames. As we've mapped this out, you would kind of need to reference the associated statistical filename to quickly grasp what kind of effect map you're looking at.

I want to rope in @Gilles86 and @JohnGriffiths, as there was some discussion at the recent computational models meeting about statistical outputs, to see if they have any thoughts on this or the discussion as a whole.

Obviously nothing's set in stone, but identifying paths that are going to be problematic when we start looking outside our narrow use case seems prudent.

@tyarkoni (Collaborator, Author) commented Apr 19, 2019

I think a stat suffix is a good idea, but I think that should subsume all existing statistic suffixes (e.g., _zscore, _pvalue...), not just t and F stats. By the same token, I would want to have the entity named stat rather than test, as the latter is pretty limiting. I can see an argument for wanting to separately encode the procedure that generated a statistic from the statistic itself (e.g., a p-value can come from a t-test, F-test, or any number of other things), but in practice I think that's best handled via the desc field, or another new freeform field like proc or model, as the space of common procedures is just too large to reasonably encode in a limited vocabulary.

Also, in most cases the interpretation of the statistic itself doesn't really depend on knowing the procedure. E.g., one-sample and two-sample t-tests can be thought of as different procedures (or as two minor variants of a GLM analysis), but the resulting/derived t, z, p, and stderr maps have essentially the same meaning regardless.

I guess what I'm proposing is something like sub-01_task-nback_run-1_contrast-parametricload_desc-onesampttest_stat-p_statmap.nii.gz, which would be read as "a map of p-values obtained from a one-sample t-test on the parametric load contrast for subject 1, run 1, during the n-back task."

EDIT: And just to be clear, on this proposal, the standardized names proposed above would all move from suffix to the stat entity—i.e., *_stat-zscore_statmap.nii.gz rather than *_zscore.nii.gz.

EDIT2: One other reason I prefer this approach is that I think we should work to keep the number of valid suffixes created BIDS-wide to a minimum. IMO extensions that need to introduce a whole bunch of new distinctions that all have some common logical grouping (like statistic type) should get one suffix and a new entity rather than a bunch of suffixes and no new entities.
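The entity-based layout proposed above lends itself to generic parsing; a minimal sketch follows (illustrative only, not fitlins code; real tools would use pybids rather than string splitting):

```python
def parse_bids_name(filename: str):
    """Split a BIDS-style derivative filename into its entities and suffix.

    Illustrative sketch only: assumes well-formed key-value entities
    separated by underscores, with the suffix as the final component.
    """
    stem = filename
    for ext in (".nii.gz", ".nii"):
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    parts = stem.split("_")
    suffix = parts[-1]                              # e.g. "statmap"
    entities = dict(p.split("-", 1) for p in parts[:-1])
    return entities, suffix

entities, suffix = parse_bids_name(
    "sub-01_task-nback_run-1_contrast-parametricload_desc-onesampttest_stat-p_statmap.nii.gz"
)
print(suffix)             # statmap
print(entities["stat"])   # p
```

With a single statmap suffix, a consuming tool only needs to inspect the stat entity to know what the values in the map are.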

@cmaumet commented Apr 19, 2019

Sorry for jumping in just now... But I'm a little bit confused as to what is the goal here.

Are we defining:

  1. the names of the output files for fitlins (in which case I think we have a lot of freedom) -or-
  2. the BIDS specification for statistical outputs (in which case the discussion is much broader and we should spend some time thinking how it fits w/ other standards (NIDM-Results) but also bring the discussion up to the BIDS mailing list / BEP discussion channel)

?

I think this is an important distinction as it has consequences on how much we should think about it and more practically on how/where to have this discussion.

@effigies (Collaborator)

@cmaumet Mostly 1, but it seems like a good idea to do 1 with 2 in the back of one's mind, to improve the odds that moving to the standardization stage will not require massive rearchitecting. I'm taking as a model that we made a lot of pragmatic decisions in fMRIPrep that later informed the derivatives spec meeting, but were not binding on that group. It turned out that some changes were needed, but they were pretty minor.

One of the results of the computational model meeting was the notion that we should have a statistical outputs BEP, but I do not believe that the drafting has begun. You can see the sketch of a few of the sorts of outputs we were discussing in the document I linked above.

@effigies (Collaborator)

@tyarkoni I get that p-values and z-scores are kind of test-agnostic, but still having the context of the statistic they're describing seems useful, and your expansion seems overly verbose for what's needed here. What about:

sub-01_task-nback_run-1_contrast-parametricload_stat-t_estimate.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-t_variance.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-t_stat.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-t_pvalue.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-t_zscore.nii.gz

Or what would be your proposal for the five? I've seen them for p and z.

@tyarkoni (Collaborator, Author)

@effigies I think this relates to @cmaumet's question. If we see this is as just a fitlins convention, and we don't expect there to ever be anything more than these five suffixes, that would be fine. But going forward I can certainly see a use case for wanting other maps (e.g., standard errors, standardized effect size measures like d or r, etc.). The thing I think we desperately want to avoid is having to have a (large) controlled vocabulary of statistical suffixes. Whereas having a new uncontrolled entity like stat is not a big deal.

I'm less concerned about verbosity (well, within reason) than about complexity and clarity for the spec and tools that implement the spec. Having a single statmap suffix is nice because it immediately orients the receiving tool that it needs to speak stats in order to handle the file (and can be ignored if desired). Developers, similarly, don't have to know the spec well to decide that _statmap is not something they care about (whereas _estimate and _variance are generic enough that it's not hard to imagine semantic collisions or confusions across different parts of the ecosystem).

FWIW, I find your examples actually a bit misleading, in that:

  • stat-t immediately makes me think that the value of the statistic in the map is a t score (which it isn't), and not that the statistic came from a t test (renaming to test-t would be an improvement, but as soon as you add other test types, the confusion returns; e.g., would people understand test-r or test-pearson?)
  • I think it's quite confusing to have a generic _stat suffix whose meaning varies based on the originating procedure, and doesn't actually tell you what the value is. I.e., *_test-t_stat.nii.gz contains t scores and *_test-F_stat.nii.gz contains F scores, but *_test-t_pvalue.nii.gz and *_test-F_pvalue.nii.gz both contain p-values. Separating into a _statmap suffix and a stat entity is unambiguous in that the suffix tells you the map contains statistical quantities, the stat entity tells you what those values are, and desc or some other field (or associated metadata) tells you where they came from.

@effigies (Collaborator)

Fair enough. I'm still unclear on what you want to see. This?

sub-01_task-nback_run-1_contrast-parametricload_stat-estimate_statmap.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-variance_statmap.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-t_statmap.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-p_statmap.nii.gz
sub-01_task-nback_run-1_contrast-parametricload_stat-z_statmap.nii.gz
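Under this scheme, the five maps for a contrast differ only in the stat entity, so generating them is a one-liner; a sketch (the base entities and stat values mirror the examples above; not fitlins code):

```python
# Sketch: generate the five proposed output names for one contrast.
# The entity order and values mirror the examples above.
BASE = "sub-01_task-nback_run-1_contrast-parametricload"
STATS = ["estimate", "variance", "t", "p", "z"]

filenames = [f"{BASE}_stat-{stat}_statmap.nii.gz" for stat in STATS]
for name in filenames:
    print(name)
```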

@tyarkoni (Collaborator, Author)

Yeah, that looks good to me. If we're allowing that this is just provisional fitlins nomenclature, and may or may not make it into a stats spec later, then I'm fine adding a test entity that currently only takes values t and F. But I'm also fine putting that in desc if we're not using it for anything else (and that might be better actually, because I think desc-Ftest reads a bit better than test-F).

@effigies (Collaborator)

In the case of an F-test, it would be:

sub-01_task-nback_run-1_contrast-parametricload_stat-F_statmap.nii.gz

I'm kind of ignoring desc for right now, though if it's important we can stipulate that it will be added/argued over before merge, as it will apply to all of the files or none, and I want to get something working before tweaking it.

@bthirion commented Apr 19, 2019 via email

@jbpoline commented Apr 21, 2019 via email
