dbrda() doesn't return species scores and the documentation is unclear if it should #254
Comments
In I am not quite sure if species scores should be implemented, because I don't know how to do this correctly (this also applies to So this can be done, but should we? |
I think I need to explain the reasons behind my hesitations of species scores in
There are numerous problems with species scores. The way we add them needs metric distances, but not necessarily standard Euclidean. If we have a distance that can be expressed as Euclidean distances of transformed data, then transformed data can be used to find species scores. That is, if our dissimilarity can be expressed The commonly used (semimetric) community dissimilarities cannot be expressed exactly as Euclidean distances. They differ in two major ways: they are often bound to range 0...1, and they use differences instead of squared differences of Euclidean metrics. However, if we can find a transformation that makes them approximately similar to Euclidean distances, we should use data with that transformation to get species scores. For closed range, we can transform SUs to unit total, and we can use As an example, let us have a look at This may be a more acceptable transformation for getting species scores, but We have three obvious alternatives:
|
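A minimal sketch of the idea above that a dissimilarity expressible as Euclidean distances of transformed data lets the transformed data supply species scores. It is not taken from the thread: the Hellinger transformation is used only as a convenient stand-in for such a dissimilarity, the toy matrix `comm` is hypothetical, and plain base R (`cmdscale()`, `prcomp()`) replaces the vegan functions.

```r
set.seed(42)
# Hypothetical community matrix: 8 sites x 5 species, all counts positive
comm <- matrix(rpois(40, lambda = 3) + 1, nrow = 8,
               dimnames = list(NULL, paste0("sp", 1:5)))

# Hellinger transformation: square root of row proportions (assumed stand-in)
hel <- sqrt(comm / rowSums(comm))

# Distance-based route: principal coordinates of the Euclidean distances
# between transformed rows; this gives site scores only
pcoa <- cmdscale(dist(hel), k = 2)

# Data-based route: PCA of the same transformed data
pca <- prcomp(hel)

# Site scores from both routes agree up to the sign of each axis
all.equal(abs(unname(pcoa)), abs(unname(pca$x[, 1:2])))

# The PCA route additionally yields species scores (loadings)
species <- pca$rotation[, 1:2]
species
```

For semimetric dissimilarities no exact transformation of this kind exists, which is where the approximation discussed above comes in.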
Dear Jari and vegan developers, I would prefer option three, with some reference to the issue that you so clearly described here in the documentation of the function; And some reference to species scores in the documentation of dbrda. I think it negates the "endorsement" issue you mention, and forces people to make an effort to understand and consider their data. Another option I was wondering about, which is probably out of grave misunderstanding of multiple facets of your incredible work, is: Are there other ways of calculating "species scores" that may circumvent these problems? e.g. Pearson or Spearman rank correlation of "species" to "axis", as used in Primer PERMANOVA. https://stackoverflow.com/questions/46563475/what-is-the-difference-between-r-vegan-species-scores-and-primer-spearman-rank |
I don't know PRIMER and I can't comment on its choices. However, having correlations of "species" to "axis" is something that I exactly don't want to have near vegan. The species scores we have are not for the axes, but they show the direction in the space to which the species increases most rapidly: it is not along axes, but somewhere in the space. These scores are rotation invariant: if you change the orientation of the axis, all points and all scores move together and are unchanged with respect to each other, although their relationship to axes would change. So how would rank correlation change if you rotate an axis? For comparable species scores, you should find a direction to which the species has the best rank correlation, and I don't know how to do that. We can do it in Euclidean space using Euclidean geometry like we do now, but I have no idea about rank-order metric. |
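The rotation argument can be illustrated with a small base R sketch. This is an assumption-laden analogue, not vegan's own computation: the toy data, the 30 degree rotation, and the use of `lm()` coefficients as the "direction of fastest increase" are all chosen for illustration.

```r
set.seed(1)
# Hypothetical data: 30 sites, 4 species
comm <- matrix(rpois(120, lambda = 5), nrow = 30,
               dimnames = list(NULL, paste0("sp", 1:4)))
site <- prcomp(comm)$x[, 1:2]

# Rotate the two ordination axes by 30 degrees
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
site_rot <- site %*% R

# Rank correlations of a species with the axes depend on the orientation
cor_before <- cor(comm[, "sp1"], site, method = "spearman")
cor_after  <- cor(comm[, "sp1"], site_rot, method = "spearman")

# Direction of fastest linear increase in the plane (regression slopes)
arrow     <- coef(lm(comm[, "sp1"] ~ site))[-1]
arrow_rot <- coef(lm(comm[, "sp1"] ~ site_rot))[-1]

# The arrow simply rotates together with the site points
all.equal(unname(arrow %*% R), unname(t(arrow_rot)))
```

The fitted direction moves with the configuration, so its position relative to the sites is unchanged, whereas the per-axis rank correlations have no such property.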
@jarioksa what about If well documented, the function you showed on SO would be fine to include in vegan. |
Yes, indeed: Several people in SO have used |
(Is that a typo in the last sentence? Should it be "Technically this is incorrect"?) What is the concern with using |
No, it is not a typo. This is technically correct:

```r
library(vegan)
data(varespec)
mod <- rda(varespec)
biplot(mod)
plot(envfit(mod ~ Pleuschr + Cladarbu + Cladrang + Cladstel, varespec)) # correct
ordisurf(mod ~ Cladstel, varespec, knots=1, add=TRUE) # linear change
```

The point with NMDS really is that the arrows imply a linear species response in the ordination space. We have those in PCA/RDA, but not in NMDS. |
RFC: implementation of adding species scores to ordination objects
|
Together with previous commits in this branch, this implements the functions outlined in github issue #254 for adding species scores to ordination results. Still undocumented and non-exported.
I now have a function to add (or update) species scores in The existence of species scores was the main difference between |
Would it make sense to have:
? Also, for consistency we should probably call this I'm not a fan of |
I'm not sure about this line Line 45 in a6710e2
Ditto Line 55 in a6710e2
Also, I'm now wondering if we really need this function at all. Can't we just have each of these functions just add species scores regardless? It seems like we're being too protective of users. Would it not be better to add a Perhaps if we did keep the proposed function it would allow a user to add species scores later if they are doing something very non-standard or bespoke. But we should bake this functionality into the functions that do the ordination so a user needs only to activate a switch when fitting the ordination rather than the two-step process. Thoughts? |
Why have this (or similar) function? It seems that people want to have it. This only concerns distance-based methods ( Then how to do this? I think having The suggested change was based on the idea of amending a solution with species scores. This would only concern methods which may have species scores but happen not to have them, and this amendment would permanently change the result object (and the current version saves no permanent information on the way this amendment was done: which data, how transformed). Another alternative is to supply functions that make it easier to add scores in plots without saving them in the result object. For

```diff
--- a/R/wascores.R
+++ b/R/wascores.R
@@ -1,8 +1,10 @@
 `wascores` <-
-    function (x, w, expand = FALSE)
+    function (x, w, expand = FALSE, display = "sites", ...)
 {
     if(any(w < 0) || sum(w) == 0)
         stop("weights must be non-negative and not all zero")
+    if (is.list(x))
+        x <- scores(x, display = display, ...)
     x <- as.matrix(x)
     w <- as.matrix(w)
     nc <- ncol(x)
```

For projections we would need a new function. People use now |
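As a rough illustration of what weighted-averages species scores amount to, here is a base R sketch. The toy matrix `comm`, the Manhattan distance, and `cmdscale()` are stand-ins chosen so the block runs without vegan; with vegan installed, `wascores(site, comm)` should give the same kind of result.

```r
set.seed(7)
# Hypothetical community matrix: 10 sites x 6 species
comm <- matrix(rpois(60, lambda = 4), nrow = 10,
               dimnames = list(paste0("site", 1:10), paste0("sp", 1:6)))

# Site scores from a distance-based ordination (no species scores here)
site <- cmdscale(dist(comm, method = "manhattan"), k = 2)

# Weighted averages: each species is placed at the abundance-weighted
# centroid of the sites where it occurs
wa <- t(comm) %*% site / colSums(comm)
wa

# Check one entry against the definition
weighted.mean(site[, 1], comm[, "sp1"])
```

Scores computed this way live only in the plotting step; nothing is written back into the ordination result, which is the lighter alternative discussed above.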
If we have something to solve the scores problem, it should be light. Here are some ideas and the reasons why I don't like them:
|
@gavinsimpson I started to think that a renamed replacement function makes sense for adding species scores. However, this seems to stretch the R concept of replacement a bit. For instance, the right-hand-side must be called The function works, though, and it is clearer that we add something to the ordination result than in the previous version. The replacement functions can only have two arguments: the left one being the object (usually called I'll deal with other aspects later (warnings, file names, documentation etc.). I'll familiarize myself with this approach first. |
@gavinsimpson @philiphaupt With respect to the discussion here, PR #256 does the following:
|
Closed with PR #256 |
Unlike capscale(), dbrda() doesn't return species scores.

It's not immediately clear if dbrda() should return these; the documentation doesn't distinguish between capscale() and dbrda() on this point.

Given the similarity between these two methods, I suspect we can just add species scores using the same code from capscale().

Either way we should either
- update ?dbrda to be clearer if there are differences between this function and capscale(), or
- add species scores to dbrda() models.

This issue was originally raised on StackOverflow