Missed this in my review of #1901. The cache key check is incorrect. The relevant code is at tensorboard/tensorboard/plugins/projector/vz_projector/data.ts, lines 444 to 447 in 9270699, where the `nNeighbors <= this.nearest.length` condition is intended to check whether the cached KNN computation was computed with a value of `nNeighbors` not smaller than the currently requested value. But in fact `this.nearest.length` is the number of data points, not the number of neighbors computed for each point.

To verify, add

    console.log(`Found ${nNeighbors}-nearest: shape is `+`(${this.nearest.length}, ${this.nearest[0].length})`);

below line 384 in `projectUmap`. Then, run UMAP, and subsequently re-run UMAP with a higher value of k. The output is (e.g.)

    Found 15-nearest: shape is (1024, 15)
    Found 23-nearest: shape is (1024, 15)

which is wrong, because after attempting to find the 23-nearest neighbors we only have 15 elements for each data point.

I'm not sure why this never hits a hard error anywhere in the pipeline (implicit conversion of `undefined` to `0`/`NaN` somewhere?), but it definitely causes observable effects. To observe, patch in #2080* to prevent t-SNE from running automatically when its tab is selected, then:

- in one tab, load the projector page and run t-SNE with perplexity=100;
- in another tab, load the projector page and run t-SNE with perplexity=8, then re-run it with perplexity=100.

These two tabs should yield approximately the same projection, but instead the projection in the first tab converges to a normal result whereas the projection in the second tab diverges to points far apart with little visible structure.
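The mismatch in the cache check can be reproduced in isolation. The following is only a minimal sketch, not the projector's actual code: the `Neighbor` type, both `cacheLooksValid*` functions, and `makeFakeCache` are hypothetical names introduced for illustration, and the "fixed" variant is just one possible correction (comparing against the per-point neighbor count), not necessarily the one that should land.

```typescript
// Hypothetical stand-in for one cached neighbor entry; not the projector's real type.
interface Neighbor {
  index: number;
  dist: number;
}

// The check as described in this issue: it compares the requested k
// against the number of data points (the outer array length).
function cacheLooksValidBuggy(
  nearest: Neighbor[][] | null,
  nNeighbors: number
): boolean {
  return nearest != null && nNeighbors <= nearest.length;
}

// One possible correction (an assumption, not the upstream fix): compare
// against the number of neighbors actually stored for each point.
function cacheLooksValidFixed(
  nearest: Neighbor[][] | null,
  nNeighbors: number
): boolean {
  return (
    nearest != null && nearest.length > 0 && nNeighbors <= nearest[0].length
  );
}

// Build a fake (nPoints x k) cache with the same shape as the logged output.
function makeFakeCache(nPoints: number, k: number): Neighbor[][] {
  return Array.from({length: nPoints}, (_, i) =>
    Array.from({length: k}, (_, j) => ({
      index: (i + j + 1) % nPoints,
      dist: j + 1,
    }))
  );
}

const cache = makeFakeCache(1024, 15);

// Requesting k=23 against a cache computed with k=15:
const buggyAccepts23 = cacheLooksValidBuggy(cache, 23); // 23 <= 1024, cache reused
const fixedAccepts23 = cacheLooksValidFixed(cache, 23); // 23 > 15, cache rejected
// Requesting a smaller k should still be allowed to reuse the cache:
const fixedAccepts10 = cacheLooksValidFixed(cache, 10); // 10 <= 15, cache reused
```

With 1024 points, the first check accepts any requested k up to 1024, which matches the `(1024, 15)` shape being reported after asking for 23 neighbors.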
* Tested at commit b0310cd.
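On the question of why nothing throws: the snippet below is a sketch of the general JavaScript behavior only, and makes no claim about where (or whether) the projector pipeline does this. It assumes a 15-element neighbor row that is read at an index a k=23 consumer might use; all variable names are hypothetical.

```typescript
// A row with 15 entries, standing in for one point's cached neighbor distances.
const row: number[] = Array.from({length: 15}, (_, j) => j * 0.1);

// Reading past the end does not throw; it yields undefined at runtime,
// even though TypeScript (without noUncheckedIndexedAccess) types it as number.
const missing = row[22];

// Arithmetic with undefined silently produces NaN.
const sum = row[0] + missing;

// Storing undefined into typed arrays coerces it instead of failing:
// float arrays store NaN, integer arrays store 0.
const asFloat = new Float32Array(1);
asFloat[0] = missing;
const asInt = new Int32Array(1);
asInt[0] = missing;
```

Either coercion would let a too-short neighbor list flow through numeric code without an exception, which is consistent with the divergent projection described above rather than a crash.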