
Implement viewing sharded neuroglancer precomputed datasets #6920

Merged: frcroth merged 11 commits into master from sharding on Mar 27, 2023

Conversation

frcroth (Member) commented Mar 15, 2023

URL of deployed dev instance (used for testing):

Steps to test:

  • Sharded datasets to explore and view:
    • gs://h01-release/data/20210601/4nm_raw
    • See neuroglancer for comparison

Issues:


(Please delete unneeded items, merge only when none are left open)

  • Updated changelog
  • Needs datastore update after deployment

@frcroth frcroth requested a review from fm3 March 20, 2023 16:57
@frcroth frcroth changed the title WIP: Neuroglancer Precomputed Sharding Implement viewing sharded neuroglancer precomputed datasets Mar 20, 2023
@frcroth frcroth self-assigned this Mar 20, 2023
fm3 (Member) commented Mar 21, 2023

I managed to improve the performance :) The main problem was that Array[Int] does not work as a cache key, because Java arrays are not equal just because they have the same content. Sorry for suggesting it in the first place 🙈 I changed it back to a string now.

In the process I also increased parallelization by fetching the source chunks for a given read request in parallel (Future.sequence rather than Fox.serialCombined), and I also noticed that the histogram's sampled positions weren't aligned with the 32 grid, so they used far more source chunks than intended. I added a commit addressing all these things. I'll add a review for the rest of the code in the coming days :)
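
For illustration only, a standalone Scala sketch (not the actual webKnossos code) of the two points above: reference equality of array keys versus string keys, and eager Futures versus serial combination:

import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

object ShardingNotes extends App {
  // 1) Arrays compare by reference, so two arrays with equal content
  //    are still distinct cache keys:
  val a = Array(1, 2, 3)
  val b = Array(1, 2, 3)
  println(a == b)                             // false: reference equality
  println(a.mkString(",") == b.mkString(",")) // true: string keys compare by value

  // 2) Futures start eagerly, so mapping first and collecting with
  //    Future.sequence fetches the chunks concurrently, whereas a serial
  //    combinator would start each fetch only after the previous one finished.
  def fetchChunk(i: Int): Future[Array[Byte]] =
    Future(Array.fill(4)(i.toByte)) // placeholder for an actual store read
  val allChunks: Future[Seq[Array[Byte]]] =
    Future.sequence((0 until 8).map(fetchChunk))
}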

fm3 (Member) left a comment

Cool stuff! Nice handling of the sharding metadata!
I added a question about unifying the sharded vs. non-sharded reading code; otherwise this already looks very good :)

Comment on lines 131 to 133
for {
bytes <- Fox.option2Fox(shardPath.readBytes(Some(shardIndexRange)))
} yield bytes

Suggested change
for {
bytes <- Fox.option2Fox(shardPath.readBytes(Some(shardIndexRange)))
} yield bytes
Fox.option2Fox(shardPath.readBytes(Some(shardIndexRange)))

should be identical
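(For context: a single-generator for-comprehension such as for { x <- e } yield x desugars to e.map(x => x), so dropping the wrapper is behaviorally identical.)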

Comment on lines 98 to 109
if (header.isSharded) {
for {
chunkData: Array[Byte] <- readShardedChunk(chunkIndex)
chunkShape = header.chunkSizeAtIndex(chunkIndex)
multiArray <- chunkReader.parseChunk(chunkData, chunkShape)
} yield multiArray
} else {
val chunkFilename = getChunkFilename(chunkIndex)
val chunkFilePath = relativePath.resolve(chunkFilename)
val storeKey = chunkFilePath.storeKey
val chunkShape = header.chunkSizeAtIndex(chunkIndex)
chunkReader.read(storeKey, chunkShape)

I’m wondering whether we can merge these two branches into one. As I understand it, both branches currently have their own version of reading from the store, decompressing, and then typing. Could that be unified? Maybe the sharding implementation could just return the chunk path plus a byte range to be passed to the existing chunkReader, with the non-sharded case returning None for the range?
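
For illustration, a rough sketch of what such a unification could look like; the names ChunkLocation and resolveChunk, as well as the extended chunkReader.read signature, are hypothetical, not the existing webKnossos API:

// Hypothetical sketch of a unified reading path.
case class ChunkLocation(path: String, byteRange: Option[Range])

def resolveChunk(chunkIndex: Array[Int], sharded: Boolean): ChunkLocation =
  if (sharded) {
    // Sharded: look up the (mini)shard index and return the shard file
    // plus the byte range of the requested chunk within it.
    ChunkLocation("shards/0.shard", Some(1024 until 4096))
  } else {
    // Non-sharded: one file per chunk, no byte range needed.
    ChunkLocation("chunks/" + chunkIndex.mkString("_"), None)
  }

// A single reader could then fetch, decompress, and type the bytes in one place,
// issuing a ranged read only when a byte range is present:
//   chunkReader.read(location.path, chunkShape, location.byteRange)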

@frcroth frcroth marked this pull request as ready for review March 27, 2023 08:21
@frcroth frcroth requested a review from fm3 March 27, 2023 08:21
fm3 (Member) left a comment

Works for me :) Thanks for addressing the feedback

@frcroth frcroth merged commit d46eb37 into master Mar 27, 2023
@frcroth frcroth deleted the sharding branch March 27, 2023 09:49
hotzenklotz added a commit that referenced this pull request Apr 3, 2023
…come-toast

* 'master' of github.com:scalableminds/webknossos:
  Log all details on deleting annotation layer (#6950)
  fix typo
  Rename demo instance to wkorg instance (#6941)
  Add LOD mesh support for frontend (#6909)
  Fix layout of view mode switcher and move it (#6949)
  VaultPath no longer extends nio.Path (#6942)
  Release 23.04.0 (#6945)
  Use new zip.js version to allow zip64 uploads (#6939)
  Implement viewing sharded neuroglancer precomputed datasets (#6920)
  Reject dataset uploads if organization storage quota is exceeded (#6893)
  Refactor deprecated antd Dropdown menus (#6898)
hotzenklotz added a commit that referenced this pull request Apr 4, 2023
…wings

* 'master' of github.com:scalableminds/webknossos:
  updates docs for docker installation (#6963)
  Fix misc stuff when viewing tasks/annotations of another user (#6957)
  Remove segment from list and add undo/redo for segments (#6944)
  Log all details on deleting annotation layer (#6950)
  fix typo
  Rename demo instance to wkorg instance (#6941)
  Add LOD mesh support for frontend (#6909)
  Fix layout of view mode switcher and move it (#6949)
  VaultPath no longer extends nio.Path (#6942)
  Release 23.04.0 (#6945)
  Use new zip.js version to allow zip64 uploads (#6939)
  Implement viewing sharded neuroglancer precomputed datasets (#6920)
  Reject dataset uploads if organization storage quota is exceeded (#6893)
  Refactor deprecated antd Dropdown menus (#6898)
Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues: Support sharding for remote datasets
2 participants