
Script to merge volume tracing into on-disk segmentation #3431

Merged
merged 6 commits into from
Nov 12, 2018

Conversation


@fm3 fm3 commented Nov 5, 2018

Steps to test:

  • create a back up of a dataset with a segmentation layer (mag1)
  • volumetrace some, download as zip
  • run python3 main.py my-volumetracing.zip ../../binaryData/Connectomics_Department/ROI2017_wkw_edit_segmentation/segmentation/1/
  • confirm prompt
  • the output for me looks like:
Found 4 tracing files, which will affect 2 segmentation files
Reading segmentation file 1 of 2, bounding box: ([0, 0, 0], [1024, 1024, 1024])...
  Overwriting tracing buckets in memory...
    Overwriting ([512, 480, 480], [32, 32, 32])
    Overwriting ([512, 512, 480], [32, 32, 32])
    Overwriting ([512, 512, 512], [32, 32, 32])
  Writing segmentation file back to disk...
Reading segmentation file 2 of 2, bounding box: ([1024, 0, 0], [1024, 1024, 1024])...
  Overwriting tracing buckets in memory...
    Overwriting ([1024, 512, 480], [32, 32, 32])
  Writing segmentation file back to disk...
Done.

Issues:


  • Ready for review

@fm3 fm3 self-assigned this Nov 5, 2018
@fm3 fm3 added the backend label Nov 6, 2018
@fm3 fm3 changed the title [WIP] Script to merge volume tracing into on-disk segmentation Script to merge volume tracing into on-disk segmentation Nov 6, 2018

fm3 commented Nov 6, 2018

Open questions:

  • should we support volume tracings with other magnifications?
  • should we downsample the new segmentation layers for other magnifications?

tracing_tmpdir_path = extract_tracing_zip(args)

tracing_dataset = wkw.Dataset.open(os.path.join(tracing_tmpdir_path, '1'))
segmentation_dataset = wkw.Dataset.open(os.path.join(args.segmentation_path))
Contributor:

Maybe append 1 to the segmentation path here as well (and don't require the user to provide it). This prevents users from accidentally merging data from different zoom levels.
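A minimal sketch of that suggestion (the helper name is hypothetical; the real script may structure this differently):

```python
import os

# Hypothetical helper: derive the mag-1 subdirectory from the dataset path
# so the user cannot accidentally point at a different zoom level.
def mag1_path(segmentation_path):
    segmentation_path = segmentation_path.rstrip(os.sep)
    # Tolerate a path that already ends in the mag-1 directory.
    if os.path.basename(segmentation_path) == '1':
        return segmentation_path
    return os.path.join(segmentation_path, '1')
```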

tracing_dataset = wkw.Dataset.open(os.path.join(tracing_tmpdir_path, '1'))
segmentation_dataset = wkw.Dataset.open(os.path.join(args.segmentation_path))

assert(tracing_dataset.header.num_channels == segmentation_dataset.header.num_channels)
Contributor:

also assert data type?
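Sketched as a small guard (attribute names follow the wkw Python header as I understand it; treat `voxel_type` as an assumption):

```python
# Guard sketch: refuse to merge layers whose headers are incompatible.
# `voxel_type` is assumed to be the wkw header field holding the data type.
def assert_compatible(tracing_header, segmentation_header):
    assert tracing_header.num_channels == segmentation_header.num_channels, \
        "channel count mismatch between tracing and segmentation"
    assert tracing_header.voxel_type == segmentation_header.voxel_type, \
        "voxel data type mismatch between tracing and segmentation"
```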


print(" Writing segmentation file back to disk...")
segmentation_dataset.write([0, 0, 0], data)
count = count + 1
Contributor:

topleft = list(map(lambda x: x % segmentation_file_len_voxels, tracing_bbox[0]))
shape = tracing_bbox[1]
bottomright = list( map(add, topleft, shape) )
# print("  Broadcasting to 0:1 {}:{}, {}:{}, {}:{}".format(topleft[0], bottomright[0], topleft[1], bottomright[1], topleft[2], bottomright[2]))
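The index arithmetic in this hunk can be sketched with NumPy (sizes shrunk for illustration; the real buckets are 32³ inside 1024-voxel wkw files, and the channel-first layout is an assumption):

```python
import numpy as np

# Toy sizes: a 64-voxel "file" and a 32^3 bucket (real values: 1024 and 32).
segmentation_file_len_voxels = 64
data = np.zeros((1, 64, 64, 64), dtype=np.uint32)   # channel-first layout
tracing_bbox = ([96, 32, 0], [32, 32, 32])          # global topleft, shape
bucket = np.ones((1, 32, 32, 32), dtype=np.uint32)

# Convert the global bucket position into file-local coordinates, then
# overwrite the matching sub-cube of the segmentation data.
topleft = [x % segmentation_file_len_voxels for x in tracing_bbox[0]]
bottomright = [t + s for t, s in zip(topleft, tracing_bbox[1])]
data[:, topleft[0]:bottomright[0],
        topleft[1]:bottomright[1],
        topleft[2]:bottomright[2]] = bucket
```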
Contributor:

remove?

else:
    extract_data_zip(args.tracing_path)

tracing_tmpdir_path = 'tmp-67X8KZUFP0'
Contributor:

Generate this randomly? Otherwise it might cause problems when merging multiple tracings at the same time.
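A sketch of the suggestion using the standard library (the prefix name is made up):

```python
import tempfile

# Let the OS pick a unique, collision-free temp directory instead of the
# hard-coded 'tmp-67X8KZUFP0', so concurrent merges cannot clash.
tracing_tmpdir_path = tempfile.mkdtemp(prefix='merge-volume-tracing-')
```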

Contributor:

Ok, I see you have tmp_filename but don't use it here ;)

grouped = {}
for tracing_bbox in tracing_bboxes:
    segmentation_bbox = matching_segmentation_bbox(segmentation_bboxes, tracing_bbox)
    str(segmentation_bbox)
Contributor:

unused. remove?
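Presumably `str(segmentation_bbox)` was meant as a dictionary key; a hedged sketch of that grouping (helper and variable names are hypothetical):

```python
# Group tracing buckets by the segmentation file (bounding box) that
# contains them, keyed by the bbox's string representation.
def group_by_segmentation_file(pairs):
    grouped = {}
    for segmentation_bbox, tracing_bbox in pairs:
        grouped.setdefault(str(segmentation_bbox), []).append(tracing_bbox)
    return grouped
```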

@jfrohnhofen

Do we delete the temp folder after we are done?


jfrohnhofen commented Nov 9, 2018

@fm3 When I test this locally I get:

Traceback (most recent call last):
  File "main.py", line 129, in <module>
    main()
  File "main.py", line 18, in main
    tracing_tmpdir_path = extract_tracing_zip(args)
  File "main.py", line 72, in extract_tracing_zip
    extract_data_zip(outfile_path, tracing_tmpdir_path)
  File "main.py", line 81, in extract_data_zip
    with zipfile.ZipFile(path) as file:
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1200, in __init__
    self._RealGetContents()
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1267, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

The script creates a temp folder, though, that contains a data.zip file, and macOS is perfectly happy to unpack this zip file.
Any ideas? Maybe a macOS issue? Should I investigate?
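For what it's worth, a quick way to narrow this down is to inspect the file that zipfile rejects: zipfile looks for the zip magic bytes, so checking them shows whether the data.zip on disk is structurally a zip at all (a diagnostic sketch, not the eventual fix):

```python
import zipfile

def diagnose_zip(path):
    # A real zip entry starts with the local-file-header magic PK\x03\x04;
    # zipfile.is_zipfile() additionally looks for the end-of-central-directory
    # record whose absence triggers the BadZipFile error above.
    with open(path, 'rb') as f:
        magic = f.read(4)
    return magic, zipfile.is_zipfile(path)
```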


fm3 commented Nov 9, 2018

I incorporated your feedback, thanks! :)
If you have any idea what's up with the zipfile on Mac, it would be appreciated.


jfrohnhofen commented Nov 9, 2018

I committed a fix for macOS. Please have a look to make sure it does not break anything on your side.
How long does merging a 1024^3 cube take for you?
It took 2:30 minutes for me, which I think is surprisingly long.
Most of the time is consumed by writing the new wkw file, though, so maybe a problem in the library?
Otherwise looking good!


fm3 commented Nov 9, 2018

The timing is the same for me; I suppose that's writing the compressed data?
I'll test your fix next week (I will have no internet over the weekend).

@jfrohnhofen

Ok, the timing is still weird. Reading the file should be faster than writing, but not 100 times faster!? Anyhow, this is not important for this PR, but depending on user feedback we could/should investigate a bit more.

@fm3 fm3 merged commit d173cc5 into master Nov 12, 2018
@fm3 fm3 deleted the merge-volume-tracing-on-disk branch November 12, 2018 13:30
jfrohnhofen added a commit that referenced this pull request Nov 22, 2018
* master:
  Fix rgb support (#3455)
  Fix docker uid/gid + binaryData permissions. Persist postgres db (#3428)
  Script to merge volume tracing into on-disk segmentation (#3431)
  Hotfix for editing TaskTypes (#3451)
  fix keyboardjs module (#3450)
  Fix guessed dataset boundingbox for non-zero-aligned datasets (#3437)
  voxeliterator now checks if the passed map has elements (#3405)
  integrate .importjs (#3436)
  Re-write logic for selecting zoom level and support non-uniform buckets per dimension (#3398)
  fixing selecting bug and improving style of layout dropdown (#3443)
  refresh screenshots (#3445)
  Reduce the free space between viewports in tracing (#3333)
  Scala linter and formatter (#3357)
  ignore reported datasets of non-existent organization (#3438)
  Only provide shortcut for tree search and not for comment search (#3407)
  Update Datastore+Tracingstore Standalone Deployment Templates (#3424)
  In yarn refresh-schema, also invalidate Tables.scala (#3430)
  Remove BaseDirService that watched binaryData symlinks (#3416)
  Ensure that resolutions array is dense (#3406)
  Fix bucket-collection related rendering bug (#3409)
Successfully merging this pull request may close these issues.

Script to merge Volume Tracings into Segmentation Layer on disk