
Script to merge volume tracing into on-disk segmentation #3431

Merged
merged 6 commits into from
Nov 12, 2018

Conversation


@fm3 fm3 commented Nov 5, 2018

Steps to test:

  • create a back up of a dataset with a segmentation layer (mag1)
  • volumetrace some, download as zip
  • run python3 main.py my-volumetracing.zip ../../binaryData/Connectomics_Department/ROI2017_wkw_edit_segmentation/segmentation/1/
  • confirm prompt
  • the output for me looks like:
Found 4 tracing files, which will affect 2 segmentation files
Reading segmentation file 1 of 2, bounding box: ([0, 0, 0], [1024, 1024, 1024])...
  Overwriting tracing buckets in memory...
    Overwriting ([512, 480, 480], [32, 32, 32])
    Overwriting ([512, 512, 480], [32, 32, 32])
    Overwriting ([512, 512, 512], [32, 32, 32])
  Writing segmentation file back to disk...
Reading segmentation file 2 of 2, bounding box: ([1024, 0, 0], [1024, 1024, 1024])...
  Overwriting tracing buckets in memory...
    Overwriting ([1024, 512, 480], [32, 32, 32])
  Writing segmentation file back to disk...
Done.

Issues:


  • Ready for review

@fm3 fm3 self-assigned this Nov 5, 2018
@fm3 fm3 added the backend label Nov 6, 2018
@fm3 fm3 changed the title [WIP] Script to merge volume tracing into on-disk segmentation Script to merge volume tracing into on-disk segmentation Nov 6, 2018

fm3 commented Nov 6, 2018

Open questions:

  • should we support volume tracings with other magnifications?
  • should we downsample the new segmentation layers for other magnifications?

tracing_tmpdir_path = extract_tracing_zip(args)

tracing_dataset = wkw.Dataset.open(os.path.join(tracing_tmpdir_path, '1'))
segmentation_dataset = wkw.Dataset.open(os.path.join(args.segmentation_path))
Contributor:

Maybe append 1 to the segmentation path here as well (and don't require the user to provide it). This prevents users from accidentally merging data from different zoom levels.
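A minimal sketch of that suggestion (the helper name is hypothetical; the real script may structure this differently):

```python
import os

# Hypothetical helper: derive the mag-1 subdirectory from the dataset path
# so the user cannot accidentally point at a different zoom level.
def mag1_path(segmentation_path):
    segmentation_path = segmentation_path.rstrip(os.sep)
    # Tolerate a path that already ends in the mag-1 directory.
    if os.path.basename(segmentation_path) == '1':
        return segmentation_path
    return os.path.join(segmentation_path, '1')
```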

tracing_dataset = wkw.Dataset.open(os.path.join(tracing_tmpdir_path, '1'))
segmentation_dataset = wkw.Dataset.open(os.path.join(args.segmentation_path))

assert(tracing_dataset.header.num_channels == segmentation_dataset.header.num_channels)
Contributor:

also assert data type?
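Sketched as a small guard (attribute names follow the wkw Python header as I understand it; treat `voxel_type` as an assumption):

```python
# Guard sketch: refuse to merge layers whose headers are incompatible.
# `voxel_type` is assumed to be the wkw header field holding the data type.
def assert_compatible(tracing_header, segmentation_header):
    assert tracing_header.num_channels == segmentation_header.num_channels, \
        "channel count mismatch between tracing and segmentation"
    assert tracing_header.voxel_type == segmentation_header.voxel_type, \
        "voxel data type mismatch between tracing and segmentation"
```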


print(" Writing segmentation file back to disk...")
segmentation_dataset.write([0, 0, 0], data)
count = count + 1
Contributor:

topleft = list(map(lambda x: x % segmentation_file_len_voxels, tracing_bbox[0]))
shape = tracing_bbox[1]
bottomright = list( map(add, topleft, shape) )
# print("  Broadcasting to 0:1 {}:{}, {}:{}, {}:{}".format(topleft[0], bottomright[0], topleft[1], bottomright[1], topleft[2], bottomright[2]))
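The index arithmetic in this hunk can be sketched with NumPy (sizes shrunk for illustration; the real buckets are 32³ inside 1024-voxel wkw files, and the channel-first layout is an assumption):

```python
import numpy as np

# Toy sizes: a 64-voxel "file" and a 32^3 bucket (real values: 1024 and 32).
segmentation_file_len_voxels = 64
data = np.zeros((1, 64, 64, 64), dtype=np.uint32)   # channel-first layout
tracing_bbox = ([96, 32, 0], [32, 32, 32])          # global topleft, shape
bucket = np.ones((1, 32, 32, 32), dtype=np.uint32)

# Convert the global bucket position into file-local coordinates, then
# overwrite the matching sub-cube of the segmentation data.
topleft = [x % segmentation_file_len_voxels for x in tracing_bbox[0]]
bottomright = [t + s for t, s in zip(topleft, tracing_bbox[1])]
data[:, topleft[0]:bottomright[0],
        topleft[1]:bottomright[1],
        topleft[2]:bottomright[2]] = bucket
```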
Contributor:

remove?

else:
    extract_data_zip(args.tracing_path)

tracing_tmpdir_path = 'tmp-67X8KZUFP0'
Contributor:

Generate this randomly? Otherwise it might cause problems when merging multiple tracings at the same time.
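A sketch of the suggestion using the standard library (the prefix name is made up):

```python
import tempfile

# Let the OS pick a unique, collision-free temp directory instead of the
# hard-coded 'tmp-67X8KZUFP0', so concurrent merges cannot clash.
tracing_tmpdir_path = tempfile.mkdtemp(prefix='merge-volume-tracing-')
```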

Contributor:

Ok, I see you have tmp_filename but don't use it here ;)

grouped = {}
for tracing_bbox in tracing_bboxes:
    segmentation_bbox = matching_segmentation_bbox(segmentation_bboxes, tracing_bbox)
    str(segmentation_bbox)
Contributor:

unused. remove?
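Presumably `str(segmentation_bbox)` was meant as a dictionary key; a hedged sketch of that grouping (helper and variable names are hypothetical):

```python
# Group tracing buckets by the segmentation file (bounding box) that
# contains them, keyed by the bbox's string representation.
def group_by_segmentation_file(pairs):
    grouped = {}
    for segmentation_bbox, tracing_bbox in pairs:
        grouped.setdefault(str(segmentation_bbox), []).append(tracing_bbox)
    return grouped
```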

@jfrohnhofen

Do we delete the temp folder after we are done?


jfrohnhofen commented Nov 9, 2018

@fm3 When I test this locally I get:

Traceback (most recent call last):
  File "main.py", line 129, in <module>
    main()
  File "main.py", line 18, in main
    tracing_tmpdir_path = extract_tracing_zip(args)
  File "main.py", line 72, in extract_tracing_zip
    extract_data_zip(outfile_path, tracing_tmpdir_path)
  File "main.py", line 81, in extract_data_zip
    with zipfile.ZipFile(path) as file:
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1200, in __init__
    self._RealGetContents()
  File "/usr/local/Cellar/python/3.7.0/Frameworks/Python.framework/Versions/3.7/lib/python3.7/zipfile.py", line 1267, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

The script creates a temp folder, though, that contains a data.zip file, and macOS is perfectly happy to unpack this zip file.
Any ideas? Maybe a macOS issue? Should I investigate?
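For what it's worth, a quick way to narrow this down is to inspect the file that zipfile rejects: zipfile looks for the zip magic bytes, so checking them shows whether the data.zip on disk is structurally a zip at all (a diagnostic sketch, not the eventual fix):

```python
import zipfile

def diagnose_zip(path):
    # A real zip entry starts with the local-file-header magic PK\x03\x04;
    # zipfile.is_zipfile() additionally looks for the end-of-central-directory
    # record whose absence triggers the BadZipFile error above.
    with open(path, 'rb') as f:
        magic = f.read(4)
    return magic, zipfile.is_zipfile(path)
```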


fm3 commented Nov 9, 2018

I incorporated your feedback, thanks! :)
If you have any idea what's up with the zipfile on Mac, it would be appreciated.


jfrohnhofen commented Nov 9, 2018

I committed a fix for macOS. Please have a look to make sure it does not break anything on your side.
How long does merging a 1024^3 cube take for you?
It took 2:30 minutes for me, which I think is surprisingly long.
Most of the time is consumed by writing the new wkw file, though, so maybe a problem in the library?
Otherwise looking good!


fm3 commented Nov 9, 2018

The timing is the same for me; I suppose that's writing the compressed data?
I'll test your fix next week (I will have no internet over the weekend).

@jfrohnhofen

Ok, the timing is still weird. Reading the file should be faster than writing, but not 100 times faster!? Anyhow, this is not important for this PR, but depending on user feedback we could/should investigate a bit more.

@fm3 fm3 merged commit d173cc5 into master Nov 12, 2018
@fm3 fm3 deleted the merge-volume-tracing-on-disk branch November 12, 2018 13:30
jfrohnhofen added a commit that referenced this pull request Nov 22, 2018
* master:
  Fix rgb support (#3455)
  Fix docker uid/gid + binaryData permissions. Persist postgres db (#3428)
  Script to merge volume tracing into on-disk segmentation (#3431)
  Hotfix for editing TaskTypes (#3451)
  fix keyboardjs module (#3450)
  Fix guessed dataset boundingbox for non-zero-aligned datasets (#3437)
  voxeliterator now checks if the passed map has elements (#3405)
  integrate .importjs (#3436)
  Re-write logic for selecting zoom level and support non-uniform buckets per dimension (#3398)
  fixing selecting bug and improving style of layout dropdown (#3443)
  refresh screenshots (#3445)
  Reduce the free space between viewports in tracing (#3333)
  Scala linter and formatter (#3357)
  ignore reported datasets of non-existent organization (#3438)
  Only provide shortcut for tree search and not for comment search (#3407)
  Update Datastore+Tracingstore Standalone Deployment Templates (#3424)
  In yarn refresh-schema, also invalidate Tables.scala (#3430)
  Remove BaseDirService that watched binaryData symlinks (#3416)
  Ensure that resolutions array is dense (#3406)
  Fix bucket-collection related rendering bug (#3409)
Successfully merging this pull request may close these issues.

Script to merge Volume Tracings into Segmentation Layer on disk