
Add preliminary EyefulTower dataset support #2672

Merged
merged 25 commits into nerfstudio-project:main on Jan 20, 2024

Conversation

@VasuAgrawal (Contributor)

Summary

This PR adds preliminary support for the EyefulTower dataset. Specifically, it adds the following main functionality:

  • Ability to download part or all of the dataset via the ns-download-data script, with a selectable capture (1 of 11) and resolution (1 of 4).
  • Automatic generation of a nerfstudio-compatible transforms.json upon download, allowing each dataset to be loaded into memory.

Usage

You can download a single EyefulTower capture and resolution with the basic download command. This will grab the riverview scene at 2K JPEG resolution.

$ ns-download-data eyefultower

You can use the flags to download more of the dataset, or just download it all.

$ ns-download-data eyefultower --capture-name apartment # Download the apartment scene with the default 2k JPEG resolution
$ ns-download-data eyefultower --capture-name riverview table kitchen --resolution-name jpeg_2k exr_2k # Download multiple datasets and resolutions at the same time
$ ns-download-data eyefultower --resolution-name all --capture-name all # Download everything

Data is downloaded using the aws s3 sync command, as suggested in the EyefulTower documentation, which means redundant data will not be redownloaded. This makes it easy to start small with the dataset (e.g. running only ns-download-data eyefultower) and eventually work your way through the commands until you've downloaded everything.

Training

I've tested training on 3 scenes (riverview, office_view2, office1b), all using this command and parameters with the dataset swapped out:

ns-train nerfacto --data .\data\eyefultower\{dataset}\images-jpeg-2k\ \
    --pipeline.model.max-res 19912 \
    --pipeline.model.log2-hashmap-size 22 \
    --pipeline.model.far-plane 100 \
    --pipeline.datamanager.train-num-rays-per-batch 12800 \
    --max-num-iterations 100000 \
    --steps-per-eval-all-images 100000

All 3 datasets seemed to finish training and offer good results. See results section below.

Comments / questions / future work

Metashape: Originally, I tried using the ns-process-data script to convert from the cameras.xml files provided with Eyeful into something that nerfstudio understands. This doesn't work as-is though, since the Eyeful datasets use rig constraints, which the metashape processing in process_data.py doesn't understand. I made some progress towards implementing this and generated updated cameras.xml files for each of the downscaled datasets (which the original EyefulTower doesn't provide), but hit an issue when generating the transforms (since the exact tree isn't documented by metashape), and decided to just convert the existing cameras.json like I did. Adding support for rigs would be a great future addition.

EXR: I've generated the transforms.json file for the EXR images, but I haven't tried using them with any of the nerfstudio pipelines as I don't believe there's currently HDR training support. Assuming that's correct, I'm hoping that offering this dataset integration is a good first step towards adding HDR support in nerfstudio.

Fisheye: 5 of the 11 datasets from EyefulTower are taken with the V1 rig, which uses fisheye cameras. I've marked them as OPENCV_FISHEYE in the transforms.json, but I haven't tested those datasets as I'm not sure how good the fisheye support is. I see that some support was just merged in this PR a few hours ago. Perhaps fisheye support for EyefulTower can be a future PR?

Larger datasets: 6 of the 11 datasets from EyefulTower are taken with the V2 rig, and should in theory work fine with this code on a sufficiently powerful machine. On my workstation, though, I ran out of RAM when trying to load one of the larger scenes (apartment), so I stuck to the 3 smaller V2 scenes (riverview, office_view2, office1b). I believe the default dataloader tries to load the entire dataset into memory at once. Are there other dataloaders that can load the data in batches, rather than loading it all at once?
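To sketch what I mean by loading in batches rather than all at once, something like the following would keep only file paths resident and read each image on access. This is purely illustrative (none of these names are nerfstudio API), just to show the kind of dataloader I'm asking about:

```python
from pathlib import Path


class LazyImageDataset:
    """Hypothetical sketch of a decode-on-demand dataset: hold only file
    paths in memory and read each image from disk on access, instead of
    preloading the entire dataset like the default loader seems to."""

    def __init__(self, image_dir: Path, extension: str = ".jpg"):
        # Only the (small) list of paths lives in memory up front.
        self.paths = sorted(image_dir.glob(f"*{extension}"))

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int) -> bytes:
        # Read on demand; a real loader would decode to a tensor and could
        # add an LRU cache to bound memory usage.
        return self.paths[index].read_bytes()
```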

Automatic downscaling: The ns-process script automatically generates downsampled images. The method I took, directly generating the transforms.json, does not do that. This seems to work fine, at least for the smaller scenes as described above. I'm not sure exactly where the downsampled images are used, but I did notice that there's a line in the training output which says:

Auto image downscale factor of 1                                                 nerfstudio_dataparser.py:349

Does this indicate that there's automatic downscaling that could be applied at runtime, to perhaps scale down the larger datasets (e.g. apartment) into something smaller, e.g. 1k or 512, such that the entire (downscaled) dataset could fit into memory? This question assumes there's no batched dataloader we could use instead.
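For context, I'd guess the logic behind that "Auto image downscale factor" message looks something like this sketch: pick the smallest power-of-two factor that brings the longest image side under some target resolution. The max_res constant here is an assumption, not necessarily nerfstudio's actual value:

```python
def auto_downscale_factor(width: int, height: int, max_res: int = 1600) -> int:
    """Sketch (assumed, not nerfstudio's actual implementation) of picking
    the smallest power-of-two downscale factor such that the longest image
    side fits under max_res pixels."""
    factor = 1
    while max(width, height) // factor > max_res:
        factor *= 2
    return factor
```

So a 2K-wide image would get factor 2 under these assumed constants, while anything already at or below the target keeps factor 1 (matching the "downscale factor of 1" line in the log above, if the threshold is high enough).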

Splits: The EyefulTower dataset provides a splits.json file indicating how data should be partitioned into a train and test split. Currently, I don't use this file, as I'm not sure what format nerfstudio expects the splits in, or if there's support at all. If someone could point me in the right direction, I can try to add support. Right now, all the cameras, both train and test, are dumped into the transforms.json.
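If the dataparser does accept per-split filename lists, I'd imagine the conversion looking roughly like this. Both the "train_filenames"/"val_filenames" keys and the splits.json layout ({"train": [...], "test": [...]}) are assumptions here, not verified against the actual formats:

```python
import json
from pathlib import Path


def apply_eyeful_splits(dataset_dir: Path) -> None:
    """Hypothetical sketch: merge EyefulTower's splits.json into the
    generated transforms.json, mapping the test split to val. Key names
    and file layouts are assumptions, not confirmed schemas."""
    splits = json.loads((dataset_dir / "splits.json").read_text())
    transforms = json.loads((dataset_dir / "transforms.json").read_text())
    transforms["train_filenames"] = splits["train"]
    transforms["val_filenames"] = splits["test"]  # use the test split as val
    (dataset_dir / "transforms.json").write_text(json.dumps(transforms, indent=2))
```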

Subsampled data: In a similar vein to the splits file, as a way to try to reduce the amount of data that's loaded, I also generate a transforms_300.json and transforms_half.json, which (as you might've guessed) contain 300 images and half the dataset, respectively. I couldn't find a flag that would let me use these transforms_*.json files rather than the original transforms.json file in the generated datasets, but I'd love to know if one exists.
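The subsampling itself is simple; the gist of generating something like transforms_300.json is just an evenly spaced subset of the frames list (the function name and exact selection strategy here are mine, not necessarily what the PR does):

```python
import json
from pathlib import Path


def write_subsampled_transforms(dataset_dir: Path, count: int) -> Path:
    """Illustrative sketch: write a transforms_{count}.json containing an
    evenly spaced subset of the frames from transforms.json."""
    transforms = json.loads((dataset_dir / "transforms.json").read_text())
    frames = transforms["frames"]
    step = max(1, len(frames) // count)  # even spacing across the capture
    transforms["frames"] = frames[::step][:count]
    out = dataset_dir / f"transforms_{count}.json"
    out.write_text(json.dumps(transforms, indent=2))
    return out
```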

Results

[Results: three rendered example images]

@VasuAgrawal (Contributor, Author)

Also - I developed this on Windows. I had to add the change to disable pycolmap in order to run pip install -e .[dev] on Windows, but that doesn't seem to have broken any functionality as it was already disabled in the base environment.

I still wasn't able to run ns-dev-test though, as it seems to require running a shell script (.sh) which doesn't work in Windows.

(nerfstudio12) PS C:\Users\vasuagrawal\Documents\nerfstudio> ns-dev-test
Skipping actions/checkout@v3
Skipping Set up Python 3.8.13
Skipping actions/cache@v2
Skipping Install dependencies

───────────────────────────── Running: ./nerfstudio/scripts/licensing/license_headers.sh ; ─────────────────────────────
'.' is not recognized as an internal or external command,
operable program or batch file.
Error: `./nerfstudio/scripts/licensing/license_headers.sh ;` failed.
(nerfstudio12) PS C:\Users\vasuagrawal\Documents\nerfstudio>

@hturki (Contributor) commented Jan 9, 2024

@VasuAgrawal re: splits I think this is what you want: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L211

IMO it would be nice if the nerfstudio dataparser used dataset-provided splits by default before falling back to "fraction." Happy to make a PR if there are no objections.

@brentyi (Collaborator) commented Jan 10, 2024

IMO it would be nice if the nerfstudio dataparser used dataset-provided splits by default before falling back to "fraction." Happy to make a PR if there are no objections.

This seems like a good idea to me!

@ethanweber (Collaborator) commented Jan 10, 2024

Agreed with using the filename split as well. :) I think we can also include a default method with default command line arguments, maybe with the method name "nerfacto-eyeful-tower"? I also think it would be cleaner if we moved most of the code from download_data.py to a new file and only imported the necessary parts to connect it to the CLI. Cool work, excited to get this merged!

@hturki (Contributor) commented Jan 10, 2024

Turns out someone already thought of adding split support: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L195 - my bad!

@VasuAgrawal (Contributor, Author) commented Jan 11, 2024

Updates

Thanks for the help so far everyone! A couple updates:

  • I've added support for the official EyefulTower splits, using the train split as train (naturally) and the test set as val. I haven't added a separate test split (it's not clear what to use exactly), but that doesn't seem to be preventing training. An example from the command line indicating that it seems to be working:
$ ns-train nerfacto --data .\data\eyefultower\riverview\images-jpeg-2k\transforms_300.json --pipeline.model.max-res 19912 --pipeline.model.log2-hashmap-size 22 --pipeline.model.far-plane 100 --pipeline.datamanager.train-num-rays-per-batch 12800 --max-num-iterations 100000 --steps-per-eval-all-images 100000 --vis viewer+tensorboard

...

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
           Saving config to: outputs\images-jpeg-2k\nerfacto\2024-01-11_142548\config.yml       experiment_config.py:136
           Saving checkpoints to: outputs\images-jpeg-2k\nerfacto\2024-01-11_142548\nerfstudio_models     trainer.py:134
           Auto image downscale factor of 1                                                 nerfstudio_dataparser.py:389
            Dataset is overriding train_indices to [0, 1, 2, 3, ..., 94, 95,    nerfstudio_dataparser.py:204
           109, 110, ..., 298, 299]
[14:25:49]  Dataset is overriding val_indices to [96, 97, 98, 99, 100, 101,     nerfstudio_dataparser.py:204
           102, 103, 104, 105, 106, 107, 108]
Loading data batch ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:03
  • I rebased these commits on top of main. This fixes a merge issue introduced by PR #2742 (Add Mill 19 dataset), which also adds a new dataset.
  • I found that the projectaria_tools[all] dependency doesn't work on Windows, since there are no Windows artifacts available. This is a known issue (per Pierre Moulon, one of the authors), and is due to one of the dependencies not supporting Windows. For the time being, I've just disabled the dependency on Windows, so that the pip install -e .[dev] command still works. I also cleaned up the format since the authors mentioned that [all] wasn't required, and that it should be projectaria-tools, not projectaria_tools (though the latter also seems to work?).
  • It turns out that I can pass the direct path to a *.json file for the --data argument, rather than relying on transforms.json being implicitly found, thanks to this handy block of code. This means we can now pass any of the following for --data. The last two options are useful for testing with smaller datasets, either to help with iteration speed or just due to memory constraints (until we can start using a batched dataloader).
    • --data .\data\eyefultower\riverview\images-jpeg-2k
    • --data .\data\eyefultower\riverview\images-jpeg-2k\transforms.json
    • --data .\data\eyefultower\riverview\images-jpeg-2k\transforms_300.json
    • --data .\data\eyefultower\riverview\images-jpeg-2k\transforms_half.json
  • I tested the apartment scene (one of the largest) with 2K resolution and all the images. This took a while to load (10+ minutes? I left it unattended after a while ...), and ended up taking ~130 GB RAM and ~12 GB VRAM to train, but did successfully train for a while (hours) before I killed it. While my workstation has the capacity for this, I don't think most people's do, so I'm chatting with Christian (another author) to see if we can generate a 1K resolution dataset for EyefulTower as well, to help with iteration on lower-end machines. As-is though, assuming linear scaling of the memory requirements, we should be able to go up to 4K training on a DGX A100 machine (80 GB VRAM, ~48 GB required, and 2 TB RAM, ~520 GB required), but won't be able to get to 8K naively (~192 GB VRAM required, ~2080 GB RAM required).
  • It seems like the camera poses are being optimized by default, which might not be the right call here, as the default poses seem pretty decent (in our experience with VR-NeRF). I'll add this to the next training run: --pipeline.model.camera-optimizer.mode off.
  • I also got TensorBoard working, which is nice:

image
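As a sanity check on the memory estimates a few bullets up: assuming memory scales linearly with pixel count (so quadratically with resolution per side), the arithmetic works out as follows. This is back-of-the-envelope only:

```python
def scaled_memory_gb(base_gb: float, base_res_k: int, target_res_k: int) -> float:
    """Back-of-the-envelope scaling: doubling resolution per side quadruples
    pixel count, and memory is assumed proportional to pixel count."""
    return base_gb * (target_res_k / base_res_k) ** 2
```

Starting from the measured ~130 GB RAM / ~12 GB VRAM at 2K, this gives ~520 GB RAM / ~48 GB VRAM at 4K and ~2080 GB RAM / ~192 GB VRAM at 8K, matching the numbers above.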

awscli docs environment issue

Unfortunately, I've also discovered a small issue due to the new awscli dependency. Specifically, when trying to install the documentation environment with pip install -e .[docs] on Windows, there's now an irresolvable version mismatch between awscli and sphinx:

(nerfstudio12) PS C:\Users\vasuagrawal\Documents\nerfstudio> pip install -e .[docs]

...

ERROR: Cannot install None and awscli==1.31.10 because these package versions have conflicting dependencies.

The conflict is caused by:
    ipython 8.6.0 depends on colorama; sys_platform == "win32"
    sphinx 5.2.1 depends on colorama>=0.4.5; sys_platform == "win32"
    awscli 1.31.10 depends on colorama<0.4.5 and >=0.2.5
    [... the same ipython/sphinx/awscli colorama conflict repeats for every awscli version pip tried, 1.31.11 through 1.32.16 ...]

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

Fortunately(?) I think this will only be an issue on Windows, as sphinx only uses colorama on Windows. If documentation isn't being built on Windows, maybe this is okay?

I've looked at a few other solutions, but none are fantastic:

  • Switch to the unofficial awscliv2 package, which doesn't seem to have a colorama dependency. There's intentionally no official V2 package from AWS, so we're stuck with an unofficial one. Unfortunately, this package also hasn't been updated since 2022.
  • Build awscliv2 from source, which would resolve the colorama dependency version issue (as the maximum version has been bumped to 0.4.7 for V2), but this means every developer now needs a C++ toolchain (among other dependencies), which is ... not fun.
  • Try to get AWS to bump the colorama maximum version on awscliv1, but this seems unlikely to happen given the various open issues like this one. Attention seems entirely focused on V2 now.
  • Try to get Sphinx to lower the minimum required version of colorama. No specific rationale for the version bump was given in the upgrade commit, so maybe this is the most workable option?

I'd love to know what everyone else thinks!

@VasuAgrawal (Contributor, Author)

I'll take a crack at splitting the updates into a separate file and adding some documentation. I'm not sure about the dedicated command yet (i.e. ns-train nerfacto-eyeful-tower instead of ns-train nerfacto), since I think that sets a dangerous precedent for combinatorial explosion with Method × Dataset, but I agree it'd aid discoverability.

@VasuAgrawal (Contributor, Author)

Per @ethanweber's recommendation, I added a new method to method_configs.py with a copy of the nerfacto method, with a few updated settings as I've been pasting above.

@ethanweber (Collaborator)

Hey @VasuAgrawal, one thing we could do is make a super small addition to your other repo https://github.com/facebookresearch/EyefulTower, which would hook the custom nerfacto config as a method. If this is interesting to you, let me know. It would avoid the dangerous precedent with "Method X Dataset" as you said.

I think it would involve having two additional files like this...

eyefultower/nerfstudio_config.py
pyproject.toml
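If it helps, the pyproject.toml side of that might look something like the following sketch; the entry-point group and the module/variable names here are assumptions about how external methods register with nerfstudio, not a verified recipe:

```toml
# Hypothetical addition to the EyefulTower repo's pyproject.toml.
# The entry-point group name and module path are assumptions.
[project.entry-points."nerfstudio.method_configs"]
nerfacto-eyeful-tower = "eyefultower.nerfstudio_config:nerfacto_eyeful_tower"
```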

@VasuAgrawal (Contributor, Author)

Some more updates:

  • As suggested by @ethanweber, I've split the EyefulTower download into a separate file. This also required separating the DatasetDownload class into its own file to avoid a circular import.
  • It turns out there's already some support for fisheye images in nerfstudio, with fisheye sampling added in this method in the recent PR adding support for Aria data. Use of this method is gated by the fisheye_crop_radius flag, e.g.:
ns-train nerfacto-eyeful-tower --data .\data\eyefultower\table\images-jpeg-2k\ --pipeline.datamanager.pixel-sampler.fisheye-crop-radius 0.4

However, I didn't want users to have to manually check which datasets were fisheye and which ones weren't, and what an appropriate fisheye-crop-radius setting was for each dataset. I instead added this to the transforms.json, and updated a little bit of the plumbing to allow the setting to propagate all the way forward. I'm not super happy with this solution since it doesn't show up in the config output in the command line, and requires updating the _broadcast_dict_fields methods to not overwrite metadata, whose implications I don't understand.

            pixel_sampler=PixelSamplerConfig(
                _target=<class 'nerfstudio.data.pixel_samplers.PixelSampler'>,
                num_rays_per_batch=4096,
                keep_full_image=False,
                is_equirectangular=False,
                fisheye_crop_radius=None
            ),

This is roughly the extent of the fisheye support I want to add in this PR, and it may be too much already due to the aforementioned concerns. There are a few other things that would be really nice to have, e.g. specifying the crop radius in degrees instead of pixels (to transparently handle different zooms), allowing per-camera/per-frame radii, and ensuring the sampling doesn't produce out-of-bounds indices (with a sufficiently large radius, or a rectangular image like those in Eyeful, I think this happens right now, though the samples may just become redundant values that make the sampling non-uniform?). These things can be added in future PRs :).

  • I'm not sure why my VSCode wants to nuke the formatting on all the imports. I'll fix that.
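For illustration, the transforms.json plumbing described above might look roughly like this fragment. The key placement, value, and frame contents are all illustrative assumptions, not the PR's actual schema:

```json
{
  "camera_model": "OPENCV_FISHEYE",
  "fisheye_crop_radius": 1024.0,
  "frames": [
    {"file_path": "images/camera_00_frame_000.jpg", "transform_matrix": "..."}
  ]
}
```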

@VasuAgrawal (Contributor, Author)

Here are a couple of fisheye results, on seating_area and table. Things generally seem to train.

image
image

TensorBoard output also works with fisheye images, but I think there are some bugs in the pipeline somewhere, as the renderings around the periphery don't look anywhere near as good as those near the center. Not sure what's causing this right now. Also not sure how significant a problem this is, as the visualization in the viewer seems to be okay?

image
image

@VasuAgrawal (Contributor, Author) commented Jan 16, 2024

Installed ruff locally and fixed the import issues. Looks like that was just added in #2765, and I hadn't updated my pip environment yet to reflect the new changes.

@ethanweber (Collaborator) left a review

I did a pass with some comments. Looks close to being merged.

pyproject.toml Outdated
Comment on lines 95 to 96
# TODO(1480) enable when pycolmap windows wheels are available
# "pycolmap>=0.3.0", # NOTE: pycolmap==0.3.0 is not available on newer python versions
@ethanweber (Collaborator):

Let's put this in a different PR, as it's not so related to adding support for EyefulTower.

@VasuAgrawal (Contributor, Author):

Sure, I can pull it into a separate PR. I put it here since I was doing my development on Windows and this was blocking me from installing the environment, but now that it's done I can easily make a separate PR. Coming soon.

pyproject.toml Outdated
Comment on lines 101 to 103
# NOTE: Disabling projectaria-tools because it doesn't have prebuilt windows wheels
# Syntax comes from here: https://pip.pypa.io/en/stable/reference/requirement-specifiers/
"projectaria-tools>=1.3.1; sys_platform != 'win32'",
@ethanweber (Collaborator):

Let's not put this change in this PR.

@VasuAgrawal (Contributor, Author):

Same as above.

@@ -62,12 +62,12 @@ def list_images(data: Path) -> List[Path]:
"""Lists all supported images in a directory

Args:
data: Path to the directory of images.
data: Path to the directory of images. Nested folders are searched as well.
@ethanweber (Collaborator):

I don't think we should search nested folders by default. Can you make this turned off by default, and add a new parameter search_nested_folders: bool = False?

@VasuAgrawal (Contributor, Author):

Sure, can do.

(Resolved review threads on nerfstudio/scripts/downloads/download_data.py and nerfstudio/scripts/downloads/eyeful_tower.py)
Comment on lines 144 to 146
else:
# Don't broadcast the remaining fields
new_dict[k] = v
@ethanweber (Collaborator):

Why is this changed?

@VasuAgrawal (Contributor, Author):

This comes from the support for the fisheye mask radius parameter in the metadata dict that I added. I discussed it a bit earlier in the PR, quoted below:

I'm not super happy with this solution since it doesn't show up in the config output in the command line, and requires updating the _broadcast_dict_fields methods to not overwrite metadata, whose implications I don't understand.

In slightly more detail - it's because when creating the Cameras object (https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L292-L303), any populated metadata seems to get dropped due to the call through this function. Specifically, we end up calling this _broadcast_dict_fields method with dict_ = metadata, batch_shape = batch_shape, and then iterating through all the keys of the metadata. Since the one key that's currently set (https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L291) is not a torch.Tensor, TensorDataClass, or Dict, it silently falls through the if statement and gets dropped from the new output dictionary. The created Cameras object then gets a blank metadata dict instead of one containing fisheye_crop_radius (and any other metadata keys that might be added in the future).

@ethanweber (Collaborator):

Okay got it. I think this is fine and won't have downstream implications. I see there was already support for metadata and "fisheye_crop_radius", so why wasn't this needed before for the base_datamanager.py Cameras initialization?

@VasuAgrawal (Contributor, Author):

Prior to this, the metadata in Cameras was being overwritten to {}, which I think was a bug in the implementation. I think the reason it didn't get caught is because the value is (probably?) normally specified as a command line parameter, which takes a different path to populate the fisheye mask radius and doesn't exercise this code.

@ethanweber (Collaborator):

I also fixed this here: #2783 (which includes a test)

@ethanweber (Collaborator) left a review

Left a few more comments.

(Further review threads on nerfstudio/scripts/downloads/eyeful_tower.py, since resolved)
@VasuAgrawal (Contributor, Author)

I think I've addressed all the feedback. The other PRs will probably be in tomorrow-ish. Please take another look when you can and let me know if you'd like any additional changes.

@ethanweber (Collaborator) left a review

A few more comments, and I think we can get it merged asap. Thanks!

Returns:
Paths to images contained in the directory
"""
allowed_exts = [".jpg", ".jpeg", ".png", ".tif", ".tiff"] + ALLOWED_RAW_EXTS
image_paths = sorted([p for p in data.glob("[!.]*") if p.suffix.lower() in allowed_exts])
glob = "**/[!.]*" if recursive else "[!.]*"
@ethanweber (Collaborator):

NIT but can you rename this to glob_str in case someone imports glob in the future?

@VasuAgrawal (Contributor, Author):

Looks like you got it, thanks!


@ethanweber (Collaborator)

I tried it on my end, and I was able to download and train both a perspective and a fisheye dataset. Once you merge main, fix the merge conflict, and change glob to glob_str, it should be ready @VasuAgrawal.

@ethanweber (Collaborator)

I made these changes here https://github.com/nerfstudio-project/nerfstudio/tree/convert-cameras-json, but I didn't know how to push to your fork...

@VasuAgrawal (Contributor, Author)

Thanks @ethanweber ! I just pulled your changes into my fork and pushed.

I made a separate fork and did the pull request through there, since I didn't think I had write access to the nerfstudio repo directly. But if I do, and you'd prefer I make future commits directly on a nerfstudio branch for easier collaboration, I'm happy to do that.

@VasuAgrawal (Contributor, Author)

Just for fun, I did confirm that the default EyefulTower scene trains with nerfacto, nerfacto-big, and nerfacto-huge and produces reasonable results. There's definitely loads of room for improvement, but it's good enough that I'm not going to worry about updating the EyefulTower repo with a specific recommended config for now.

30K iterations nerfacto:
image

~45K iterations nerfacto-big (killed early):
image

~30K iterations nerfacto-huge (killed early):
image

@ethanweber (Collaborator)

Hi @VasuAgrawal, can you change one last thing? If you can remove this comment, that would be good. Someone must have beaten us to making this update.

image

@ethanweber (Collaborator)

Using a fork worked just fine! Thanks for merging my changes.

@brentyi brentyi enabled auto-merge (squash) January 20, 2024 00:11
@brentyi (Collaborator) left a review

lgtm too!

@brentyi brentyi merged commit 248e99e into nerfstudio-project:main Jan 20, 2024
4 checks passed
@VasuAgrawal VasuAgrawal deleted the convert-cameras-json branch January 20, 2024 21:54
ArpegorPSGH pushed a commit to ArpegorPSGH/nerfstudio that referenced this pull request Jun 22, 2024
* Add ability to download EyefulTower dataset

* wip before I copy linning's stuff in

* Generate per-resolution cameras.xml

* Generate transforms.json at download

* Fix a couple of quotes

* Use official EyefulTower splits for train and val

* Disable projectaria-tools on windows

* Fix extra imports

* Add a new nerfacto method tuned for EyefulTower

* Split eyefultower download into a separate file

* Add some fisheye support for eyeful data

* Reformatted imports to not be dumb

* Apparently this file was missed when formatting originally

* Added 1k resolution scenes

* revert method_configs.py to original values

* Also add 1k exrs

* Address feedback

* Revert changes to pyproject.toml, to be added in a later PR

* Oops, probably shouldn't have gotten rid of awscli ...

* changed glob variable name

* Update tensor_dataclass.py

---------

Co-authored-by: Ethan Weber <[email protected]>
Co-authored-by: Brent Yi <[email protected]>
4 participants