assert (self.num_tiles_hit > 0).any() #20

Open

pablovela5620 (Contributor) opened this issue Apr 18, 2024 · 0 comments
I'm hitting an obscure error that I'm not fully sure how to interpret. I'm using a self-made dataset captured with the phone's TrueDepth camera, and I keep running into this problem.

I'm making sure that each depth frame is saved as a uint16 .png in millimeters. If I run just the base splatfacto model, I don't hit this error. Here is a link to example data: https://huggingface.co/datasets/pablovela5620/sample-polycam-room/blob/main/facescan-nerfstudio.zip
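
For context, here is a minimal sketch of the kind of conversion I mean (not my actual export script; the input array, file name, and the cv2 call are just for illustration):

```python
import cv2
import numpy as np

def save_depth_mm_png(depth_m: np.ndarray, path: str) -> None:
    """Save a float depth map (meters) as a 16-bit PNG in millimeters."""
    depth_mm = np.clip(np.round(depth_m * 1000.0), 0, 65535).astype(np.uint16)
    cv2.imwrite(path, depth_mm)  # uint16 arrays are written out as 16-bit PNGs

# e.g. a depth frame from the TrueDepth camera, already a float32 array in meters
# save_depth_mm_png(depth_frame, "depth/frame_00000.png")
```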

[18:25:04] Caching / undistorting train images                                            full_images_datamanager.py:179
Printing profiling stats, from longest to shortest duration in seconds
Trainer.train_iteration: 2.2812              
VanillaPipeline.get_train_loss_dict: 2.2803              
Traceback (most recent call last):
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/bin/ns-train", line 8, in <module>
    sys.exit(entrypoint())
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/scripts/train.py", line 262, in entrypoint
    main(
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/scripts/train.py", line 247, in main
    launch(
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/scripts/train.py", line 189, in launch
    main_func(local_rank=0, world_size=world_size, config=config)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/scripts/train.py", line 100, in train_loop
    trainer.train()
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/engine/trainer.py", line 250, in train
    loss, loss_dict, metrics_dict = self.train_iteration(step)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/utils/profiler.py", line 112, in inner
    out = func(*args, **kwargs)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/engine/trainer.py", line 471, in train_iteration
    _, loss_dict, metrics_dict = self.pipeline.get_train_loss_dict(step=step)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/utils/profiler.py", line 112, in inner
    out = func(*args, **kwargs)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/pipelines/base_pipeline.py", line 301, in get_train_loss_dict
    model_outputs = self._model(ray_bundle)  # train distributed data parallel model if world_size > 1
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/pablo/0Dev/repos/dn-splatter/.pixi/envs/default/lib/python3.10/site-packages/nerfstudio/models/base_model.py", line 143, in forward
    return self.get_outputs(ray_bundle)
  File "/home/pablo/0Dev/repos/dn-splatter/dn_splatter/dn_model.py", line 543, in get_outputs
    assert (self.num_tiles_hit > 0).any()  # type: ignore
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

The command I ran:

ns-train dn-splatter \
    --max-num-iterations 5001 \
    --pipeline.model.use-depth-loss True \
    --pipeline.model.sensor-depth-lambda 0.2 \
    --pipeline.model.use-depth-smooth-loss True \
    --pipeline.model.use-normal-loss True \
    --pipeline.model.normal-supervision mono \
    normal-nerfstudio --data datasets/polycam/nerfstudio --normal-format opencv
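
As the traceback suggests, I can re-run with CUDA_LAUNCH_BLOCKING=1 to get a more accurate stack trace. In the meantime, here is a quick sanity check I can run over the exported depth maps (the depth/ folder name and paths are assumptions, just to confirm the PNGs are really uint16 and non-empty):

```python
from pathlib import Path

import cv2
import numpy as np

# Hypothetical location of the exported depth maps inside the dataset
depth_dir = Path("datasets/polycam/nerfstudio/depth")

for png in sorted(depth_dir.glob("*.png")):
    depth = cv2.imread(str(png), cv2.IMREAD_UNCHANGED)  # keep 16-bit data intact
    assert depth is not None, f"failed to read {png}"
    assert depth.dtype == np.uint16, f"{png} is {depth.dtype}, expected uint16"
    assert depth.max() > 0, f"{png} contains only zeros"
```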