bugfix: handle multiple camera types in a single batch #3415
Conversation
The original code does seem buggy to me. @cvachha, could you take a look at this PR fix?
Thanks for the fix! I think this code change looks good; however, I am unable to validate it on multi-camera input, and I am not familiar with the initial reason why that loop was written the way it was. I am requesting another reviewer to take a look.
@@ -778,15 +778,15 @@ def _compute_rays_for_vr180(

        return vr180_origins, directions_stack

    for cam in cam_types:
        if CameraType.PERSPECTIVE.value in cam_types:
As an alternative fix, would it make sense to:
- Remove the `for cam in cam_types` loop
- De-dent all of its contents
- Change all of the `if` statements to `elif`?

I went through this file's history, and it seems like that's how these conditions were originally set up:
nerfstudio/nerfstudio/cameras/cameras.py
Lines 646 to 681 in d413ca4
if CameraType.PERSPECTIVE.value in cam_types:
    mask = (self.camera_type[true_indices] == CameraType.PERSPECTIVE.value).squeeze(-1)  # (num_rays)
    mask = torch.stack([mask, mask, mask], dim=0)
    directions_stack[..., 0][mask] = torch.masked_select(coord_stack[..., 0], mask).float()
    directions_stack[..., 1][mask] = torch.masked_select(coord_stack[..., 1], mask).float()
    directions_stack[..., 2][mask] = -1.0
if CameraType.FISHEYE.value in cam_types:
    mask = (self.camera_type[true_indices] == CameraType.FISHEYE.value).squeeze(-1)  # (num_rays)
    mask = torch.stack([mask, mask, mask], dim=0)
    theta = torch.sqrt(torch.sum(coord_stack**2, dim=-1))
    theta = torch.clip(theta, 0.0, math.pi)
    sin_theta = torch.sin(theta)
    directions_stack[..., 0][mask] = torch.masked_select(coord_stack[..., 0] * sin_theta / theta, mask).float()
    directions_stack[..., 1][mask] = torch.masked_select(coord_stack[..., 1] * sin_theta / theta, mask).float()
    directions_stack[..., 2][mask] = -torch.masked_select(torch.cos(theta), mask).float()
if CameraType.EQUIRECTANGULAR.value in cam_types:
    mask = (self.camera_type[true_indices] == CameraType.EQUIRECTANGULAR.value).squeeze(-1)  # (num_rays)
    mask = torch.stack([mask, mask, mask], dim=0)
    # For equirect, fx = fy = height = width/2
    # Then coord[..., 0] goes from -1 to 1 and coord[..., 1] goes from -1/2 to 1/2
    theta = -torch.pi * coord_stack[..., 0]  # minus sign for right-handed
    phi = torch.pi * (0.5 - coord_stack[..., 1])
    # use spherical in local camera coordinates (+y up, x=0 and z<0 is theta=0)
    directions_stack[..., 0][mask] = torch.masked_select(-torch.sin(theta) * torch.sin(phi), mask).float()
    directions_stack[..., 1][mask] = torch.masked_select(torch.cos(phi), mask).float()
    directions_stack[..., 2][mask] = torch.masked_select(-torch.cos(theta) * torch.sin(phi), mask).float()
for value in cam_types:
    if value not in [CameraType.PERSPECTIVE.value, CameraType.FISHEYE.value, CameraType.EQUIRECTANGULAR.value]:
        raise ValueError(f"Camera type {value} not supported.")
Which makes sense to me. The current fix also seems fine though!
That's actually the way I originally wrote the fix, but decided to fix the loop instead since it seemed more in line with the intentions of the original author.
I don't have a strong opinion on which way is better, but I don't have a way to test anything other than the fisheye and perspective cases, so I figured the smaller change was safer.
makes sense, thanks @decrispell!
Training with mixed batches of camera types seems to be broken. The camera types are looped over, but in the current form the same type triggers the if/else on every pass through the loop, leaving the rays for the other camera types uninitialized.
This PR fixes that, enabling support for mixed batches of rays. My guess is that this was the original intent of the code.
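For illustration, here is a minimal, self-contained sketch of the failure mode described above. The constants and function names (`PERSPECTIVE`, `FISHEYE`, `fill_directions_buggy`, `fill_directions_fixed`) are simplified stand-ins for this write-up, not the actual nerfstudio code, and the "fixed" variant is only one plausible shape of the change, not the exact merged diff:

```python
import torch

# Simplified stand-ins for CameraType enum values; not the real nerfstudio enum.
PERSPECTIVE, FISHEYE = 1, 2


def fill_directions_buggy(camera_type: torch.Tensor, directions: torch.Tensor) -> torch.Tensor:
    cam_types = torch.unique(camera_type).tolist()
    for cam in cam_types:
        # Bug: the conditions test membership in the whole batch instead of
        # using the loop variable, so the first matching branch fires on every
        # iteration and the fisheye rays are never written.
        if PERSPECTIVE in cam_types:
            mask = camera_type == PERSPECTIVE
            directions[mask] = 1.0  # placeholder for the perspective ray math
        elif FISHEYE in cam_types:
            mask = camera_type == FISHEYE
            directions[mask] = 2.0  # placeholder for the fisheye ray math
    return directions


def fill_directions_fixed(camera_type: torch.Tensor, directions: torch.Tensor) -> torch.Tensor:
    cam_types = torch.unique(camera_type).tolist()
    for cam in cam_types:
        # Fix: dispatch on the loop variable so each camera type present in
        # the batch fills its own rays exactly once.
        if cam == PERSPECTIVE:
            mask = camera_type == PERSPECTIVE
            directions[mask] = 1.0
        elif cam == FISHEYE:
            mask = camera_type == FISHEYE
            directions[mask] = 2.0
    return directions


camera_type = torch.tensor([PERSPECTIVE, FISHEYE, PERSPECTIVE, FISHEYE])
print(fill_directions_buggy(camera_type, torch.zeros(4)))  # fisheye entries stay at 0.0 (uninitialized)
print(fill_directions_fixed(camera_type, torch.zeros(4)))  # every entry is written
```

Running the sketch prints `[1., 0., 1., 0.]` for the buggy version (the fisheye rays keep their zero initialization) and `[1., 2., 1., 2.]` for the fixed version, which is the mixed-batch behavior this PR is after.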