Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add camera pose and projection gradient flow #123

Closed
wants to merge 8 commits into from

Conversation

jh-surh
Copy link

@jh-surh jh-surh commented Feb 8, 2024

  • Added gradient backward flow to project_gaussians_backward_kernel for viewmat and projmat
  • Added a test script examples/test_pose_grad.py to test pose gradient update. The script interface is the same as simple_trainer.py:
python examples/test_pose_grad.py
python examples/test_pose_grad.py --img_path path/to/image

@ichsan2895
Copy link

ichsan2895 commented Feb 8, 2024

python examples/test_pose_grad.py --img_path path/to/image needs pytorch3D. Maybe we add the requirement to setup.py?

@jh-surh
Copy link
Author

jh-surh commented Feb 8, 2024

@ichsan2895,
Good idea, but I think it may be better to add it to examples/requirements.txt

@kerrj kerrj requested a review from vye16 February 8, 2024 19:01
Pillow
pytorch3d @ git+https://github.com/facebookresearch/[email protected]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like only the tests rely on p3d, it would be nice to remove as a dependency since it's a pretty big one and if we only need it for matrix transforms there's more common lightweight options like scipy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I needed the matrix transforms to be differentiable so that the gradients can flow into the Euler angle and position parameters. I tried implementing my own transformations, but ran into some instability in training. I have been using se3_exp_map and se3_log_map, but if you have any replacements I am open to try them.

@kerrj
Copy link
Collaborator

kerrj commented Feb 8, 2024

It would be good to compare these camera grads to the camera grads from the torch implementation with autograd, this PR adds a torch implementation which should have correct (although slow) gradients since they're computed from torch.

@oseiskar
Copy link
Contributor

oseiskar commented Feb 9, 2024

Also this would be simpler without the redundant projmat (see #97). For camera calibration optimization, you would very likely want to optimize fx, fy, cx, cy (and possibly other distortion parameters if support to such are added) and not the redudant OpenGL/NDC-style projection matrix.

@oseiskar
Copy link
Contributor

oseiskar commented Feb 9, 2024

There's also a simpler way of achieving the same goal of optimizing the view matrix, shown here: #127

@jh-surh
Copy link
Author

jh-surh commented Feb 14, 2024

Also this would be simpler without the redundant projmat (see #97). For camera calibration optimization, you would very likely want to optimize fx, fy, cx, cy (and possibly other distortion parameters if support to such are added) and not the redudant OpenGL/NDC-style projection matrix.

I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step.
What do you guys think? @ichsan2895 @kerrj

@jh-surh jh-surh force-pushed the jhsurh/add-pose-grad branch from 7540e37 to d4fd07a Compare February 18, 2024 15:07
@oseiskar
Copy link
Contributor

I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step. What do you guys think? @ichsan2895 @kerrj

@jh-surh I elaborated this here #97 (comment). Everything that can be implemented with projection matrices can also be implemented by extending the "intrinsics" to support other camera models than the ideal pinhole (fx,fy,cx,cy). It's also relatively simple to convert any typical (non-ortho) OpenGL projection matrix to (fx,fy,cx,cy) format (and throw away the near & far clip terms which do not matter in this context)

@jh-surh
Copy link
Author

jh-surh commented Feb 19, 2024

I'm not certain if removing the projmat is a good idea. I think it may add some nice functionality in choosing the type of projection you want during the gaussian projection step. What do you guys think? @ichsan2895 @kerrj

@jh-surh I elaborated this here #97 (comment). Everything that can be implemented with projection matrices can also be implemented by extending the "intrinsics" to support other camera models than the ideal pinhole (fx,fy,cx,cy). It's also relatively simple to convert any typical (non-ortho) OpenGL projection matrix to (fx,fy,cx,cy) format (and throw away the near & far clip terms which do not matter in this context)

Would it require work "extending the intrinsics to support other camera models" in terms of calculating gradients? If so, it seems like something we can do in a separate PR since the current nerfstudio implementation uses projmat for its own projection matrix.

It would be good to compare these camera grads to the camera grads from the torch implementation with autograd, this PR adds a torch implementation which should have correct (although slow) gradients since they're computed from torch.

Thank you for your suggestion! I found a bug trying to compare the gradient w.r.t. the torch implementation. I have fixed it and updated the test script and have passed the project gaussian test. Seems like the current main branch is failing 3 of the test scripts though. Someone should look at that.

@oseiskar
Copy link
Contributor

Would it require work "extending the intrinsics to support other camera models" in terms of calculating gradients? If so, it seems like something we can do in a separate PR since the current nerfstudio implementation uses projmat for its own projection matrix.

Yes, and I agree that such stuff should definitely be implemented in some other PR.

However, the projection matrix is fed to gsplat is not needed by Nersftudio for any other purpose, as demonstrated here SpectacularAI/nerfstudio@bd23489 (works fine with #97).

My argument is that the more stuff is built on top of the current API with separate and redundant "projmat" + fx,fy,cx,cy, the more difficult it becomes to simplify it. I now remembered/realized the situation with the redundant API is worse than I described earlier: "projmat" is actually not a "projection matrix" in any standard sense but a model-view-projection matrix, adding another layer of confusion).

This complications caused by this API are clearly visible in this PR: why you need to compute the "projection matrix" gradient for pose optimization in the first place is because the parameter known as projmat is set to projmat @ viewmat in Nerfstudio, and this term depends on viewmat, which is the thing you actually want to modify in pose optimization.

Without the @ viewmat part, projection matrix gradients would only be needed for camera instrinsics optimization, but for this purpose, you would also need to add gradients w.r.t. fx, fy, cx, cy to work with the current API. Without "projmat", you would only need the latter.

@jh-surh jh-surh requested a review from kerrj February 21, 2024 05:48
@jh-surh
Copy link
Author

jh-surh commented Feb 21, 2024

My argument is that the more stuff is built on top of the current API with separate and redundant "projmat" + fx,fy,cx,cy, the more difficult it becomes to simplify it. I now remembered/realized the situation with the redundant API is worse than I described earlier: "projmat" is actually not a "projection matrix" in any standard sense but a model-view-projection matrix, adding another layer of confusion).

This complications caused by this API are clearly visible in this PR: why you need to compute the "projection matrix" gradient for pose optimization in the first place is because the parameter known as projmat is set to projmat @ viewmat in Nerfstudio, and this term depends on viewmat, which is the thing you actually want to modify in pose optimization.

Without the @ viewmat part, projection matrix gradients would only be needed for camera instrinsics optimization, but for this purpose, you would also need to add gradients w.r.t. fx, fy, cx, cy to work with the current API. Without "projmat", you would only need the latter.

I gotta say this makes a lot of sense. Now that I think of it, this would cause the gradient for projmat to flow to viewmat, which I'm not sure is the right move.

@kerrj
Copy link
Collaborator

kerrj commented Mar 20, 2024

I'm thinking of merging #127 because of how much simpler it is and it seems to give good results, @jh-surh thoughts?

@maturk
Copy link
Collaborator

maturk commented Mar 20, 2024

I'm thinking of merging #127 because of how much simpler it is and it seems to give good results, @jh-surh thoughts?

I agree. Both are equivalent to the best of my understanding.

@jh-surh
Copy link
Author

jh-surh commented Mar 21, 2024

@kerrj @maturk @oseiskar I left a comment on #127.
TL;DR large difference between pure pytorch gradient and approximated v_viewmat.

@kerrj
Copy link
Collaborator

kerrj commented Jun 27, 2024

Closing this since the 1.0 version includes more exact gradients I believe

@kerrj kerrj closed this Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants