Depth supervision for nerfacto #1173
Thanks for working on this! I added a few comments, mostly on missing documentation. Two additional things: 1. Can you provide some example results in this PR, i.e. a scene trained with and without DS, to verify that it is working as expected? 2. Can you add some documentation on the expected format of the depth images? This can be added to some of the docstrings, but it would also make sense to add it to this page: https://github.com/nerfstudio-project/nerfstudio/blob/main/docs/quickstart/data_conventions.md
Specifics such as the number of channels, the data format (uint8 vs. 16-bit), whether it is "distance" or "depth", what resolution it should be, how to account for sparsity (from lidar), and how to specify the path in the JSON.
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Depth data parser."""
Given that this follows the same format as the nerfstudio data, what are your thoughts on adding `if "depth_file_path" in frame` logic to the nerfstudio dataparser (similar to what we do with masks)?
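A minimal sketch of the optional-key approach suggested above, assuming frames are plain dictionaries as in a nerfstudio-style `transforms.json`. The helper name `collect_depth_paths` and the sample file names are hypothetical; this is not the dataparser's actual code.

```python
from pathlib import Path
from typing import Dict, List, Optional


def collect_depth_paths(frames: List[Dict], data_dir: Path) -> Optional[List[Path]]:
    """Return one depth path per frame, or None if no frame lists a depth image.

    Hypothetical helper: mirrors the "if key in frame" pattern used for masks.
    """
    depth_paths = []
    for frame in frames:
        if "depth_file_path" in frame:
            depth_paths.append(data_dir / frame["depth_file_path"])
    if not depth_paths:
        return None  # no depth in this dataset; the caller can train on RGB only
    if len(depth_paths) != len(frames):
        raise ValueError("Some frames have depth_file_path and others do not.")
    return depth_paths


frames = [
    {"file_path": "images/frame_00001.jpg", "depth_file_path": "depth/frame_00001.png"},
    {"file_path": "images/frame_00002.jpg", "depth_file_path": "depth/frame_00002.png"},
]
paths = collect_depth_paths(frames, Path("data/scene"))
```

Treating the key as optional keeps existing RGB-only datasets working unchanged, which is presumably why the same pattern is used for masks.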
Great idea! Went with this approach instead.
nerfstudio/data/utils/data_utils.py
Outdated
) -> torch.Tensor:
    """Loads, rescales and resizes depth images. Assumes Record3D depth format."""
    image = cv2.imread(str(filepath.absolute()), cv2.IMREAD_ANYDEPTH)
    image = image.astype(np.float64) * _MILLIMETER_TO_METER_SCALE_FACTOR * scene_scale_factor
Should this scale factor be a variable? If not, it should be well documented.
Good point, I made it a variable instead in the `NerfstudioDataParserConfig`.
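A rough sketch of what a configurable depth scale could look like. The `DepthConfig` class and `scale_depth` function are hypothetical stand-ins; only the field name `depth_unit_scale_factor` appears elsewhere in this thread. The millimeter default and the "0 means no reading" convention are assumptions.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class DepthConfig:
    """Hypothetical stand-in for the relevant dataparser config field."""

    # Assumed default: raw depth is stored in millimeters, so 1e-3 converts to meters.
    depth_unit_scale_factor: float = 1e-3


def scale_depth(raw_depth: np.ndarray, config: DepthConfig, scene_scale_factor: float = 1.0) -> np.ndarray:
    """Convert raw integer depth to scene units as float64."""
    return raw_depth.astype(np.float64) * config.depth_unit_scale_factor * scene_scale_factor


# 16-bit depth in millimeters; 0 is assumed to mean "no reading".
raw = np.array([[1000, 2500], [0, 500]], dtype=np.uint16)
depth = scale_depth(raw, DepthConfig(), scene_scale_factor=0.5)
```

Exposing the factor in the config lets datasets that store depth in other units reuse the same loader by overriding a single value.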
Thanks for the review @tancik! I believe I addressed all of the comments. Please take a look when you can. I also added some results to the PR description and updated the docs. Happy to elaborate more if you have some comments.
LGTM! Thanks for making the changes
* Implement depth supervision
* Fixes
* Change normalize dtype
* refactor
* Change depth weight
* Small refactor
* Change docstring
* Removu unused import
* Add assumption to depth comment
* Implement URF loss
* Remove default depth loss type param
* URF fixes
* Fix depth loss scale
* Visualize gt depths
* Fixes & documentation
* Documentation and formatting
* Fix formatting problems
* Fix linter problems
* Fix tests
* Fix tests
@mpmisko Hi, I was trying depth-nerfacto with my blender dataset, but it raises errors. Does depth-nerfacto only support the nerfstudio dataset? Thank you!
Depth supervision for `depth-nerfacto` was added in nerfstudio-project#1173. This patch extends that support to `sdfstudio` datasets ([as originally intended](nerfstudio-project#1173 (comment))) by adding `depth_unit_scale_factor` to `sdfstudio` metadata.
Hi @mpmisko, thanks for the amazing contribution. Could you please provide the dataset/scene you used for evaluation? In my case, I'm using a scene from ScanNet: the depth loss increases, and the depth visualizations look okay, but not great. Unprojections: left [using nerf depth], right [sensor depth]
Implements depth-supervised Nerfacto with depth losses from Depth-supervised NeRF and Urban Radiance Fields. I found the Depth-supervised NeRF loss to perform better for my dataset; however, Urban Radiance Fields reports state-of-the-art results, so its loss may work better for other use cases. I set the Depth-supervised NeRF loss as the default with some reasonable initial hyperparameters.
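To illustrate the kind of loss involved, here is a simplified single-ray sketch in the style of the Depth-supervised NeRF loss: it penalizes low rendering weight near the measured depth, using a Gaussian around that depth. The function name, argument layout, and the rule that a non-positive depth means "no measurement" are assumptions for illustration, and the code in this PR may differ in details such as how `sigma` enters the exponent.

```python
import numpy as np

EPS = 1e-7  # avoids log(0) for samples with zero weight


def ds_nerf_style_depth_loss(weights, steps, lengths, gt_depth, sigma):
    """Depth loss for one ray, in the style of Depth-supervised NeRF.

    weights:  (N,) rendering weight of each sample along the ray
    steps:    (N,) distance of each sample from the ray origin
    lengths:  (N,) length of the interval each sample covers
    gt_depth: measured depth for this ray; <= 0 is treated as missing (assumption)
    sigma:    assumed standard deviation of the depth measurement
    """
    if gt_depth <= 0:
        return 0.0  # no supervision for rays without a depth reading
    gaussian = np.exp(-((steps - gt_depth) ** 2) / (2.0 * sigma**2))
    return float(np.sum(-np.log(weights + EPS) * gaussian * lengths))


steps = np.linspace(0.5, 4.5, 9)    # sample distances 0.5, 1.0, ..., 4.5
lengths = np.full_like(steps, 0.5)  # uniform interval lengths

good = np.full(9, 0.01)
good[3] = 0.92                      # weight concentrated at distance 2.0
bad = np.full(9, 0.01)
bad[7] = 0.92                       # weight concentrated at distance 4.0

good_loss = ds_nerf_style_depth_loss(good, steps, lengths, gt_depth=2.0, sigma=0.1)
bad_loss = ds_nerf_style_depth_loss(bad, steps, lengths, gt_depth=2.0, sigma=0.1)
```

With the measured depth at 2.0, the ray whose weight peaks at 2.0 gets the smaller loss, which is the behavior the depth term is meant to encourage.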
I tested the method in an indoor living-room scene. The additional depth loss helps with depths (see depth_mse), but slightly hurts other eval metrics:
Example depth colormaps (left pair: nerfacto; right pair: nerfacto with depth supervision; within each pair, the left image is the ground-truth depth).
I generally find that the additional depth supervision removes floaters or, in the worst case, makes them more diffuse. However, it seems to me that plain nerfacto is slightly sharper around objects, which could be explained by noise in the depth measurements. Depth supervision will probably be helpful on a case-by-case basis.
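For reference, a depth MSE metric such as the `depth_mse` mentioned above could be computed roughly as follows. This is a sketch under assumptions: the function name `masked_depth_mse` is hypothetical, and whether the metric in this PR skips pixels without a depth reading is not stated here.

```python
import numpy as np


def masked_depth_mse(pred_depth: np.ndarray, gt_depth: np.ndarray) -> float:
    """Mean squared depth error over pixels with a valid ground-truth reading.

    Pixels where gt_depth <= 0 are treated as missing (assumption) and skipped,
    which matters for sparse depth such as lidar.
    """
    valid = gt_depth > 0
    if not np.any(valid):
        return float("nan")  # nothing to compare against
    return float(np.mean((pred_depth[valid] - gt_depth[valid]) ** 2))


pred = np.array([[1.0, 2.0], [3.0, 4.0]])
gt = np.array([[1.0, 0.0], [2.0, 4.5]])  # 0.0 marks a pixel with no reading
mse = masked_depth_mse(pred, gt)
```

Here three pixels are valid, with squared errors 0, 1 and 0.25, so the result is their mean.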