Some questions about the code? #14

Open
booker-max opened this issue Sep 5, 2023 · 1 comment


@booker-max

Great work on this project!

I have some questions, mainly about the EFT code.

  1. In the following code, what is the shape of data["R"], and why is index [0] taken? The "Using Custom Datasets" section of the README says "'R': (B, 3, 3) PyTorch3D rotation", so why index with [0]?

target_cameras = PerspectiveCameras(
    R=data['R'][0],
    T=data['T'][0],
    focal_length=data['f'][0],
    principal_point=data['c'][0],
    image_size=data['image_size'][0],
).cuda(gpu)
target_rgb = data['images'][0].cuda(gpu)

  2. I don't really understand what sampling the batch cameras does. What do render_batch_size=1 and query_idx mean in the code below? Isn't it enough to pass in context_size, a reference pose, a reference image, and a target pose in order to output a target image?

rand_batch = torch.randperm(len(target_cameras))
batch_idx = rand_batch[:render_batch_size]
batch_cameras, batch_rgb, batch_mask, input_cameras, input_rgb, input_masks, context_idx = relative_cam(target_cameras, target_rgb, context_size=context_size, query_idx=batch_idx, return_context=True)
@zhizdev
Owner

zhizdev commented Sep 5, 2023

Hi, thanks for looking at the code.

  1. Right out of the dataloader, data['R'] has shape (1, B, 3, 3), since we load one scene per iteration. Indexing with [0] drops that leading scene dimension and leaves the (B, 3, 3) tensor described in the README.

  2. render_batch_size is the batch size for target_pose and target_image. The forward pass of relative_cam takes in all cameras and poses from the dataloader and, with query_idx set to batch_idx, picks out the target_image and target_pose. It also randomly selects the context_image and context_pose.
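
To make the shape point in item 1 concrete, here is a minimal sketch of how a leading dimension of size 1 can appear. It assumes the scenes are served through a standard PyTorch DataLoader with batch_size=1 and default collation; the ToyScenes dataset and its fields are hypothetical stand-ins, not the repository's actual dataset class.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyScenes(Dataset):
    """Hypothetical dataset: each item is one scene holding B views."""

    def __init__(self, num_scenes=2, num_views=5):
        self.num_scenes = num_scenes
        self.num_views = num_views

    def __len__(self):
        return self.num_scenes

    def __getitem__(self, idx):
        B = self.num_views
        return {
            "R": torch.eye(3).repeat(B, 1, 1),  # (B, 3, 3) rotations
            "T": torch.zeros(B, 3),             # (B, 3) translations
        }


# Assumption: one scene per iteration, i.e. batch_size=1.
loader = DataLoader(ToyScenes(), batch_size=1)
data = next(iter(loader))

# Default collation stacks items along a new leading dimension.
shape_with_scene_dim = tuple(data["R"].shape)   # (1, B, 3, 3)
# Indexing with [0] selects the single scene in the batch.
shape_per_scene = tuple(data["R"][0].shape)     # (B, 3, 3)
```

Under these assumptions, data["R"][0] is the (B, 3, 3) tensor that the README describes, which is why every field is indexed with [0] before being handed to PerspectiveCameras.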

The outputs of the last line in your code box can be interpreted as follows:

  1. batch_cameras ~ target poses
  2. batch_rgb ~ target images
  3. input_cameras ~ context poses
  4. input_rgb ~ context images
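
The mapping above can be illustrated with a rough sketch of the index bookkeeping only. This is not the repository's relative_cam: split_target_and_context is a hypothetical helper, it uses plain Python lists instead of tensors, and it assumes the context views are drawn from the views that were not chosen as targets, which the thread does not confirm.

```python
import random


def split_target_and_context(num_views, query_idx, context_size, rng):
    """Hypothetical helper: choose target and context view indices.

    query_idx names the target views; context views are sampled at
    random from the remaining views (an assumption, see lead-in).
    """
    targets = list(query_idx)
    remaining = [i for i in range(num_views) if i not in targets]
    context = rng.sample(remaining, context_size)
    return targets, context


rng = random.Random(0)
num_views = 8              # number of views in the loaded scene
render_batch_size = 1      # how many target views to render per step
context_size = 3           # how many context views to condition on

# Analogue of torch.randperm followed by slicing the first entries.
rand_batch = list(range(num_views))
rng.shuffle(rand_batch)
batch_idx = rand_batch[:render_batch_size]

target_idx, context_idx = split_target_and_context(
    num_views, batch_idx, context_size, rng
)
# target_idx  -> views used for batch_cameras / batch_rgb
# context_idx -> views used for input_cameras / input_rgb
```

With render_batch_size=1, each step picks a single target view to supervise and a fresh random set of context views from the same scene.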
