
About .pkl of Smplify-x and SMPLX #10

Open · IceFtv opened this issue Jun 3, 2024 · 21 comments

@IceFtv commented Jun 3, 2024

Very nice job!
But when I tried to train my model using a .pkl file produced by Smplify-x, I found that its keys differ from those in the SMPLX .pkl:

Smplify-x:
{'camera_rotation', 'camera_translation', 'betas', 'global_orient', 'left_hand_pose', 'right_hand_pose', 'jaw_pose', 'leye_pose', 'reye_pose', 'expression', 'body_pose'}
SMPLX:
{'betas', 'global_orient', 'body_pose', 'transl', 'left_hand_pose', 'right_hand_pose', 'jaw_pose', 'leye_pose', 'reye_pose', 'expression', 'camera_matrix', 'camera_transform'}
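
For reference, a minimal way to dump the keys of either file (the path is a placeholder):

import pickle

with open("result.pkl", "rb") as f:
    data = pickle.load(f)
print(sorted(data.keys()))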

Could you tell me how you handled this at the time? Thank you very much.

@david-svitov (Owner)
from smplx.lbs import transform_mat

# Width and height of the current frame's ground-truth mask
size_x = gt_masks_list[frame_id].shape[1]
size_y = gt_masks_list[frame_id].shape[0]

# 4x4 OpenGL-style projection matrix and 4x4 world-to-camera transform
camera_matrix = camera.get_camera_matrix("cpu", size_x, size_y)[0]
camera_transform = transform_mat(camera.rotation,
                                 camera.translation.unsqueeze(dim=-1))[0]

result['camera_matrix'] = camera_matrix.detach().cpu().numpy()
result['camera_transform'] = camera_transform.detach().cpu().numpy()

Hi! I added this to fit_single_frame.py somewhere after line 470. But you can also assemble these matrices in a dataloader using 'camera_rotation' and 'camera_translation', as sketched below.
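
For illustration, a minimal dataloader-side sketch of that second option (my assumption of the stored shapes: 'camera_rotation' as (1, 3, 3) and 'camera_translation' as (1, 3); the file name is a placeholder):

import pickle

import torch
from smplx.lbs import transform_mat

with open("result.pkl", "rb") as f:
    data = pickle.load(f)

rotation = torch.as_tensor(data["camera_rotation"]).reshape(1, 3, 3)
translation = torch.as_tensor(data["camera_translation"]).reshape(1, 3)
# transform_mat builds a batch of 4x4 rigid transforms from R (Bx3x3) and t (Bx3x1)
camera_transform = transform_mat(rotation, translation.unsqueeze(dim=-1))[0]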

@IceFtv (Author) commented Jun 4, 2024

Thank you very much for your reply. I will try it immediately.
But I found another issue: body_pose in the Smplify-x .pkl has 32 values, while SMPLX expects 63.
Could you tell me how you handled this? Thank you very much.

@david-svitov (Owner)
It seems you need to decode the vector from VPoser:
vchoutas/smplify-x#139

result['body_pose'] = vposer.decode(pose_embedding, output_type='aa').detach().cpu().numpy().reshape((1, 63))
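
Expanded into a self-contained sketch (assuming the VPoser v1 API that Smplify-x itself uses; the checkpoint path and the data/result dicts are placeholders):

import torch
from human_body_prior.tools.model_loader import load_vposer

# Load VPoser the same way Smplify-x does
vposer, _ = load_vposer("path/to/vposer_v1_0", vp_model="snapshot")
vposer.eval()

# 'body_pose' in the Smplify-x .pkl is the 32-D latent embedding
pose_embedding = torch.as_tensor(data["body_pose"], dtype=torch.float32).reshape(1, 32)
with torch.no_grad():
    body_pose = vposer.decode(pose_embedding, output_type="aa")  # 21 joints, axis-angle
result["body_pose"] = body_pose.detach().cpu().numpy().reshape((1, 63))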

@IceFtv (Author) commented Jun 5, 2024

> (quoting @david-svitov's fit_single_frame.py snippet above)

Is camera.get_camera_matrix a function you defined yourself? Could you provide the code? And if I have my own camera parameters, do I need to modify the contents of the .pkl, or something elsewhere, for it to render correctly? Thank you very much.

@zyms5244 commented Jun 5, 2024

> (quoting @david-svitov's fit_single_frame.py snippet above)

I ran into the same problem when converting my dataset into this format: https://drive.google.com/file/d/1peE2RNuYoeouA8YS0XwyR2YEbLT5gseW/view?usp=sharing. Could you explain the data-processing pipeline in detail, especially how to get camera_matrix from Smplify-x? It would be even better if you could provide the code.

@david-svitov (Owner) commented Jun 5, 2024

In camera.py, add this to the PerspectiveCamera class:

NEAR = 0.01
FAR = 100

def get_camera_matrix(self, device, size_x, size_y):
    # Builds an OpenGL-style 4x4 perspective projection matrix from the
    # camera's focal lengths and the image size (size_x = W, size_y = H).
    with torch.no_grad():
        camera_mat = torch.zeros([self.batch_size, 4, 4],
                                 dtype=self.dtype, device=device)
        # Pixel focal lengths converted to normalized device coordinates
        camera_mat[:, 0, 0] = 2.0 * self.focal_length_x / size_x
        camera_mat[:, 1, 1] = 2.0 * self.focal_length_y / size_y
        camera_mat[:, 3, 2] = 1.0
        # Depth terms derived from the near/far clipping planes
        camera_mat[:, 2, 2] = -(self.FAR + self.NEAR) / (self.NEAR - self.FAR)
        camera_mat[:, 2, 3] = (2 * self.FAR * self.NEAR) / (self.NEAR - self.FAR)

    return camera_mat

@IceFtv (Author) commented Jun 5, 2024

Thank you for your reply. Are size_x and size_y the width (W) and height (H) of the image here?

@david-svitov (Owner)
Yup

@IceFtv (Author) commented Jun 5, 2024

Another question: how should SMPLX's "transl" be calculated? Is it the same as "camera_translation" in Smplify-x?

@david-svitov (Owner)
Everything should be fine if you just ignore "transl".

@zyms5244 commented Jun 6, 2024

> (quoting @david-svitov's get_camera_matrix snippet above)

Thank you for your kindness. I verified the data preprocessing on male-4-casual using Smplify-X and added your code to camera.py. However, I could not reproduce the data you provided: https://drive.google.com/file/d/1peE2RNuYoeouA8YS0XwyR2YEbLT5gseW/view?usp=sharing

Your data:

camera_matrix:
[[ 4.9223537   0.         -0.08888889  0.        ]
 [ 0.          4.941477    0.05        0.        ]
 [ 0.          0.          1.0001     -0.01010101]
 [ 0.          0.          1.          0.        ]]
camera_transform:
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
transl: [[0.1355158  0.29134032 5.315963  ]]

My processed data:

camera_matrix:
[[ 9.259259  0.        0.        0.      ]
 [ 0.        9.259259  0.        0.      ]
 [ 0.        0.        1.0002   -0.020002]
 [ 0.        0.        1.        0.      ]]
camera_transform:
[[1. 0. 0. 0.0371261]
 [0. 1. 0. 0.3118709]
 [0. 0. 1. 8.512359 ]
 [0. 0. 0. 1.       ]]
transl: [[0. 0. 0.]]

The matrices are different, and the avatar reconstruction is blurry. Could you give some advice, or publish the complete data-preprocessing code?

@IceFtv (Author) commented Jun 6, 2024

I have the same problem. I think it's because Smplify-x uses a default focal_length of 5000, so I pass Smplify-x the focal length from camera.pkl instead.
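
The arithmetic backs this up (my own check; it assumes a 1080-pixel-wide image):

# get_camera_matrix above fills camera_mat[:, 0, 0] with 2 * focal / size
focal_length = 5000.0  # Smplify-x default
size_x = 1080.0
print(2.0 * focal_length / size_x)  # 9.259259..., exactly the first entry of the mismatched matrix above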

@IceFtv (Author) commented Jun 6, 2024

> (quoting @david-svitov's fit_single_frame.py snippet above)

Hi! Sorry to bother you again. When I trained my model using the preprocessing code above, I got a very bad result:

[image: 00000]

I also noticed that, during training, the human body in the images saved to the val folder appears smaller than when training with the provided dataset.

Using Smplify-x:

[image: s000000_b000]

Using SnapshotPeople_SMPLX:

[image: s000000_b000_1]

Is this caused by the camera_translation estimated by Smplify-x? How can I fix it? I would greatly appreciate a reply.

@david-svitov (Owner)
First, it's worth establishing whether the problem is Smplify-x convergence or the data format.

  1. If you set the --visualize="True" flag, the folder with Smplify-x results will contain visualizations. Check that they look correct.
  2. This kind of problem often arises when the person's gender is wrong. Since you are fitting a woman, set the value here to "female": https://github.com/vchoutas/smplify-x/blob/68f8536707f43f4736cdd75a19b18ede886a4d53/cfg_files/fit_smplx.yaml#L11

@david-svitov (Owner)
@zyms5244
If the problem is only blurriness, then the problem is not in the camera, but in the quality of the fits. First check the things I pointed out in the post above.
You'll likely see in your renderings that Smplify-x doesn't work well for some frames. For example, it restores the pose inaccurately. In this case, you can modify the Smplify-x code as suggested in this article: https://samsunglabs.github.io/MoRF-project-page/
Namely: add Silhouette loss and Temporal loss
In the next few days I will try to upload a non-official implementation of the fitting from MoRF article. It's a little slow, but more accurate than Smplify-x.
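
For illustration only, one common formulation of those two losses (my sketch, not the MoRF implementation; pred_mask and gt_mask are soft silhouettes in [0, 1], and poses is a (T, 63) sequence of body poses):

import torch

def silhouette_loss(pred_mask: torch.Tensor, gt_mask: torch.Tensor) -> torch.Tensor:
    # One minus the soft IoU between rendered and ground-truth silhouettes.
    intersection = (pred_mask * gt_mask).sum()
    union = (pred_mask + gt_mask - pred_mask * gt_mask).sum()
    return 1.0 - intersection / union.clamp(min=1e-8)

def temporal_loss(poses: torch.Tensor) -> torch.Tensor:
    # Penalize frame-to-frame pose changes to keep the fitted sequence smooth.
    return ((poses[1:] - poses[:-1]) ** 2).mean()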

@zyms5244 commented Jun 7, 2024

Thanks. I reproduced the male-4 result with my processed data, but the SMPLX model and the 3DGS do not seem well aligned:
https://github.com/david-svitov/HAHA/assets/10354474/d02478fe-44c4-4db4-adf6-a81a78968a3a

For now, I will try to optimize the camera pose and the 3D human to align them, until your MoRF fitting is available.

@IceFtv (Author) commented Jun 7, 2024

> (quoting @zyms5244's comment above)

Hi @zyms5244! I would like to know how you handled the differences between these two .pkl files, especially camera_matrix and camera_transform. Thank you.

@zyms5244

> (quoting @IceFtv's question above)

Add the get_camera_matrix function to camera.py and set the matrices like this:

size_x, size_y = img.shape[1], img.shape[0]
camera_matrix = camera.get_camera_matrix("cpu", size_x, size_y)[0]
camera_transform = transform_mat(camera.rotation,
                                 camera.translation.unsqueeze(dim=-1))[0]

result['camera_matrix'] = camera_matrix.detach().cpu().numpy()
result['camera_transform'] = camera_transform.detach().cpu().numpy()
result['transl'] = np.zeros([1, 3])  # per the advice above, transl can be ignored

@dilinwang820
> (quoting @david-svitov's get_camera_matrix snippet above)

Hi @david-svitov, could you kindly share the rest of your camera.py implementation? I followed the instructions in this thread, and the projection looks correct when I set visualize=True; however, the nvdiffrast rendering is slightly misaligned with the GT. Do you have any insights? Thanks.

@david-svitov (Owner)

@dilinwang820 Here is SMPLify-X with my modifications: https://drive.google.com/file/d/1FJhC7CQeeXVWDK5B3MoKLomWu-M9tEmD/view?usp=sharing

@dilinwang820

Thank you, this is extremely helpful!
