Hi, thanks for this excellent work.
The results on people_snapshot_public are quite good, but when I run on the UBC data, the results are quite noisy.
May I ask what the problem might be? Thanks.
Yes, we also observe noisy results in the T-pose, but the results look better for the training poses and for poses close to those in the input video.
We suspect this is because highly dynamic clothes are hard to generalize to novel poses given the very limited observations in UBC. Note also that in-the-wild UBC sequences with dynamic clothes are far more challenging than People-Snapshot and ZJU-MoCap: the estimated poses are inaccurate and quite noisy, the pose distribution is narrow, and the video provides only a few side-view frames. That said, the baseline performs even worse. We hope our first small step reveals some new challenges for this in-the-wild problem.
One more hint: as you may notice in our code, I intentionally kept the SD guidance and the real video-fitting steps in one file.
Our very early results suggest that combining SD guidance with the real video fitting goes a long way toward addressing this issue, but I haven't had time to implement this in the code release.
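For anyone who wants to experiment before that lands, here is a rough sketch of what such a hybrid objective could look like. This is not the code from the repository: the function name `hybrid_loss`, the `sds_weight` parameter, and its default value are all hypothetical, and a plain weighted sum is only one possible way to combine the two terms.

```python
def hybrid_loss(fitting_loss, sds_loss, sds_weight=0.1):
    """Combine a real video-fitting loss with an SD-guidance loss.

    Hypothetical sketch: a plain weighted sum of the two terms.
    Works with Python floats or with tensors that support + and *.
    """
    if sds_weight < 0:
        raise ValueError("sds_weight must be non-negative")
    # The fitting term anchors the avatar to the observed frames;
    # the SD-guidance term acts as a prior on poses the video never shows.
    return fitting_loss + sds_weight * sds_loss


# Example: a fitting loss of 1.0 and a guidance loss of 2.0, weighted by 0.5.
total = hybrid_loss(1.0, 2.0, sds_weight=0.5)
```

Setting `sds_weight=0` recovers pure video fitting, so the weight can be tuned (or scheduled over training) to see how much guidance helps on UBC.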