Real Robot Evaluation #14
Hey, happy to see you're making so much progress! I used their codebase as a starting point, and made my own (hacky) scripts (scalingup_real_scripts.zip).
Hopefully these will still give you a good starting point!
Hi @huy-ha, thank you very much for your help! BTW, I find that my policy tends to start a transport action without actually grasping an object (e.g., if one finger contacts the other without grasping an object, the robot still continues to move to the target bin). I wonder if this is related to the collision detection setup? Best regards
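P.S. One sanity check I'm considering (a rough sketch only; the dm_control-style API and the geom names are my assumptions, not the actual scalingup code) is to only start the transport motion once both fingertips touch the target object, so finger-on-finger contact alone doesn't count as a grasp:

```python
def is_grasping(physics, target_prefix: str,
                finger_geoms=("left_finger_pad", "right_finger_pad")) -> bool:
    """Return True only if both finger pads are in contact with the target object."""
    touching = set()
    for i in range(physics.data.ncon):
        contact = physics.data.contact[i]
        name1 = physics.model.id2name(contact.geom1, "geom")
        name2 = physics.model.id2name(contact.geom2, "geom")
        for finger in finger_geoms:
            if finger in (name1, name2):
                other = name2 if name1 == finger else name1
                if other.startswith(target_prefix):
                    touching.add(finger)
    # finger-on-finger contact never adds the target object to the set,
    # so an empty grasp does not pass this check
    return len(touching) == len(finger_geoms)
```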
Hi @wangyan-hlab, happy to see you did successful real experiments! I am still struggling with that. In my real experiments, the output action sequences always point in a weird direction. Can you please advise me on the following questions:
I appreciate your kind reply and help! Best regards
Hey @wangyan-hlab, Great to see you've done a first round of evaluations.
Hey @yellow07200, in my experiments, since I used domain randomization over camera poses, I didn't have to calibrate. I just placed the camera in front of the robot where it roughly matched.
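For reference, this is roughly what I mean by randomizing the camera pose (the nominal pose and jitter ranges below are illustrative values, not the repo's actual configuration):

```python
import numpy as np

def sample_camera_pose(rng: np.random.Generator):
    # nominal camera roughly in front of the robot, re-sampled every episode
    nominal_pos = np.array([0.6, 0.0, 0.5])               # meters, made-up values
    pos = nominal_pos + rng.uniform(-0.05, 0.05, size=3)  # +/- 5 cm position jitter
    euler = rng.uniform(-0.1, 0.1, size=3)                # +/- ~6 deg orientation jitter
    return pos, euler
```

Since the policy is trained across this jitter, a real camera that's only roughly placed still falls inside the training distribution, which is why I didn't need to calibrate.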
Did you load the action normalization in from the checkpoint? Is it completely off, or is it close to the object but not on the object?
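(To clarify what I mean by loading the action normalization, something roughly like the sketch below; the key names are illustrative, not the exact checkpoint layout.)

```python
import torch

ckpt = torch.load("policy.ckpt", map_location="cpu")
stats = ckpt["action_stats"]  # hypothetical key holding per-dimension min/max from training

def unnormalize(action_norm: torch.Tensor, stats: dict) -> torch.Tensor:
    # map network output from [-1, 1] back to the original action range
    low, high = stats["min"], stats["max"]
    return (action_norm + 1.0) / 2.0 * (high - low) + low
```

If the real-robot script skips this step or uses stats from a different run, the predicted poses will be consistently off.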
Hi @huy-ha, I didn't set the
Thanks for your help! Best regards
Hi @huy-ha, thank you for your reply.
I think I will first try to increase the magnitude of visual augmentations and improve the performance of sim evaluation. Best regards
Oh interesting. And this is with your FR5 setup, right? Did the policy's accuracy on Weights & Biases converge yet, and how many datapoints did you use? Did you also try reproducing the transport results with the codebase's UR5, and did that work?
Yes, it is with my FR5 setup.
Hi @huy-ha,
I am facing the same issue when I change my setup to UR10. Additionally, after I used domain randomization over camera poses to generate the dataset using the same setup as in the original code (UR5), the training success rate is also only around 20%. Do you have any idea why this happens? Many thanks. Best regards
@yellow07200 Could you share some code to reproduce the UR10 setup? Also, in my case, domain randomization with the original UR5 setup achieves >80%, so this is unexpected as well. Did you install the conda environment exactly as in the provided yaml file?

@wangyan-hlab That loss seems normal to me, but the behavior is very surprising. Can I reproduce this result with the latest commit from #18?
@huy-ha Hi Huy, yes, I think the result can be reproduced with the latest commit from #18. Please let me know if there's any problem. Thank you very much.
Hey @wangyan-hlab, thanks for your patience. Compute was tight due to the CVPR deadline. The steps I took include:
Using the default configuration (10000 steps x 10 epochs), the diffusion loss at epoch 6 is 0.0038. MSE Loss /mean and /best were 0.0348 and 0.00786, respectively. Below I've attached some visualizations.

https://github.com/real-stanford/scalingup/assets/33562579/4d470a22-8c2c-41e2-85e6-ab3d2ad514c3

Qualitatively, the policy does exhibit retrying behavior. Quantitatively, it's plateauing at a 70% success rate, but I think that's mostly because the policy runs out of time.

In summary, our data should have been identical; I just ran data generation for longer to get more data. Our policy training configurations are identical except for how long I trained. However, the results above were from epoch 6, so that policy didn't train much longer than yours did. I'm still surprised by your result; I wouldn't have expected the difference in the amount of data to result in such a big gap. I would have guessed that with 600 trajectories yours would have achieved about 60% or so. I'll run another training with roughly a similar amount of data and let you know if it also performs poorly.
Hi Huy @huy-ha, thank you very much for reproducing the results on the FR5 robot! Please allow me to briefly summarize the differences between your reproduction setup and mine:
I'm really happy to see your excellent reproduction results, but also surprised by the differences. Due to limits on my hardware and time, I wasn't able to generate much data for training. But as you say, my success rate was much lower than expected with 600 trajectories. I really appreciate your help and look forward to your reply. Good luck to you and your team at CVPR! Best regards
Yep! However, I don't think 1) contributed any difference, because the data both you and I generated succeeded around 70% of the time and included retry attempts. 2) and 3) are the significant differences. I'll let you know when I get the results.
Hey @wangyan-hlab, just a quick update.
Not surprisingly, top-down camera views are better for this grasping task than wrist-mounted ones, and more data does better. You used 662 trajectories / 68182 datapoints, but only got 20-30%. I think it can still reach close to 44% if you just leave it training for longer. Hope these experiment results help!
Hi Huy, @huy-ha
First, thank you for your kind help so far.
I have been pushing the reproduction work forward and now I think I am ready to evaluate the policy on a real robot.
According to your guidance, I found the diffusion policy repo.
There are two questions about it:

1) Should I use the `diffusion_unet_hybrid_image_policy`? And do I need to edit the policy to fit your scalingup policy, or modify some other code?
2) The checkpoint from my training seems to have a different structure than the script expects: there isn't a 'cfg' key there, nor the other keys used in the script. Would you please give more information about how to modify the script to fix this?
Also, is it possible to load a model from the checkpoint and directly predict the action (i.e., eef position + eef uppermat + gripper command) from a REAL observation? I am trying to extract the code that instantiates a diffusion policy model, provide a fake input, and hope to get some output:
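Here is roughly what I'm attempting (a sketch only; the checkpoint keys, policy class, and observation shapes are my guesses, not the actual scalingup API):

```python
import torch

ckpt = torch.load("last.ckpt", map_location="cpu")
print(ckpt.keys())  # unlike the diffusion_policy repo, there is no 'cfg' key here

# assumed: the weights live under "state_dict" and the policy class can be
# rebuilt with the same arguments used during training
policy = ScalingUpDiffusionPolicy(...)  # hypothetical class name / constructor args
policy.load_state_dict(ckpt["state_dict"], strict=False)
policy.eval()

# fake observation shaped like the sim observation (shapes are guesses)
fake_obs = {
    "rgb": torch.zeros(1, 2, 3, 224, 224),  # (batch, obs_horizon, C, H, W)
    "proprio": torch.zeros(1, 2, 7),        # eef pose + gripper state
}
with torch.no_grad():
    action = policy.predict_action(fake_obs)  # hypothetical method name
print(action)
```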
But I haven't found a proper way to get any output.
Would you please give some suggestions?
Best Regards
Originally posted by @wangyan-hlab in #1 (comment)