RT-Dino #33

Open
11 tasks
sebbyjp opened this issue Jun 22, 2024 · 0 comments · May be fixed by #49

sebbyjp commented Jun 22, 2024

Training script for a DinoV2-with-registers backbone, a transformer decoder, and FiLM layers from classifier-free-guidance.

☘️ Shoot an email to [email protected] if you'd like to tackle this issue and I'll help as often as I can. Can provide A100 access once script is ready.

Starter Code
Example doing the identical task, but with MaxViT

Resources

Highly-Recommended Guide to Follow
Transformer Head Code
DinoV2 Source Code
Text Guidance with FiLM
RT-1: Robotics Transformer paper

Tokenize Actions (x, y, z, roll, pitch, yaw, grasp)

Transform pattern: (b f a) -> (b f a bins), bins=255, where b = batch, f = frames, a = action dimensions

This is simple classification, not sequence-to-sequence modeling.

  1. Apply a MinMax scaler to each action dimension

  2. Apply k-bins discretization (kbins) to the scaled values
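
The two steps above can be sketched in plain NumPy. This is only an illustration: it assumes uniform-width bins and per-dimension bounds `low` and `high` that you supply yourself (both names, and the helper functions, are hypothetical and not from the starter code). A fitted scaler and discretizer from a library would play the same role.

```python
import numpy as np

NUM_BINS = 255  # from the issue: bins=255


def tokenize_actions(actions, low, high, num_bins=NUM_BINS):
    """Map continuous actions (b, f, a) to integer bin indices (b, f, a)."""
    actions = np.asarray(actions, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    # Step 1: min-max scale every action dimension to [0, 1].
    scaled = np.clip((actions - low) / (high - low), 0.0, 1.0)
    # Step 2: uniform-width binning; the upper bound falls into the last bin.
    return np.minimum((scaled * num_bins).astype(np.int64), num_bins - 1)


def one_hot(tokens, num_bins=NUM_BINS):
    """Expand (b, f, a) bin indices to (b, f, a, bins) classification targets."""
    return np.eye(num_bins, dtype=np.float32)[tokens]
```

With 7 action dimensions (x, y, z, roll, pitch, yaw, grasp), a batch of shape (b, f, 7) becomes integer targets of shape (b, f, 7), or (b, f, 7, 255) after `one_hot`. Most classification losses take the integer form directly.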

Apply FiLM layers from classifier-free-guidance
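
For orientation, FiLM conditioning scales and shifts visual features with values predicted from the text embedding. The sketch below is framework-agnostic NumPy rather than the classifier-free-guidance implementation. The function name, the `1 + gamma` parameterization, and the two projection matrices `w_gamma` and `w_beta` are assumptions made for illustration.

```python
import numpy as np


def film(features, text_embedding, w_gamma, w_beta):
    """Feature-wise linear modulation (illustrative sketch).

    features:        (b, n, d) visual tokens
    text_embedding:  (b, t) pooled instruction embedding
    w_gamma, w_beta: (t, d) hypothetical projection matrices
    """
    gamma = text_embedding @ w_gamma  # (b, d) per-channel scale offset
    beta = text_embedding @ w_beta    # (b, d) per-channel shift
    # With zero-initialized projections this is the identity, so a
    # pretrained backbone is not disturbed at the start of training.
    return features * (1.0 + gamma[:, None, :]) + beta[:, None, :]
```

In a real training script the projections would be learned layers in whichever framework is chosen. Zero-initializing them is one common way to start from the unmodified backbone features.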

Inference pattern: (b f c h w), str -> (b f a bins)
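
At inference time the (b f a bins) output still has to be turned back into continuous actions. A minimal sketch, assuming uniform-width bins over per-dimension bounds `low` and `high` (hypothetical names) and decoding each bin to its center:

```python
import numpy as np


def detokenize(logits, low, high):
    """Convert (b, f, a, bins) logits to continuous actions of shape (b, f, a)."""
    logits = np.asarray(logits)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    num_bins = logits.shape[-1]
    tokens = logits.argmax(axis=-1)       # most likely bin per action dimension
    centers = (tokens + 0.5) / num_bins   # bin centers in [0, 1]
    return low + centers * (high - low)   # undo the min-max scaling
```

Taking the argmax is the simplest choice. Sampling from the softmax over bins is an alternative if stochastic actions are wanted.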

Example doing the identical task, but with MaxViT

Details

  • Use PyTorch Lightning, Transformers, or fastai (Transformers preferred, but fastai is likely easiest)
  • Use a pretrained DinoV2 checkpoint with registers, small (ViT-S/14) or large (ViT-L/14)
  • Start with basic encoder-decoder pattern (see the starter code script)

Use the following losses:

Follow-On Work

  • Ablations with early, middle, late fusion
  • Ablations with DinoV2 frozen, DinoV2 without registers, and smaller or larger DinoV2
  • Whiten image inputs with PCA
  • AutoAugment with timm
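
For the PCA whitening item, a minimal NumPy sketch of the transform on flattened inputs is given below. The function name and the `eps` regularizer are illustrative assumptions. For full-resolution images the statistics would more likely be fit on patches or downsampled pixels.

```python
import numpy as np


def pca_whiten(x, eps=1e-5):
    """Whiten rows of x (n_samples, n_features) to roughly identity covariance."""
    x = np.asarray(x, dtype=np.float64)
    x = x - x.mean(axis=0, keepdims=True)       # center each feature
    cov = x.T @ x / (x.shape[0] - 1)            # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # cov is symmetric
    # Rotate onto the principal axes, then rescale each axis to unit variance.
    return (x @ eigvecs) / np.sqrt(eigvals + eps)
```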

emekaokoli19 linked a pull request on Jul 4, 2024 that will close this issue