LLMs for Embodied Intelligence

Here are some resources about LLMs for Embodied Intelligence, especially robotics

Cosmos World Foundation Model Platform for Physical AI

tag: Cosmos | Nvidia

paper link: here

code link: here

modelhub link: here

homepage link: here

citation:

@misc{nvidia2025cosmosworldfoundationmodel,
      title={Cosmos World Foundation Model Platform for Physical AI}, 
      author={NVIDIA and Niket Agarwal and Arslan Ali and Maciej Bala and Yogesh Balaji and Erik Barker and Tiffany Cai and Prithvijit Chattopadhyay and Yongxin Chen and Yin Cui and Yifan Ding and Daniel Dworakowski and Jiaojiao Fan and Michele Fenzi and Francesco Ferroni and Sanja Fidler and Dieter Fox and Songwei Ge and Yunhao Ge and Jinwei Gu and Siddharth Gururani and Ethan He and Jiahui Huang and Jacob Huffman and Pooya Jannaty and Jingyi Jin and Seung Wook Kim and Gergely Klár and Grace Lam and Shiyi Lan and Laura Leal-Taixe and Anqi Li and Zhaoshuo Li and Chen-Hsuan Lin and Tsung-Yi Lin and Huan Ling and Ming-Yu Liu and Xian Liu and Alice Luo and Qianli Ma and Hanzi Mao and Kaichun Mo and Arsalan Mousavian and Seungjun Nah and Sriharsha Niverty and David Page and Despoina Paschalidou and Zeeshan Patel and Lindsey Pavao and Morteza Ramezanali and Fitsum Reda and Xiaowei Ren and Vasanth Rao Naik Sabavat and Ed Schmerling and Stella Shi and Bartosz Stefaniak and Shitao Tang and Lyne Tchapmi and Przemek Tredak and Wei-Cheng Tseng and Jibin Varghese and Hao Wang and Haoxiang Wang and Heng Wang and Ting-Chun Wang and Fangyin Wei and Xinyue Wei and Jay Zhangjie Wu and Jiashu Xu and Wei Yang and Lin Yen-Chen and Xiaohui Zeng and Yu Zeng and Jing Zhang and Qinsheng Zhang and Yuxuan Zhang and Qingqing Zhao and Artur Zolkowski},
      year={2025},
      eprint={2501.03575},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2501.03575}, 
}

RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation

tag: RoboGen | ICML24 | CMU

paper link: here

code link: here

homepage link: here

citation:

@misc{wang2023robogen,
      title={RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation}, 
      author={Yufei Wang and Zhou Xian and Feng Chen and Tsun-Hsuan Wang and Yian Wang and Katerina Fragkiadaki and Zackory Erickson and David Held and Chuang Gan},
      year={2023},
      eprint={2311.01455},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}

Towards end-to-end embodied decision making via multi-modal large language model: Explorations with gpt4-vision and beyond

tag: World Model | NIPS23 | Tencent Cloud AI | Peking University

paper link: here

code link: here

citation:

@article{chen2023towards,
  title={Towards end-to-end embodied decision making via multi-modal large language model: Explorations with gpt4-vision and beyond},
  author={Chen, Liang and Zhang, Yichi and Ren, Shuhuai and Zhao, Haozhe and Cai, Zefan and Wang, Yuchi and Wang, Peiyi and Liu, Tianyu and Chang, Baobao},
  journal={arXiv preprint arXiv:2310.02071},
  year={2023}
}

Language Models Meet World Models: Embodied Experiences Enhance Language Models

tag: World Model | NIPS23 | UCSD

paper link: here

code link: here

citation:

@article{xiang2023language,
  title={Language Models Meet World Models: Embodied Experiences Enhance Language Models},
  author={Xiang, Jiannan and Tao, Tianhua and Gu, Yi and Shu, Tianmin and Wang, Zirui and Yang, Zichao and Hu, Zhiting},
  journal={arXiv preprint arXiv:2305.10626},
  year={2023}
}

ProgPrompt: Generating situated robot task plans using large language models

tag: ProgPrompt | ICRA23 | Nvidia

paper link: here

code link: here

homepage link: here

citation:

@inproceedings{singh2023progprompt,
  title={Progprompt: Generating situated robot task plans using large language models},
  author={Singh, Ishika and Blukis, Valts and Mousavian, Arsalan and Goyal, Ankit and Xu, Danfei and Tremblay, Jonathan and Fox, Dieter and Thomason, Jesse and Garg, Animesh},
  booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={11523--11530},
  year={2023},
  organization={IEEE}
}

Language models as zero-shot planners: Extracting actionable knowledge for embodied agents

tag: Language Planner | ICML22 | UC Berkeley

paper link: here

code link: here

homepage link: here

citation:

@inproceedings{huang2022language,
  title={Language models as zero-shot planners: Extracting actionable knowledge for embodied agents},
  author={Huang, Wenlong and Abbeel, Pieter and Pathak, Deepak and Mordatch, Igor},
  booktitle={International Conference on Machine Learning},
  pages={9118--9147},
  year={2022},
  organization={PMLR}
}
