A curated list of awesome projects, resources, and research related to Embodied AI and Humanoid Robotics.
- Introduction
- Scene Understanding
- Data Collection
- Action Output
- Open Source Projects
- Datasets
- Companies & Robots
- Research Papers
- Community
Embodied intelligence refers to AI systems that learn and act through a physical body, typically a robotic platform. This list focuses on projects that combine artificial intelligence with physical interaction capabilities.
| Name | Description | Paper | Code |
|---|---|---|---|
| SAM | Segmentation | Paper | Code |
| YOLO-World | Open-Vocabulary Detection | Paper | Code |
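
For orientation, a minimal sketch of prompting SAM with a single point via the `segment-anything` package is shown below; the checkpoint path and the placeholder image are assumptions, and in practice the frame would come from the robot's camera.

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a pretrained SAM checkpoint (path is an assumption; download the
# official ViT-H weights separately).
sam = sam_model_registry["vit_h"](checkpoint="checkpoints/sam_vit_h.pth")
predictor = SamPredictor(sam)

# Placeholder RGB frame; in practice this would be the robot's camera image.
image = np.zeros((480, 640, 3), dtype=np.uint8)
predictor.set_image(image)

# Prompt SAM with one foreground point and keep the highest-scoring mask.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # boolean HxW mask
```
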
| Name | Description | Paper | Code |
|---|---|---|---|
| SAM3D | Segmentation | Paper | Code |
| PointMixer | Point cloud understanding | Paper | Code |
| Name | Description | Paper | Code |
|---|---|---|---|
| GPT-4V | Multimodal LLM (image + language -> language) | Paper | |
| Claude 3 Opus | Multimodal LLM (image + language -> language) | Paper | |
| GLaMM | Pixel Grounding | Paper | Code |
| All-Seeing | Pixel Grounding | Paper | Code |
| LEO | Embodied generalist agent in 3D worlds | Paper | Code |
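
Closed models such as GPT-4V and Claude 3 Opus are queried through hosted APIs. The sketch below uses the `openai` Python client to ask a grounding question about a camera frame; the model name (`gpt-4o`), the image path, and the prompt are illustrative assumptions rather than part of any project listed above.

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Encode a camera frame so it can be sent inline with the prompt.
with open("frame.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Ask the vision-language model a grounding question about the scene.
response = client.chat.completions.create(
    model="gpt-4o",  # model name is an assumption
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which object on the table should the robot grasp to clear it, and why?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```
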
| Name | Description | Paper | Code |
|---|---|---|---|
| Vid2Robot | | Paper | |
| RT-Trajectory | | Paper | |
| MimicPlay | | Paper | Code |
| Name | Description | Paper | Code |
|---|---|---|---|
| UMI | Two-Fingers | Paper | Code |
| DexCap | Five-Fingers | Paper | Code |
| HIRO Hand | Hand-over-hand | Paper | |
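
Devices such as UMI and DexCap ultimately produce synchronized observation/action streams. As a rough, generic illustration (not any project's actual storage format), the sketch below writes one demonstration episode to HDF5 with `h5py`; the array names, shapes, and control rate are assumptions.

```python
import h5py
import numpy as np

# Illustrative placeholder streams; a real pipeline would fill these from the
# teleoperation device (camera frames, end-effector poses, gripper widths).
T = 200
rgb = np.zeros((T, 224, 224, 3), dtype=np.uint8)   # camera frames
ee_pose = np.zeros((T, 7), dtype=np.float32)       # xyz position + quaternion
gripper = np.zeros((T, 1), dtype=np.float32)       # gripper opening

with h5py.File("demo_000.hdf5", "w") as f:
    grp = f.create_group("episode_000")
    grp.create_dataset("obs/rgb", data=rgb, compression="gzip")
    grp.create_dataset("obs/ee_pose", data=ee_pose)
    grp.create_dataset("action/gripper", data=gripper)
    grp.attrs["frequency_hz"] = 20  # control rate is an assumption
```
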
| Name | Description | Paper | Code |
|---|---|---|---|
| MimicGen | | Paper | Code |
| RoboGen | | Paper | Code |
| Name | Description | Paper | Code |
|---|---|---|---|
| Diffusion Policy | | Paper | Code |
| ACT | | Paper | Code |
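
Both entries above learn visuomotor policies from demonstrations. The toy sketch below is neither Diffusion Policy nor ACT; it only illustrates the underlying supervised objective, regressing a short chunk of future actions (the chunking idea popularized by ACT) with a plain MLP and an MSE loss. All dimensions and data are placeholders.

```python
import torch
import torch.nn as nn

# Toy behavior-cloning setup: map an observation vector to a short "chunk" of
# future actions and regress it against demonstration data. Dimensions are
# illustrative placeholders, not values from either paper.
OBS_DIM, ACT_DIM, CHUNK = 32, 7, 8

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACT_DIM * CHUNK),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Random stand-ins for a batch of (observation, action-chunk) pairs from demos.
obs = torch.randn(64, OBS_DIM)
target_actions = torch.randn(64, CHUNK, ACT_DIM)

optimizer.zero_grad()
pred = policy(obs).view(64, CHUNK, ACT_DIM)
loss = nn.functional.mse_loss(pred, target_actions)
loss.backward()
optimizer.step()
print(f"behavior-cloning loss: {loss.item():.4f}")
```
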
| Name | Description | Paper | Code |
|---|---|---|---|
| CLIPort | Pick&Place | Paper | Code |
| Robo-Affordances | Contact&Post-contact trajectories | Paper | Code |
| Robo-ABC | | Paper | Code |
| Where2Explore | Few-shot learning from semantic similarity | Paper | |
| Move as You Say | Affordance-to-motion generation with a diffusion model | Paper | |
| AffordanceLLM | Grounding affordance with LLM | Paper | |
| Environment-aware Affordance | | Paper | |
| OpenAD | Open-vocabulary affordance detection | Paper | Code |
| RLAfford | End-to-end affordance learning | Paper | |
| General Flow | Collect affordance from video | Paper | Code |
| PreAffordance | Pre-grasping planning | Paper | |
| SceneFun3D | Fine-grained functionality and affordance understanding | Paper | Code |
| Name | Description | Paper | Code |
|---|---|---|---|
| CoPa | | Paper | |
| ManipLLM | | Paper | |
| ManipVQA | | Paper | Code |
| Name | Description | Paper | Code |
|---|---|---|---|
| OLAF | | Paper | |
| YAYRobot | | Paper | Code |
| Name | Description | Paper | Code |
|---|---|---|---|
| SayCan | API Level | Paper | Code |
| VILA | Prompt Level | Paper | |
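
SayCan selects the next primitive by combining the language model's preference for each skill with a learned estimate of how feasible that skill is in the current state. The sketch below shows only that scoring rule; the skill library and both score vectors are hypothetical placeholders standing in for a real LLM and value function.

```python
import numpy as np

# SayCan-style skill selection: rank primitives by the product of the LLM's
# likelihood of the skill given the instruction and an affordance/value
# estimate of how feasible the skill is in the current state.
skills = ["pick up the sponge", "open the drawer", "go to the sink", "wipe the table"]

llm_logprob = np.array([-1.2, -4.5, -2.0, -0.8])  # placeholder LLM scores
affordance = np.array([0.9, 0.1, 0.7, 0.2])       # placeholder feasibility in [0, 1]

combined = np.exp(llm_logprob) * affordance
best = skills[int(np.argmax(combined))]
print(f"next primitive: {best}")
```
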
- Genesis - A physics platform designed for general-purpose Robotics/Embodied AI/Physical AI applications. It is simultaneously a universal physics engine rebuilt from the ground up, capable of simulating a wide range of materials and physical phenomena, and a lightweight, ultra-fast, pythonic, user-friendly robotics simulation platform. [⭐22747]
- Habitat Sim - A flexible, high-performance 3D simulator for Embodied AI research. Habitat-Sim is typically used with Habitat-Lab, a modular high-level library for end-to-end experiments in embodied AI. [⭐5000]
- Isaac Sim - NVIDIA Isaac Sim™ is a reference application built on NVIDIA Omniverse that enables developers to simulate and test AI-driven robotics solutions in physically based virtual environments.
- MuJoCo - Physics engine for robotics, biomechanics, and graphics (a minimal stepping sketch appears after this list)
- PyBullet - Physics simulation for robotics and machine learning
- Gazebo - Robot simulation environment
- SAPIEN - A SimulAted Part-based Interactive ENvironment
- Webots - Webots is an open source and multi-platform desktop application used to simulate robots. It provides a complete development environment to model, program and simulate robots.
- AnyTeleop - A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System
- Dex Retargeting - Various retargeting optimizers to translate human hand motion to robot hand motion.
- ChatTTS - A generative speech model for daily dialogue
- Real3D-Portrait - One-shot Realistic 3D Talking Portrait Synthesis
- Hallo2 - Long-Duration and High-Resolution Audio-driven Portrait Image Animation
- MoveIt - Motion planning framework
- OMPL - Open Motion Planning Library
- Drake - Planning and control toolkit
- GR-1 - ByteDance Research: Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation
- Universal Manipulation Interface (UMI)
- Meta Motivo - A first-of-its-kind behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.
- OpenVLA - OpenVLA sets a new state of the art for generalist robot manipulation policies. It supports controlling multiple robots out of the box and can be quickly adapted to new robot setups via parameter-efficient fine-tuning.
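
As a concrete starting point for the simulators listed above (see the MuJoCo entry), here is a minimal example that builds a toy MJCF scene with the `mujoco` Python bindings and steps the passive dynamics; the scene and the one-second rollout are purely illustrative.

```python
import mujoco

# Toy MJCF scene: a free-floating box dropped onto a ground plane.
XML = """
<mujoco>
  <worldbody>
    <light pos="0 0 3"/>
    <geom type="plane" size="1 1 0.1"/>
    <body pos="0 0 0.5">
      <freejoint/>
      <geom type="box" size="0.05 0.05 0.05"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Step the simulation for one second of simulated time (default timestep 2 ms).
while data.time < 1.0:
    mujoco.mj_step(model, data)

print("box height after 1 s:", data.qpos[2])
```
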
- RoboNet - Large-scale robot manipulation dataset
- BEHAVIOR-1K - Simulation benchmark of 1,000 everyday household activities
- Google Robot Data - Various robotics datasets
- Contact-GraspNet - Efficient 6-DoF grasp generation in cluttered scenes
- π0 - Open-source VLA model by Physical Intelligence
- AMASS - Large human motion dataset
- Human3.6M - Large-scale dataset for human sensing
- CMU Graphics Lab Motion Capture Database
- CALVIN - A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
- DexYCB - A Benchmark for Capturing Hand Grasping of Objects
- Berkeley AI Research (BAIR)
- Stanford Robotics & Embodied Artificial Intelligence Lab (REAL)
- MIT CSAIL
- Google Research
- UT Austin Robot Perception and Learning Lab (RPL)
- Boston Dynamics
- Agility Robotics
- Figure AI
- 1X Technologies
- Sanctuary AI
- Fourier Intelligence [GitHub] [Docs]
- AgiBot GitHub
- Atlas - Advanced humanoid robot by Boston Dynamics
- Digit - Bipedal robot by Agility Robotics
- Tesla Optimus - Tesla's humanoid robot project
- Figure 01 - General-purpose humanoid robot by Figure AI
- Phoenix - General-purpose humanoid robot by Sanctuary AI
- Fourier GR-2 - Second-generation general-purpose humanoid robot by Fourier Intelligence
- OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation
- GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy
- RSS - Robotics: Science and Systems
- ICRA - International Conference on Robotics and Automation
- CoRL - Conference on Robot Learning
- IROS - International Conference on Intelligent Robots and Systems
Please read the contribution guidelines before making a contribution.
This project is licensed under the MIT License - see the LICENSE file for details.