Spatial Forcing: Implicit Spatial Representation Alignment for Vision-Language-Action Model Paper • 2510.12276 • Published 19 days ago • 142
What If: Understanding Motion Through Sparse Interactions Paper • 2510.12777 • Published 19 days ago • 5
Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory Paper • 2507.16713 • Published Jul 22 • 21
Trending 3D (Image to 3D) Collection • One place to keep track of all 3D demos • 40 items • Updated Aug 7 • 4
A Careful Examination of Large Behavior Models for Multitask Dexterous Manipulation Paper • 2507.05331 • Published Jul 7 • 12
SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Article • Published Jun 3 • 272
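The SmolVLA article above is backed by a pretrained base checkpoint on the Hub. A minimal loading sketch, assuming the `lerobot/smolvla_base` checkpoint and the import path used around the article's release (lerobot has reorganized its module layout between versions, so treat the path as an assumption and check the article for the current one):

```python
# pip install "lerobot[smolvla]"  # SmolVLA extras, per the article

# Import path is an assumption tied to the release-era lerobot layout;
# newer versions may expose the policy under a different module.
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Load the pretrained base policy from the Hub; the article's workflow is
# to fine-tune this checkpoint on your own teleoperation episodes.
policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")
```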
Real-is-Sim: Bridging the Sim-to-Real Gap with a Dynamic Digital Twin for Real-World Robot Policy Evaluation Paper • 2504.03597 • Published Apr 4 • 4 • 2
Theia: Distilling Diverse Vision Foundation Models for Robot Learning Paper • 2407.20179 • Published Jul 29, 2024 • 47 • 3
theaiinstitute/theia-tiny-patch16-224-cddsv Feature Extraction • 16.2M • Updated Jul 30, 2024 • 2.43k • 4
theaiinstitute/theia-base-patch16-224-cdiv Feature Extraction • 0.1B • Updated Jul 30, 2024 • 4.66k • 8
theaiinstitute/theia-small-patch16-224-cdiv Feature Extraction • 36.7M • Updated Jul 30, 2024 • 53 • 3
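The three Theia checkpoints above are feature-extraction models with custom code on the Hub, so they can plausibly be loaded through transformers' `AutoModel` with `trust_remote_code=True`. A minimal sketch; the `forward_feature` entry point and the uint8 HWC input layout are assumptions taken from the model cards, so verify them against the card of the checkpoint you pick:

```python
import torch
from transformers import AutoModel

# trust_remote_code pulls in the repo's custom model class
# (Theia is not a built-in transformers architecture).
model = AutoModel.from_pretrained(
    "theaiinstitute/theia-tiny-patch16-224-cddsv",
    trust_remote_code=True,
)
model.eval()

# Dummy 224x224 RGB image batch in HWC uint8 layout (assumed
# preprocessing; check the model card for the exact expected input).
images = torch.zeros((1, 224, 224, 3), dtype=torch.uint8)

with torch.no_grad():
    # forward_feature is the feature-extraction entry point shown on the
    # model cards (assumed here); it returns per-patch visual features.
    features = model.forward_feature(images)

print(features.shape)  # roughly (batch, num_patches, feature_dim)
```

The same call should work for any of the three sizes by swapping the checkpoint name; the tiny/small/base variants differ in parameter count (16.2M / 36.7M / 0.1B) and in which teacher combination (cddsv vs. cdiv) was distilled.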