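Introduction
This is a navigation guide to open-source VLA (Vision-Language-Action) resources for hands-on practitioners. It is organized along four axes: general autoregressive/Transformer models, diffusion-based action models, frameworks that improve training and inference, and strong baselines plus emerging trends. Priority goes to projects with reproducible code, directly downloadable weights, a clear task paradigm, and a deployment entry point.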

Quote

"The world is its own best model." (Rodney Brooks)

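General VLA (Transformer / autoregressive)

OpenVLA: a simple, extensible framework for training and fine-tuning general-purpose VLAs, with several pretrained checkpoints.
URL: https://github.com/openvla/openvla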

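OpenVLA-OFT: builds on OpenVLA and uses bidirectional attention to fill in all action tokens in a single pass, which substantially speeds up inference.
URL: https://openvla.github.io/oft (openvla-oft.github.io)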

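SpatialVLA: emphasizes 3D spatial representation (Ego3D position encoding plus adaptive action grids); several official weights have been released.
URL: https://github.com/SpatialVLA/SpatialVLA (code) / https://huggingface.co/IPEC-COMMUNITY/spatialvla-4b-224-pt (weights)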

NORA: a lightweight VLA built on the Qwen2.5-VL-3B backbone, with open-source training code and model checkpoints.
URL: https://github.com/declare-lab/nora (code) / https://huggingface.co/declare-lab/nora (weights) / https://declare-lab.github.io/nora (project page)

InternVLA-M1: a spatially grounded generalist policy open-sourced by Shanghai AI Laboratory, with two heads (language and action) trained jointly.
URL: https://github.com/InternRobotics/InternVLA-M1

RVT / RVT-2: multi-view Transformers that map a language instruction plus visual input to poses/actions; the code is officially maintained.
URL: https://github.com/NVlabs/RVT (code) / https://robotic-view-transformer-2.github.io/ (RVT-2)

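Octo (PyTorch port): a generalist robot policy trained on the multi-dataset Open X-Embodiment collection; this community reimplementation is easy to get started with.
URL: https://github.com/embodiedai/octo-pytorch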

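TinyVLA: a family of compact VLAs aiming at faster inference and better data efficiency (official code, paper, and evaluations).
URL: https://github.com/JayceWen/tinyvla (code) / https://tiny-vla.github.io/ (project page)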
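
The speed-up described for OpenVLA-OFT comes from replacing token-by-token decoding with a single pass over placeholder action tokens under bidirectional attention. The toy sketch below only illustrates that difference in forward-pass count. `ToyBackbone`, `ACTION_DIM`, and the averaging arithmetic are hypothetical stand-ins, not OpenVLA or OpenVLA-OFT code.

```python
from typing import List

ACTION_DIM = 7  # assumption: 6-DoF end-effector delta plus gripper


def causal_mask(n: int) -> List[List[bool]]:
    """Token i may attend only to tokens 0..i (autoregressive decoding)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> List[List[bool]]:
    """Every token may attend to every other token (parallel decoding)."""
    return [[True] * n for _ in range(n)]


class ToyBackbone:
    """Hypothetical stand-in for a VLA transformer; it mainly counts passes."""

    def __init__(self) -> None:
        self.forward_passes = 0

    def forward(self, tokens: List[float], mask: List[List[bool]],
                num_outputs: int) -> List[float]:
        self.forward_passes += 1
        # Placeholder arithmetic instead of a real network: each output is
        # the mean of the tokens that its position is allowed to attend to.
        outputs = []
        for row in mask[-num_outputs:]:
            visible = [t for t, ok in zip(tokens, row) if ok]
            outputs.append(sum(visible) / len(visible))
        return outputs


def decode_autoregressive(model: ToyBackbone, prompt: List[float]) -> List[float]:
    """One forward pass per action dimension; each output is fed back in."""
    tokens = list(prompt)
    actions = []
    for _ in range(ACTION_DIM):
        nxt = model.forward(tokens, causal_mask(len(tokens)), num_outputs=1)[0]
        actions.append(nxt)
        tokens.append(nxt)
    return actions


def decode_parallel(model: ToyBackbone, prompt: List[float]) -> List[float]:
    """Append empty action placeholders and fill them all in one pass."""
    tokens = list(prompt) + [0.0] * ACTION_DIM
    return model.forward(tokens, bidirectional_mask(len(tokens)),
                         num_outputs=ACTION_DIM)


prompt = [0.2, 0.4, 0.6]  # made-up "vision + language" token values
ar_model, par_model = ToyBackbone(), ToyBackbone()
ar_actions = decode_autoregressive(ar_model, prompt)
par_actions = decode_parallel(par_model, prompt)
print(ar_model.forward_passes, par_model.forward_passes)  # prints: 7 1
```

In a real model the per-pass cost dominates latency, so cutting the number of passes from one per action token to one per action (or action chunk) is where the gain would come from.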

Diffusion / generative action models

RDT-1B (Robotics Diffusion Transformer): a general-purpose diffusion policy at the 1-billion-parameter scale; official weights are on Hugging Face.
URL: https://huggingface.co/intel/rdt-1b

MDT (Multimodal Diffusion Transformer): a diffusion policy framework that supports both language and image goals; paper included.
URL: https://github.com/LeapLabTHU/MDT-policy / paper: https://arxiv.org/abs/2407.05996

DiT-Policy (Diffusion Transformer Policy): proposes the DiT-Block Policy, designed to be stable and scalable; example code included.
URL: https://github.com/dit-policy/dit-policy / https://www.dit-policy.com/

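DexVLA: a VLA for dexterous hands that combines vision, language, and tactile/action signals, with scripts for several grasping and rotation tasks.
URL: https://github.com/hanwenzhang123/DexVLA / project page: https://dexvla.github.io/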

DP3 (3D Diffusion Policy): combines 3D representations with a diffusion policy and generalizes across a range of manipulation tasks.
URL: https://github.com/YanjieZe/3D-Diffusion-Policy

VPP (Video Prediction Policy): conditions a generalist policy on predictive features from a video diffusion model; code and paper available.
URL: https://github.com/roboterax/video-prediction-policy / project page: https://video-prediction-policy.github.io/

Seer (language-guided video prediction): text-conditioned video prediction that can serve as a front-end capability for policy learning.
URL: https://seervideodiffusion.github.io/ / paper: https://openreview.net/forum?id=qHGgNyQk31
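
The diffusion policies in this group share one core mechanic: an action chunk starts as Gaussian noise and is refined by repeated denoising steps. Below is a minimal DDPM-style reverse loop in plain Python, offered as a sketch under stated assumptions rather than any listed project's implementation. The `oracle_noise` function is a hypothetical perfect noise predictor that already knows the clean actions; in a real policy a trained network conditioned on images and language takes its place. The schedule values and `T = 50` are likewise assumptions.

```python
import math
import random
from typing import List

T = 50  # assumption: number of diffusion steps
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]  # linear schedule
alphas = [1.0 - b for b in betas]
alpha_bars: List[float] = []
running = 1.0
for a in alphas:
    running *= a
    alpha_bars.append(running)  # cumulative product of alphas


def oracle_noise(x_t: List[float], t: int, target: List[float]) -> List[float]:
    """Hypothetical perfect noise predictor (stands in for a trained network).

    Inverts x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps.
    """
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [(x - signal * y) / noise for x, y in zip(x_t, target)]


def sample_action_chunk(target: List[float], rng: random.Random) -> List[float]:
    """Run the DDPM reverse process from pure noise down to step 0."""
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for t in reversed(range(T)):
        eps = oracle_noise(x, t, target)
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * e) / math.sqrt(alphas[t]) for xi, e in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])  # a common choice for the step variance
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean  # no noise is added at the final step
    return x


# Made-up clean action chunk: 4 timesteps x 2 action dims, flattened.
clean_actions = [0.1, -0.2, 0.15, -0.1, 0.2, 0.0, 0.25, 0.1]
denoised = sample_action_chunk(clean_actions, random.Random(0))
max_error = max(abs(a - b) for a, b in zip(denoised, clean_actions))
```

With a perfect predictor the loop recovers the clean chunk up to floating-point error; with a learned predictor the same loop yields a sample from the policy's action distribution.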

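Training/inference frameworks and capability add-ons (improving on existing VLAs)

RoboVLMs: a lightweight "any VLM to VLA" framework that integrates mainstream VLMs in under 30 lines of code; includes real-robot and SimplerEnv experiments.
URL: https://github.com/Robot-VLAs/RoboVLMs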

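TraceVLA: uses visual trace prompting to strengthen a VLA's spatio-temporal awareness; an official fine-tuning implementation on top of OpenVLA is provided.
URL: https://tracevla.github.io/ (project) / https://github.com/FrankZheng2022/tracevla (implementation)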

ECoT (Embodied Chain-of-Thought): explicitly generates embodied reasoning steps before predicting actions, to improve interpretability and performance.
URL: https://github.com/MichalZawalski/embodied-CoT / paper: https://openreview.net/forum?id=S70MgnIA0v

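RoboMonkey: a deployment-oriented framework that combines test-time sampling with a VLM verifier, reported to markedly raise out-of-distribution (OOD) success rates; ships with a serving engine.
URL: https://robomonkey-vla.github.io/ (includes Code/Models links) / code: https://github.com/robomonkey-vla/RoboMonkey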

OpenHelix (dual-system VLA): an open-source dual-system VLA model and code, accompanied by a short survey and empirical analysis.
URL: https://github.com/OpenHelix-robot/OpenHelix
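
The "test-time sampling plus verifier" idea behind RoboMonkey can be reduced to a best-of-N loop: draw several candidate actions, score each with a verifier, and execute the top one. The sketch below is a deliberately simplified illustration, not RoboMonkey's actual pipeline. `make_sampler`, `toy_verifier`, and `GOAL` are hypothetical stand-ins for the VLA sampler and the VLM-based verifier.

```python
import random
from typing import Callable, List

Action = List[float]


def best_of_n(sample_action: Callable[[], Action],
              score: Callable[[Action], float],
              n: int) -> Action:
    """Draw n candidate actions and return the one the verifier scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample_action() for _ in range(n)]
    return max(candidates, key=score)


GOAL = [0.3, -0.1]  # made-up target the toy verifier prefers


def make_sampler(seed: int) -> Callable[[], Action]:
    """Hypothetical stochastic policy: Gaussian samples around zero."""
    rng = random.Random(seed)

    def sample() -> Action:
        return [rng.gauss(0.0, 0.5) for _ in GOAL]

    return sample


def toy_verifier(action: Action) -> float:
    """Hypothetical verifier: higher score means closer to GOAL."""
    return -sum((a - g) ** 2 for a, g in zip(action, GOAL))


# Same seed, so the single sample is also the first of the 16 candidates.
single = best_of_n(make_sampler(0), toy_verifier, n=1)
best = best_of_n(make_sampler(0), toy_verifier, n=16)
```

The trade-off is latency: every extra candidate costs one more policy sample and one more verifier call, which is presumably why a framework of this kind comes with its own serving engine.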

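Classic language-conditioned / multimodal policies (strong baselines, often used as comparisons for VLAs)

PerAct (Perceiver-Actor): a multi-task imitation learning policy that combines language conditioning with a 3D voxel representation (real-robot and simulation).
URL: https://github.com/peract/peract / project: https://peract.github.io/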

HULC: a hierarchical, general-purpose language-conditioned policy and one of the strong open-source baselines on the CALVIN benchmark.
URL: https://github.com/lukashermann/hulc

RoboFlamingo: adapts OpenFlamingo to robotics in a VLM-to-action paradigm, with strong performance on CALVIN.
URL: https://github.com/RoboFlamingo/RoboFlamingo / project: https://roboflamingo.github.io/

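VIMA (with VIMA-Bench): trains a generalist policy under a "multimodal prompt" paradigm; the official benchmark and model implementation are both available.
URL: https://github.com/vimalabs/VIMA / https://vimalabs.github.io/ / benchmark: https://github.com/vimalabs/VIMABench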

Language-Policies (NeurIPS '20): an early but classic implementation of language-conditioned imitation learning.
URL: https://github.com/ir-lab/LanguagePolicies
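
The "3D voxel representation" mentioned for PerAct means the scene is discretized into a regular grid before the policy sees it. A minimal sketch of that discretization follows, assuming an axis-aligned workspace and a plain occupancy set. The function name, bounds, and resolution are illustrative choices, and real voxel pipelines typically also store per-voxel features such as color.

```python
from typing import Iterable, Sequence, Set, Tuple

Voxel = Tuple[int, ...]


def voxelize(points: Iterable[Sequence[float]],
             bounds_min: Sequence[float],
             bounds_max: Sequence[float],
             resolution: int) -> Set[Voxel]:
    """Map 3D points to the set of occupied voxel indices.

    Points outside [bounds_min, bounds_max) on any axis are dropped.
    """
    occupied: Set[Voxel] = set()
    for point in points:
        index = []
        for value, low, high in zip(point, bounds_min, bounds_max):
            if not low <= value < high:
                break  # outside the workspace on this axis
            index.append(int((value - low) / (high - low) * resolution))
        else:
            occupied.add(tuple(index))
    return occupied


# Made-up point cloud in a unit-cube workspace with a 10 x 10 x 10 grid.
cloud = [
    (0.05, 0.05, 0.05),
    (0.06, 0.04, 0.05),  # falls in the same voxel as the first point
    (0.95, 0.95, 0.95),
    (1.50, 0.00, 0.00),  # outside the workspace, dropped
]
grid = voxelize(cloud, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), resolution=10)
print(sorted(grid))  # prints: [(0, 0, 0), (9, 9, 9)]
```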

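Domain- and embodiment-specific projects and emerging trends

DexVLA (dexterous hands): focused on dexterous manipulation, covering challenging tasks such as rotation and insertion (see the diffusion group).
URL: https://github.com/hanwenzhang123/DexVLA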

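RVT-2 (high-precision manipulation / few-shot): focuses on precise poses and few-shot generalization, with an accompanying project page and paper.
URL: https://robotic-view-transformer-2.github.io/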

GR00T N1 / N1.5 (humanoid / cross-embodiment): NVIDIA's open-source humanoid generalist policy with an evaluation and simulation toolchain; models are available on Hugging Face.
URL: https://github.com/NVIDIA/Isaac-GR00T / https://huggingface.co/nvidia/GR00T-N1-2B / press release: https://nvidianews.nvidia.com/news/nvidia-isaac-gr00t-n1-open-humanoid-robot-foundation-model-simulation-frameworks

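TinyVLA family (domain adaptation, mixed sources): beyond the official release, there are also task- or environment-specific weights (for example, a head adapted to MetaWorld).
Example: https://huggingface.co/hz1919810/TinyVLA-droid_diffusion_metaworld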

3D-CAVLA / GeoVLA / PointVLA / BridgeVLA (3D-enhanced): inject point-cloud or depth geometry into a VLA to improve spatial reasoning.
URL: https://3d-cavla.github.io/ / https://linsun449.github.io/GeoVLA/ / https://arxiv.org/abs/2503.07511 / https://huggingface.co/papers/2506.07961

OpenDriveVLA (autonomous driving): carries the VLA approach over to end-to-end driving (the repository is still being released incrementally).
URL: https://github.com/DriveVLA/OpenDriveVLA

Additional references (terminology / surveys / lists)

Awesome-VLA / LLM-Robotics collection repositories: a quick way to survey the landscape and find more projects and paper lineages.
URL: https://github.com/Orlando-CS/Awesome-VLA / https://github.com/GT-RIPL/Awesome-LLM-Robotics / https://github.com/jonyzhang2023/awesome-embodied-vla-va-vln

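Summary and recommendations

If you want to get started quickly and reproduce strong baselines: start with OpenVLA / SpatialVLA / NORA (their training scripts, inference entry points, and data formats are all well organized).

If you care about inference speed or deployment: look at OpenVLA-OFT, TinyVLA, and RoboMonkey.

If you work on dexterous hands: start from DexVLA, then add RDT-1B / MDT / DiT-Policy for diffusion-style comparisons.

If your data or tasks have strong 3D requirements: try RVT-2 or the 3D-oriented line (PointVLA / GeoVLA / 3D-CAVLA).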