Roundup of Work on Embodied Goal Navigation, Vision-and-Language Navigation, and PointGoal Navigation!

具身智能之心 2025-08-12 15:04


Editor: 具身智能之心

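Some readers recently asked us about work on embodied navigation, so today we have compiled the development paths and methodologies of the past few years. We recommend bookmarking this list.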

PointGoal Navigation Papers

Comparison of Model-Free and Model-Based Learning-Informed Planning for PointGoal Navigation

  • Venue/Year: CoRL, 2022
  • Paper: https://openreview.net/pdf?id=2s92OhjT4L
  • Code: https://github.com/yimengli46/bellman_point_goal
  • Project page: https://yimengli46.github.io/Projects/CoRL2022LHPWorkshop/index.html

RobustNav: Towards Benchmarking Robustness in Embodied Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2104.04112.pdf
  • Code: https://github.com/allenai/robustnav
  • Project page: https://prior.allenai.org/projects/robustnav

The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2106.04531.pdf
  • Code: https://github.com/Xiaoming-Zhao/PointNav-VO
  • Project page: https://xiaoming-zhao.github.io/projects/pointnav-vo

Differentiable SLAM-Net: Learning Particle SLAM for Visual Navigation

  • Venue/Year: CVPR, 2021
  • Paper: https://arxiv.org/pdf/2105.07593.pdf

Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments

  • Venue/Year: ICRA, 2021
  • Paper: https://arxiv.org/pdf/2009.05429.pdf

Occupancy Anticipation for Efficient Exploration and Navigation

  • Venue/Year: ECCV, 2020
  • Paper: http://vision.cs.utexas.edu/projects/occupancy_anticipation/main.pdf
  • Code: https://github.com/facebookresearch/OccupancyAnticipation
  • Project page: http://vision.cs.utexas.edu/projects/occupancy_anticipation/

Auxiliary Tasks Speed Up Learning PointGoal Navigation

  • Venue/Year: CoRL, 2020
  • Paper: https://arxiv.org/abs/2007.04561
  • Code: https://github.com/joel99/habitat-pointnav-aux

Learning to Explore using Active Neural SLAM

  • Venue/Year: ICLR, 2020
  • Paper: https://openreview.net/pdf?id=HklXn1BKDH
  • Code: https://github.com/devendrachaplot/Neural-SLAM
  • Project page: https://devendrachaplot.github.io/projects/Neural-SLAM

DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames

  • Venue/Year: ICLR, 2020
  • Paper: https://arxiv.org/abs/1911.00357
  • Code: https://github.com/facebookresearch/habitat-api/tree/master/habitat_baselines/rl/ddppo
  • Project page: https://wijmans.xyz/publication/ddppo-2019/

A Behavioral Approach to Visual Navigation with Graph Localization Networks

  • Venue/Year: RSS, 2019
  • Paper: https://arxiv.org/pdf/1903.00445.pdf
  • Code: https://github.com/kchen92/graphnav
  • Project page: https://graphnav.stanford.edu/

SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation

  • Venue/Year: ICCV, 2019
  • Paper: https://arxiv.org/pdf/1905.07512.pdf
  • Code: https://github.com/facebookresearch/splitnet

Habitat: A Platform for Embodied AI Research

  • Venue/Year: ICCV, 2019
  • Paper: https://arxiv.org/abs/1904.01201
  • Code: https://github.com/facebookresearch/habitat-api
  • Project page: https://aihabitat.org/

Cognitive Mapping and Planning for Visual Navigation

  • Venue/Year: CVPR, 2017
  • Paper: https://arxiv.org/abs/1702.03920

Audio-Visual Navigation Papers

Learning Semantic-Agnostic and Spatial-Aware Representation for Generalizable Visual-Audio Navigation

  • Venue/Year: IEEE RA-L 2023
  • Paper: https://arxiv.org/pdf/2304.10773.pdf

Pay Self-Attention to Audio-Visual Navigation

  • Venue/Year: BMVC 2022
  • Paper: https://arxiv.org/pdf/2210.01353.pdf
  • Homepage: https://yyf17.github.io/FSAAVN/index.html

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

  • Venue/Year: arXiv 2022
  • Paper: https://arxiv.org/pdf/2206.08312.pdf
  • Code: https://github.com/facebookresearch/sound-spaces
  • Homepage: https://vision.cs.utexas.edu/projects/soundspaces2

Sound Adversarial Audio-Visual Navigation

  • Venue/Year: ICLR 2022
  • Paper: https://openreview.net/pdf?id=NkZq4OEYN-
  • Code: https://github.com/yyf17/SAAVN/tree/main
  • Homepage: https://yyf17.github.io/SAAVN

Active Audio-Visual Separation of Dynamic Sound Sources

  • Venue/Year: ECCV 2022
  • Paper: https://arxiv.org/abs/2202.00850
  • Homepage: http://vision.cs.utexas.edu/projects/active-av-dynamic-separation/

Move2Hear: Active Audio-Visual Source Separation

  • Venue/Year: ICCV 2021
  • Paper: https://arxiv.org/abs/2105.07142
  • Homepage: http://vision.cs.utexas.edu/projects/move2hear/

Semantic Audio-Visual Navigation

  • Venue/Year: CVPR 2021
  • Paper: https://arxiv.org/pdf/2012.11583.pdf
  • Code: https://github.com/facebookresearch/sound-spaces/tree/main/ss_baselines/savi
  • Homepage: http://vision.cs.utexas.edu/projects/semantic_audio_visual_navigation/

Learning to Set Waypoints for Audio-Visual Navigation

  • Venue/Year: ICLR 2021
  • Paper: https://arxiv.org/pdf/2008.09622.pdf
  • Homepage: http://vision.cs.utexas.edu/projects/audio_visual_waypoints/

Look, Listen, and Act: Towards Audio-Visual Embodied Navigation

  • Venue/Year: ICRA 2020
  • Paper: https://arxiv.org/abs/1912.11684

Audio-Visual Embodied Navigation

  • Venue/Year: ECCV 2020
  • Paper: https://arxiv.org/pdf/1912.11474.pdf
  • Homepage: http://vision.cs.utexas.edu/projects/audio_visual_navigation/

ObjectGoal Navigation Papers

DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects

  • Venue/Year: arXiv 2024
  • Paper: https://arxiv.org/abs/2410.02730
  • Homepage: https://zhaowei-wang-nlp.github.io/divscene-project-page/

MOPA: Modular Object Navigation with PointGoal Agents

  • Venue/Year: WACV 2024
  • Paper: https://openaccess.thecvf.com/content/WACV2024/html/Raychaudhuri_MOPA_Modular_Object_Navigation_With_PointGoal_Agents_WACV_2024_paper.html
  • Homepage: https://youtu.be/Jcspov0UpsA

Self-Supervised Object Goal Navigation with In-Situ Finetuning

  • Venue/Year: IROS 2023
  • Paper: https://arxiv.org/abs/2212.05923
  • Homepage: https://www.youtube.com/watch?v=LXsZst5ZUpU

Good Time to Ask: A Learning Framework for Asking for Help in Embodied Visual Navigation

  • Venue/Year: arXiv 2022
  • Paper: https://arxiv.org/abs/2206.10606
  • Code: https://github.com/jennyzzt/good_time_to_ask

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

  • Venue/Year: arXiv 2022
  • Paper: https://arxiv.org/pdf/2206.06994.pdf
  • Homepage: https://procthor.allenai.org/

THDA: Treasure Hunt Data Augmentation for Semantic Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://openaccess.thecvf.com/content/ICCV2021/papers/Maksymets_THDA_Treasure_Hunt_Data_Augmentation_for_Semantic_Navigation_ICCV_2021_paper.pdf

Hierarchical Object-to-Zone Graph for Object Navigation

  • Venue/Year: ICCV 2021
  • Paper: https://arxiv.org/abs/2109.02066
  • Code: https://github.com/sx-zhang/HOZ.git
  • Homepage: https://drive.google.com/file/d/1UtTcFRhFZLkqgalKom6_9GpQmsJfXAZC/view

Auxiliary Tasks and Exploration Enable ObjectGoal Navigation

  • Venue/Year: ICCV 2021
  • Paper: https://arxiv.org/pdf/2108.11550.pdf
  • Code: https://github.com/joel99/objectnav
  • Homepage: https://joel99.github.io/objectnav/

Visual Navigation With Spatial Attention

  • Venue/Year: CVPR 2021
  • Paper: https://arxiv.org/pdf/2104.09807.pdf

VTNet: Visual Transformer Network for Object Goal Navigation

  • Venue/Year: ICLR 2021
  • Paper: https://arxiv.org/pdf/2105.09447.pdf

Learning hierarchical relationships for object-goal navigation

  • Venue/Year: CoRL, 2020
  • Paper: https://arxiv.org/abs/2003.06749

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

  • Venue/Year: NeurIPS 2020
  • Paper: https://arxiv.org/abs/2012.03912
  • Code: https://github.com/saimwani/multiON
  • Homepage: https://shivanshpatel35.github.io/multi-ON/

ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects

  • Venue/Year: arXiv, 2020
  • Paper: https://arxiv.org/abs/2006.13171

Semantic Visual Navigation by Watching YouTube Videos

  • Venue/Year: arXiv 2020
  • Paper: https://arxiv.org/pdf/2006.10034.pdf
  • Homepage: https://matthewchang.github.io/value-learning-from-videos/

Learning Object Relation Graph and Tentative Policy for Visual Navigation

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/2007.11018

Object Goal Navigation using Goal-Oriented Semantic Exploration

  • Venue/Year: NeurIPS 2020
  • Paper: https://arxiv.org/pdf/2007.00643.pdf
  • Homepage: https://devendrachaplot.github.io/projects/semantic-exploration

Simultaneous Mapping and Target Driven Navigation

  • Venue/Year: arXiv, 2019
  • Paper: https://arxiv.org/abs/1911.07980

Situational Fusion of Visual Representation for Visual Navigation

  • Venue/Year: ICCV, 2019
  • Paper: https://arxiv.org/abs/1908.09073

Bayesian Relational Memory for Semantic Visual Navigation

  • Venue/Year: ICCV 2019
  • Paper: https://arxiv.org/abs/1909.04306
  • Code: https://github.com/jxwuyi/HouseNavAgent

Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

  • Venue/Year: CVPR 2019
  • Paper: https://arxiv.org/abs/1812.00971
  • Code: https://github.com/allenai/savn
  • Homepage: https://prior.allenai.org/projects/savn

Visual Representations for Semantic Target Driven Navigation

  • Venue/Year: ICRA 2019
  • Paper: https://arxiv.org/pdf/1805.06066.pdf
  • Code: https://github.com/arsalan-mousavian/Navigation

Visual Semantic Navigation using Scene Priors

  • Venue/Year: ICLR, 2019
  • Paper: https://arxiv.org/abs/1810.06543

Cognitive Mapping and Planning for Visual Navigation

  • Venue/Year: CVPR 2017
  • Paper: https://arxiv.org/abs/1702.03920

ImageGoal Navigation Papers

Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

  • Venue/Year: CVPR 2024
  • Paper: https://xiaohanlei.github.io/projects/IEVE/IEVE.pdf
  • Homepage: https://xiaohanlei.github.io/projects/IEVE/

Renderable Neural Radiance Map for Visual Navigation

  • Venue/Year: CVPR 2023
  • Paper: https://openaccess.thecvf.com/content/CVPR2023/html/Kwon_Renderable_Neural_Radiance_Map_for_Visual_Navigation_CVPR_2023_paper.html
  • Homepage: https://rllab-snu.github.io/projects/RNR-Map/

Last-Mile Embodied Visual Navigation

  • Venue/Year: CoRL 2022
  • Paper: https://arxiv.org/abs/2211.11746
  • Code: https://github.com/Jbwasse2/SLING
  • Homepage: https://jbwasse2.github.io/portfolio/SLING/

Topological Semantic Graph Memory for Image-Goal Navigation

  • Venue/Year: CoRL 2022
  • Paper: https://openreview.net/pdf?id=xjTUxBfIzE
  • Code: https://github.com/rllab-snu/TopologicalSemanticGraphMemory
  • Homepage: https://github.com/bareblackfoot/Topological-Semantic-Graph-Memory

No RL, No Simulation: Learning to Navigate without Navigating

  • Venue/Year: NeurIPS 2021
  • Paper: https://arxiv.org/pdf/2110.09470.pdf

Visual Graph Memory with Unsupervised Representation for Visual Navigation

  • Venue/Year: ICCV 2021
  • Paper: https://openaccess.thecvf.com/content/ICCV2021/papers/Kwon_Visual_Graph_Memory_With_Unsupervised_Representation_for_Visual_Navigation_ICCV_2021_paper.pdf
  • Code: https://github.com/rllab-snu/Visual-Graph-Memory
  • Homepage: https://rllab-snu.github.io/projects/vgm/doc.html

Learning View and Target Invariant Visual Servoing for Navigation

  • Venue/Year: ICRA 2020
  • Paper: https://arxiv.org/pdf/2003.02327.pdf
  • Code: https://github.com/GMU-vision-robotics/View-Invariant-Visual-Servoing-for-Navigation
  • Homepage: https://yimengli46.github.io/Projects/ICRA2020/index.html

Neural Topological SLAM for Visual Navigation

  • Venue/Year: CVPR 2020
  • Paper: https://arxiv.org/pdf/2005.12256.pdf
  • Homepage: https://devendrachaplot.github.io/projects/Neural-Topological-SLAM

Semi-Parametric Topological Memory for Navigation

  • Venue/Year: ICLR 2018
  • Paper: https://arxiv.org/pdf/1803.00653.pdf
  • Code: https://github.com/nsavinov/SPTM
  • Homepage: https://sites.google.com/view/SPTM

Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning

  • Venue/Year: ICRA 2017
  • Paper: https://arxiv.org/abs/1609.05143
  • Homepage: https://prior.allenai.org/projects/target-driven-visual-navigation

Vision-and-Language Navigation Papers

SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments

  • Venue/Year: ICPR, 2022
  • Paper: https://arxiv.org/abs/2108.11945
  • Homepage: https://zubair-irshad.github.io/projects/SASRA.html
  • Video: https://www.youtube.com/watch?v=DsziGtgaJC0

Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments

  • Venue/Year: EMNLP, 2021
  • Paper: https://aclanthology.org/2021.emnlp-main.328
  • Code: https://github.com/3dlg-hcvc/LAW-VLNCE
  • Homepage: https://3dlg-hcvc.github.io/LAW-VLNCE
  • Video: https://www.youtube.com/watch?v=7dRymdCIAvo

History Aware Multimodal Transformer for Vision-and-Language Navigation

  • Venue/Year: NeurIPS, 2021
  • Paper: https://arxiv.org/pdf/2110.13309.pdf
  • Code: https://github.com/cshizhe/VLN-HAMT
  • Homepage: https://cshizhe.github.io/projects/vln_hamt.html

SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation

  • Venue/Year: NeurIPS, 2021
  • Paper: https://arxiv.org/pdf/2110.14143.pdf

Curriculum Learning for Vision-and-Language Navigation

  • Venue/Year: NeurIPS, 2021
  • Paper: https://arxiv.org/pdf/2111.07228.pdf
  • Code: https://github.com/IMNearth/Curriculum-Learning-For-VLN

Airbert: In-domain Pretraining for Vision-and-Language Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2108.09105.pdf
  • Code: https://github.com/airbert-vln/airbert
  • Homepage: https://airbert-vln.github.io/

Waypoint Models for Instruction-guided Navigation in Continuous Environments

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2110.02207.pdf
  • Code: https://github.com/jacobkrantz/VLN-CE
  • Homepage: https://jacobkrantz.github.io/waypoint-vlnce/

Vision-Language Navigation with Random Environmental Mixup

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2106.07876.pdf
  • Code: https://github.com/LCFractal/VLNREM

Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Self-Motivated_Communication_Agent_for_Real-World_Vision-Dialog_Navigation_ICCV_2021_paper.pdf

Episodic Transformer for Vision-and-Language Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2105.06453.pdf
  • Code: https://github.com/alexpashevich/E.T.
  • Homepage: https://sites.google.com/view/episodictransformer

Pathdreamer: A World Model for Indoor Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2105.08756.pdf
  • Code: https://github.com/google-research/pathdreamer
  • Homepage: https://google-research.github.io/pathdreamer/

The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation

  • Venue/Year: ICCV, 2021
  • Paper: https://arxiv.org/pdf/2104.04167.pdf
  • Code: https://github.com/YuankaiQi/ORIST

Neighbor-view Enhanced Model for Vision and Language Navigation

  • Venue/Year: ACM MM, 2021
  • Paper: https://arxiv.org/pdf/2107.07201.pdf
  • Code: https://github.com/MarSaKi/NvEM

Scene-Intuitive Agent for Remote Embodied Visual Grounding

  • Venue/Year: CVPR, 2021
  • Paper: https://arxiv.org/pdf/2103.12944.pdf

Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression

  • Venue/Year: CVPR, 2021
  • Paper: https://openaccess.thecvf.com/content/CVPR2021/papers/Gao_Room-and-Object_Aware_Knowledge_Reasoning_for_Remote_Embodied_Referring_Expression_CVPR_2021_paper.pdf

SOON: Scenario Oriented Object Navigation With Graph-Based Exploration

  • Venue/Year: CVPR, 2021
  • Paper: https://arxiv.org/pdf/2103.17138.pdf

Topological Planning With Transformers for Vision-and-Language Navigation

  • Venue/Year: CVPR, 2021
  • Paper: https://arxiv.org/pdf/2012.05292.pdf

Structured Scene Memory for Vision-Language Navigation

  • Venue/Year: CVPR, 2021
  • Paper: https://arxiv.org/pdf/2103.03454.pdf
  • Code: https://github.com/HanqingWangAI/SSM-VLN

VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation

  • Venue/Year: CVPR, 2021
  • Paper: https://arxiv.org/abs/2011.13922
  • Code: https://github.com/YicongHong/Recurrent-VLN-BERT

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation

  • Venue/Year: ICRA, 2021
  • Paper: https://arxiv.org/abs/2104.10674
  • Code: https://github.com/GT-RIPL/robo-vln
  • Homepage: https://zubair-irshad.github.io/projects/robo-vln.html
  • Video: https://www.youtube.com/watch?v=y16x9n_zP_4

Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

  • Venue/Year: ICLR, 2021
  • Paper: https://arxiv.org/pdf/2009.07783.pdf

Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning

  • Venue/Year: TCSVT, 2020
  • Paper: https://arxiv.org/pdf/2011.10972.pdf

Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation

  • Venue/Year: NeurIPS, 2020
  • Paper: https://proceedings.neurips.cc/paper/2020/file/eddb904a6db773755d2857aacadb1cb0-Paper.pdf

Counterfactual Vision-and-Language Navigation: Unravelling the Unseen

  • Venue/Year: NeurIPS, 2020
  • Paper: https://proceedings.neurips.cc/paper/2020/file/39016cfe079db1bfb359ca72fcba3fd8-Paper.pdf

Language and Visual Entity Relationship Graph for Agent Navigation

  • Venue/Year: NeurIPS, 2020
  • Paper: https://arxiv.org/abs/2010.09304
  • Code: https://github.com/YicongHong/Entity-Graph-VLN

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/2003.00443

Active Visual Information Gathering for Vision-Language Navigation

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/2007.08037
  • Code: https://github.com/HanqingWangAI/Active_VLN

Soft Expert Reward Learning for Vision-and-Language Navigation

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/2007.10835

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/2004.14973

Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/1911.07308

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments

  • Venue/Year: ECCV, 2020
  • Paper: https://arxiv.org/abs/2004.02857
  • Code: https://github.com/jacobkrantz/VLN-CE
  • Homepage: https://jacobkrantz.github.io/vlnce

Sub-Instruction Aware Vision-and-Language Navigation

  • Venue/Year: arXiv, 2020
  • Paper: https://arxiv.org/abs/2004.02707

Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation

  • Venue/Year: arXiv, 2020
  • Paper: https://arxiv.org/abs/2003.14269

Vision-Dialog Navigation by Exploring Cross-modal Memory

  • Venue/Year: CVPR, 2020
  • Paper: https://arxiv.org/abs/2003.06745
  • Code: https://github.com/yeezhu/CMN.pytorch

Multi-View Learning for Vision-and-Language Navigation

  • Venue/Year: arXiv, 2020
  • Paper: https://arxiv.org/abs/2003.00857

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

  • Venue/Year: CVPR, 2020
  • Paper: https://arxiv.org/abs/2002.10638
  • Code: https://github.com/weituo12321/PREVALENT

Just Ask: An Interactive Learning Framework for Vision and Language Navigation

  • Venue/Year: AAAI, 2020
  • Paper: https://arxiv.org/abs/1912.00915

Perceive, Transform, and Act: Multi-Modal Attention Networks for Vision-and-Language Navigation

  • Venue/Year: arXiv, 2019
  • Paper: https://arxiv.org/abs/1911.12377
  • Code: https://github.com/aimagelab/perceive-transform-and-act

Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks

  • Venue/Year: CVPR, 2020
  • Paper: https://arxiv.org/abs/1911.07883

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

  • Venue/Year: CVPR, 2020
  • Paper: https://arxiv.org/abs/1911.07450

Transferable Representation Learning in Vision-and-Language Navigation

  • Venue/Year: ICCV, 2019
  • Paper: https://arxiv.org/abs/1908.03409

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters

  • Venue/Year: BMVC, 2019
  • Paper: https://arxiv.org/abs/1907.02985
  • Code: https://github.com/aimagelab/DynamicConv-agent

Chasing Ghosts: Instruction Following as Bayesian State Tracking

  • Venue/Year: NeurIPS, 2019
  • Paper: https://arxiv.org/abs/1907.02022
  • Code: https://github.com/batra-mlp-lab/vln-chasing-ghosts
  • Video: https://www.youtube.com/watch?v=eoGbescCNP0

Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning

  • Venue/Year: EMNLP, 2019
  • Paper: https://arxiv.org/abs/1909.01871
  • Code: https://github.com/khanhptnk/hanna
  • Video: https://youtu.be/18P94aaaLKg

Vision-based Navigation with Language-based Assistance via Imitation Learning with Indirect Intervention

  • Venue/Year: CVPR, 2019
  • Paper: https://arxiv.org/abs/1812.04155
  • Code: https://github.com/debadeepta/vnla
  • Video: https://youtu.be/Vp6C29qTKQ0

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

  • Venue/Year: CVPR, 2019
  • Paper: http://arxiv.org/abs/1903.02547
  • Code: https://github.com/Kelym/FAST
  • Video: https://www.youtube.com/watch?v=AD9TNohXoPA

TOUCHDOWN: Natural Language Navigation and Spatial Reasoning in Visual Street Environments

  • Venue/Year: CVPR, 2019
  • Paper: https://arxiv.org/pdf/1811.12354.pdf
  • Code: https://github.com/lil-lab/touchdown

The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

  • Venue/Year: CVPR, 2019
  • Paper: https://arxiv.org/abs/1903.01602
  • Code: https://github.com/chihyaoma/regretful-agent
  • Homepage: https://chihyaoma.github.io/project/2019/02/25/regretful.html

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

  • Venue/Year: ICLR, 2019
  • Paper: https://arxiv.org/abs/1901.03035
  • Code: https://github.com/chihyaoma/selfmonitoring-agent
  • Homepage: https://chihyaoma.github.io/project/2018/09/27/selfmonitoring.html

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

  • Venue/Year: CVPR, 2019
  • Paper: https://arxiv.org/abs/1811.10092

Speaker-Follower Models for Vision-and-Language Navigation

  • Venue/Year: NeurIPS, 2018
  • Paper: https://arxiv.org/abs/1806.02724
  • Code: https://github.com/ronghanghu/speaker_follower
  • Homepage: http://ronghanghu.com/speaker_follower/

Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction

  • Venue/Year: EMNLP, 2018
  • Paper: https://arxiv.org/abs/1809.00786

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

  • Venue/Year: ECCV, 2018
  • Paper: https://arxiv.org/abs/1803.07729

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments

  • Venue/Year: CVPR, 2018
  • Paper: https://arxiv.org/abs/1711.07280
  • Code: https://github.com/peteanderson80/Matterport3DSimulator
  • Homepage: https://bringmeaspoon.org/
