运筹学学报 ›› 2023, Vol. 27 ›› Issue (4): 106-135.doi: 10.15960/j.cnki.issn.1007-6093.2023.04.006

•   • 上一篇    下一篇

机器学习驱动的多智能体路径搜寻算法综述

王祥丰1,*(), 李文浩2   

  1. 1. 华东师范大学计算机科学与技术学院, 数学与工程应用教育部重点实验室, 上海 200062
    2. 香港中文大学(深圳) 数据科学学院, 深圳市人工智能与智能体研究院, 广东深圳 518172
  • 收稿日期:2023-05-04 出版日期:2023-12-15 发布日期:2023-12-07
  • 通讯作者: 王祥丰 E-mail:xfwang@sei.ecnu.edu.cn
  • 作者简介:王祥丰, E-mail: xfwang@sei.ecnu.edu.cn
  • 基金资助:
    国家重点研发计划(2021YFA1000300);国家重点研发计划(2021YFA1000302);国家自然科学基金(12071145);国家自然科学基金(11971216);国家自然科学基金(62072222);中国博士后科学基金(2022M723039)

Machine learning-driven multi-agent pathfinding: An overview

Xiangfeng WANG1,*(), Wenhao LI2   

  1. 1. School of Computer Science and Engineering, East China Normal University, Key Laboratory of Ministry of Education, Shanghai 200062, China
    2. School of Data Science, The Chinese University of Hong Kong, Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen 518172, Guangdong, China
  • Received:2023-05-04 Online:2023-12-15 Published:2023-12-07
  • Contact: Xiangfeng WANG E-mail:xfwang@sei.ecnu.edu.cn

摘要:

多智能体路径搜寻(Multi-agent Path Finding, MAPF)问题是多智能体系统中的核心基本问题, 被广泛应用于自动化智能仓储、自动驾驶、群体机器人等实际场景。从问题属性来看, 其关键难点在于多个智能体能够同时沿着路径行驶, 同时保证不发生碰撞, 属于NP-难组合优化问题。然而, 上述现实世界应用需要算法能够在较短的计算时间内为大量智能体搜索高质量的无碰撞路径, 更短的路径将导致更高的系统吞吐量以及更低的操作成本, 给经典MAPF运筹算法带来了极大挑战。因此, 近年来大量工作开始聚焦于使用机器学习方法赋能多智能体路径搜寻问题的研究, 以期加快求解速度、提升求解质量。本文综述内容包括三部分, 包括MAPF问题的核心概念、优化目标、基准任务, 经典MAPF算法的问题建模、核心思想、算法优劣, 并从机器学习赋能程度从低到高分别介绍一系列机器学习赋能的MAPF算法, 并给出具体的示意图和伪代码。本文还总结了机器学习驱动的多智能体路径搜寻算法目前面临的主要挑战, 并提出未来潜在研究方向, 以期可帮助领域内的研究者, 并促进机器学习方法在经典多智能体路径搜寻领域的发展。

关键词: 机器学习, 多智能体路径搜寻, 多智能体强化学习

Abstract:

The Multi-agent Path Finding (MAPF) problem is a core fundamental issue in multi-agent systems and has been widely applied in practical scenarios such as automated intelligent warehousing, autonomous driving, and swarm robotics. From the perspective of problem attributes, the key difficulty lies in enabling multiple intelligent agents to travel along paths simultaneously while ensuring no collisions occur, making it an NP-hard combinatorial optimization problem. However, the aforementioned realworld applications require algorithms to find high-quality, collision-free paths for a large number of intelligent agents within a short computation time. Shorter paths lead to higher system throughput and lower operating costs, presenting significant challenges for classical MAPF optimization algorithms. As a result, in recent years, numerous studies have begun focusing on using machine learning methods to empower MAPF research, aiming to accelerate solution speed and improve solution quality. This paper presents a comprehensive review in three parts, including the core concepts, optimization objectives, and benchmark tasks of the MAPF problem. It also covers the problem modeling, core ideas, and strengths and weaknesses of traditional MAPF algorithms. Additionally and most importantly, this paper introduces a series of machine learning-empowered MAPF algorithms with varying degrees of machine learning involvement, providing corresponding schematic diagrams and pseudocode. This paper further summarizes the major challenges currently faced by machine learning-driven MAPF algorithms and proposes potential future research directions. This is intended to assist researchers in the field and promote the development of machine learning methods in the classical MAPF domain.

Key words: machine learning, multi-agent pathfinding, multi-agent reinforcement learning

中图分类号: