Operations Research Transactions, 2024, Vol. 28, Issue 3: 46-62. doi: 10.15960/j.cnki.issn.1007-6093.2024.03.003
Hongwei GAO, Binbin MENG, Jian LIU, Zhaopeng DAI*
Received:
2024-04-10
Online:
2024-09-15
Published:
2024-09-07
Contact:
Zhaopeng DAI
E-mail:dzpeng@amss.ac.cn
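Abstract:
Taking differential games and the classical pursuit-evasion problem as its main thread, this paper traces the historical development of group pursuit-evasion differential games. For large-scale group pursuit-evasion problems, it explains the application prospects of reinforcement learning techniques from the perspective of mean field games. It also proposes studying inverse pursuit-evasion differential games, an approach applicable to similar scenarios such as unmanned underwater vehicles, ground robots, and unmanned aerial vehicle swarms. Unlike other survey articles, this paper pays particular attention to the representative academic schools of Russia and the Soviet Union in the history of the field.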
Hongwei GAO, Binbin MENG, Jian LIU, Zhaopeng DAI. Group pursuit-evasion differential games[J]. Operations Research Transactions, 2024, 28(3): 46-62.