Operations Research Transactions, 2023, Vol. 27, Issue (2): 49-62. doi: 10.15960/j.cnki.issn.1007-6093.2023.02.003
Yun HUA, Xiangfeng WANG*, Bo JIN

Received: 2022-06-12
Online: 2023-06-15
Published: 2023-06-13
Contact: Xiangfeng WANG (E-mail: xfwang@cs.ecnu.edu.cn)
Yun HUA, Xiangfeng WANG, Bo JIN. Multi-agent deep reinforcement learning-based urban traffic signal management[J]. Operations Research Transactions, 2023, 27(2): 49-62.
[1] Zhou D, Tang M, Yang X. A deep reinforcement learning traffic signal control method combining state prediction[J]. Application Research of Computers, 2022, 39(8): 2311-2315. (in Chinese)
[2] Lin Y, Wang P, Ma M. Intelligent transportation system (ITS): Concept, challenge and opportunity[C]//International Conference on Big Data Security on Cloud, 2017.
[3] Dion F, Rakha H, Kang Y. Comparison of delay estimates at under-saturated and over-saturated pre-timed signalized intersections[J]. Transportation Research Part B: Methodological, 2004, 38(2): 99-122. doi: 10.1016/S0191-2615(03)00003-1
[4] Porche I, Lafortune S. Adaptive look-ahead optimization of traffic signals[J]. Journal of Intelligent Transportation Systems, 1999, 4(3-4): 209-254. doi: 10.1080/10248079908903749
[5] Wei H, Zheng G, Yao H, Li Z. IntelliLight: A reinforcement learning approach for intelligent traffic light control[C]//SIGKDD, 2018.
[6] Li W, Chen H, Jin B, et al. Multi-agent path finding with prioritized communication learning[C]//ICRA, 2022.
[7] Kober J, Bagnell J, Peters J. Reinforcement learning in robotics: A survey[J]. The International Journal of Robotics Research, 2013, 32(11): 1238-1274. doi: 10.1177/0278364913495721
[8] Lample G, Chaplot D. Playing FPS games with deep reinforcement learning[C]//AAAI, 2017.
[9] Palanisamy P. Multi-agent connected autonomous driving using deep reinforcement learning[C]//IJCNN, 2020.
[10] Abdoos M, Mozayani N, Bazzan A. Traffic light control in non-stationary environments based on multi-agent Q-learning[C]//ITSC, 2011.
[11] Balaji P, German X, Srinivasan D. Urban traffic signal control using reinforcement learning agents[J]. IET Intelligent Transport Systems, 2010, 4(3): 177-188. doi: 10.1049/iet-its.2009.0096
[12] Chu T, Wang J, Codecà L, et al. Multi-agent deep reinforcement learning for large-scale traffic signal control[J]. IEEE Transactions on Intelligent Transportation Systems, 2019, 21(3): 1086-1095.
[13] Mannion P, Duggan J, Howley E. An experimental review of reinforcement learning algorithms for adaptive traffic signal control[M]//Autonomic Road Transport Support Systems, 2016: 47-66.
[14] Prashanth L, Bhatnagar S. Reinforcement learning with average cost for adaptive control of traffic lights at intersections[C]//ITSC, 2011.
[15] Van der Pol E, Oliehoek F. Coordinated deep reinforcement learners for traffic light control[C]//NeurIPS, 2016.
[16] Xiong Y, Zheng G, Xu K, et al. Learning traffic signal control from demonstrations[C]//CIKM, 2019.
[17] Zang X, Yao H, Zheng G, et al. MetaLight: Value-based meta-reinforcement learning for traffic signal control[C]//AAAI, 2020.
[18] Brys T, Pham T, Taylor M. Distributed learning and multi-objectivity in traffic light control[J]. Connection Science, 2014, 26(1): 65-83. doi: 10.1080/09540091.2014.885282
[19] Nishi T, Otaki K, Hayakawa K, et al. Traffic signal control based on reinforcement learning with graph convolutional neural nets[C]//ITSC, 2018.
[20] Xu L, Xia X, Luo Q. The study of reinforcement learning for traffic self-adaptive control under multiagent Markov game environment[J]. Mathematical Problems in Engineering, 2013: 1-10.
[21] Casas N. Deep deterministic policy gradient for urban traffic light control[J]. arXiv preprint arXiv:1703.09035, 2017.
[22] Wei H, Chen C, Zheng G, et al. PressLight: Learning max pressure control to coordinate traffic signals in arterial network[C]//SIGKDD, 2019.
[23] Zhao C, Hu X, Wang G. PDLight: A deep reinforcement learning traffic light control algorithm with pressure and dynamic light duration[J]. arXiv preprint arXiv:2009.13711, 2020.
[24] Wei H, Xu N, Zhang H, et al. CoLight: Learning network-level cooperation for traffic signal control[C]//CIKM, 2019.
[25] Aslani M, Mesgari M, Wiering M. Adaptive traffic signal control with actor-critic methods in a real-world traffic network with different traffic disruption events[J]. Transportation Research Part C: Emerging Technologies, 2017, 85: 732-752. doi: 10.1016/j.trc.2017.09.020
[26] Chen C, Wei H, Xu N, et al. Toward a thousand lights: Decentralized deep reinforcement learning for large-scale traffic signal control[C]//AAAI, 2020.
[27] Coşkun M, Baggag A, Chawla S. Deep reinforcement learning for traffic light optimization[C]//ICDM, 2018.
[28] Gao J, Shen Y, Liu J, et al. Adaptive traffic signal control: Deep reinforcement learning algorithm with experience replay and target network[J]. arXiv preprint arXiv:1705.02755, 2017.
[29] Wiering M. Multi-agent reinforcement learning for traffic light control[C]//ICML, 2000.
[30] El-Tantawy S, Abdulhai B. An agent-based learning towards decentralized and coordinated traffic signal control[C]//ITSC, 2010.
[31] Arel I, Liu C, Urbanik T, et al. Reinforcement learning-based multi-agent system for network traffic signal control[J]. IET Intelligent Transport Systems, 2010, 4(2): 128-135. doi: 10.1049/iet-its.2009.0070
[32] Watkins C, Dayan P. Q-learning[J]. Machine Learning, 1992, 8(3): 279-292.
[33] Sutton R, McAllester D, Singh S, et al. Policy gradient methods for reinforcement learning with function approximation[C]//NeurIPS, 1999.
[34] Schulman J, Wolski F, Dhariwal P, et al. Proximal policy optimization algorithms[J]. arXiv preprint arXiv:1707.06347, 2017.
[35] Lillicrap T, Hunt J, Pritzel A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint arXiv:1509.02971, 2015.
[36] Schulman J, Levine S, Abbeel P, et al. Trust region policy optimization[C]//ICML, 2015.
[37] Tang J. Research on reinforcement learning and its application in urban traffic signal control[D]. Xi'an: Xidian University, 2012. (in Chinese)
[38] Aslani M, Mesgari M, Seipel S, et al. Developing adaptive traffic signal control by actor-critic and direct exploration methods[J]. Transport, 2019, 172(5): 289-298.
[39] Zhang Z, Yang J, Zha H. Integrating independent and centralized multi-agent reinforcement learning for traffic signal network optimization[C]//AAMAS, 2020.
[40] Chen X, Xiong G, Lv Y, et al. A collaborative communication-QMIX approach for large-scale networked traffic signal control[C]//ITSC, 2021.
[41] Sukhbaatar S, Szlam A, Fergus R. Learning multiagent communication with backpropagation[C]//NeurIPS, 2016.
[42] Xu M, Wu J, Huang L, et al. Network-wide traffic signal control based on the discovery of critical nodes and deep reinforcement learning[J]. Journal of Intelligent Transportation Systems, 2020, 24(1): 1-10. doi: 10.1080/15472450.2018.1527694
[43] Ge H, Song Y, Wu C, et al. Cooperative deep Q-learning with Q-value transfer for multi-intersection signal control[J]. IEEE Access, 2019, 7: 40797-40809.
[44] Schlichtkrull M, Kipf T, Bloem P, et al. Modeling relational data with graph convolutional networks[C]//European Semantic Web Conference, 2018.
[45] Wang Y, Xu T, Niu X, et al. STMARL: A spatio-temporal multi-agent reinforcement learning approach for cooperative traffic light control[J]. IEEE Transactions on Mobile Computing, 2020, 21(6): 2228-2242.
[46] Huang X, Wu D, Jenkin M, et al. ModelLight: Model-based meta-reinforcement learning for traffic signal control[J]. arXiv preprint arXiv:2111.08067, 2021.
[47] Zhang H, Liu C, Zhang W, et al. GeneraLight: Improving environment generalization of traffic signal control via meta reinforcement learning[C]//CIKM, 2020.
[48] Ault J, Hanna J, Sharon G. Learning an interpretable traffic signal control policy[C]//AAMAS, 2020.
[49] Wei H, Chen C, Liu C, et al. Learning to simulate on sparse trajectory data[C]//ECML, 2021.
[50] Zheng G, Liu H, Xu K, et al. Learning to simulate vehicle trajectories from demonstrations[C]//ICDE, 2020.
[51] Zheng G, Liu C, Wei H, et al. Rebuilding city-wide traffic origin destination from road speed data[C]//ICDE, 2021.
[52] Wu Q, Zhi P, Wei Y, et al. Communicate with traffic lights and vehicles based on multi-agent reinforcement learning[C]//CSCWD, 2021.
[53] Capasso A, Maramotti P, Dell'Eva A, et al. End-to-end intersection handling using multi-agent deep reinforcement learning[C]//IEEE Intelligent Vehicles Symposium, 2021.