Multi-agent deep reinforcement learning-based urban traffic signal management

Yun HUA, Xiangfeng WANG, Bo JIN

doi:10.15960/j.cnki.issn.1007-6093.2023.02.003

Operations Research Transactions, 2023, Vol. 27, Issue 2: 49-62


1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China

Received date: 2022-06-12

Online published: 2023-06-13


Abstract

Rapid growth of the national economy in recent years has increased travel demand, placing growing pressure on urban traffic signal systems that still rely on traditional, non-intelligent traffic lights. As traffic networks have become significantly more complex, traffic signal control has evolved from a single-point problem into a systems engineering problem, and advances in artificial intelligence offer new methods for addressing it. Swarm intelligence methods, represented by multi-agent reinforcement learning, have been widely used in traffic signal control and optimization, including traffic light control, autonomous driving, and vehicle-road collaboration. Compared with traditional methods, multi-agent reinforcement learning can make traffic signal systems intelligent while enabling large-scale coordination among signals, thereby improving the efficiency of urban traffic operations. Because the vision of intelligent urban traffic requires the various components of urban transportation to collaborate, multi-agent reinforcement learning is of great research value for urban traffic signal control and optimization. This paper systematically introduces the basic theory of multi-agent deep reinforcement learning and its application to urban traffic signal optimization, summarizes existing approaches, and analyzes the drawbacks of each. It then outlines the challenges that multi-agent reinforcement learning methods face in urban traffic signal optimization and points out possible future research directions to promote their development in this area.

Key words: multi-agent reinforcement learning; intelligent traffic; traffic signal control; autonomous driving

Cite this article

Yun HUA, Xiangfeng WANG, Bo JIN. Multi-agent deep reinforcement learning-based urban traffic signal management[J]. Operations Research Transactions, 2023, 27(2): 49-62. DOI: 10.15960/j.cnki.issn.1007-6093.2023.02.003

