A Graph-Based PPO Approach in Multi-UAV Navigation for Communication Coverage

Authors

  • Zhiling Jiang School of Aeronautics and Astronautics, Zhejiang University, China
  • Yining Chen School of Aeronautics and Astronautics, Zhejiang University, China
  • Ke Wang School of Aeronautics and Astronautics, Zhejiang University, China
  • Bowei Yang School of Aeronautics and Astronautics, Zhejiang University, China
  • Guanghua Song School of Aeronautics and Astronautics, Zhejiang University, China

DOI:

https://doi.org/10.15837/ijccc.2023.6.5505

Keywords:

UAV Swarm Intelligence, Communication Coverage, Graph Learning, Multi-Agent Reinforcement Learning

Abstract

Multi-Agent Reinforcement Learning (MARL) is widely used to solve a variety of real-world problems. In MARL tasks, multiple agents act in a shared environment; the existing Proximal Policy Optimization (PPO) algorithm can be applied to such settings, but it does not address communication between agents. To resolve this issue, we propose a Graph-based PPO algorithm that models inter-agent communication, improves the agents' exploration efficiency in the environment, and speeds up the learning process. We apply the proposed algorithm to the task of multi-UAV navigation for communication coverage to verify its functionality and performance.
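As a rough illustration of the idea summarized above (a sketch under assumptions, not the authors' implementation), the snippet below fuses per-agent observations through a single-head, GAT-style attention layer defined over the agents' communication graph before a shared PPO actor-critic head, together with the standard PPO clipped surrogate loss (Schulman et al., 2017). Names such as GraphAttention, obs_dim, act_dim, and adj are illustrative assumptions.

```python
# Minimal sketch: graph-attention message passing + PPO actor-critic (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttention(nn.Module):
    """Single-head GAT-style layer: each agent aggregates neighbours' features."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        self.a = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, h, adj):
        # h: (n_agents, in_dim); adj: (n_agents, n_agents) 0/1 communication graph
        n = h.size(0)
        adj = adj + torch.eye(n, device=h.device)       # self-loops keep own features
        z = self.W(h)                                   # (N, out_dim)
        pair = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                          z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pair)).squeeze(-1)      # (N, N) attention logits
        e = e.masked_fill(adj == 0, float('-inf'))      # attend only to neighbours
        alpha = torch.softmax(e, dim=-1)
        return F.elu(alpha @ z)                          # aggregated agent features

class GraphPPOActorCritic(nn.Module):
    """Shared policy/value heads on top of graph-aggregated agent features."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.gnn = GraphAttention(obs_dim, hidden)
        self.policy = nn.Linear(hidden, act_dim)
        self.value = nn.Linear(hidden, 1)

    def forward(self, obs, adj):
        h = self.gnn(obs, adj)
        return torch.distributions.Categorical(logits=self.policy(h)), self.value(h)

def ppo_clip_loss(new_logp, old_logp, adv, eps=0.2):
    # Standard PPO clipped surrogate objective.
    ratio = torch.exp(new_logp - old_logp)
    return -torch.min(ratio * adv,
                      torch.clamp(ratio, 1 - eps, 1 + eps) * adv).mean()
```

In a multi-UAV coverage setting, the adjacency matrix adj could, for example, mark pairs of UAVs within communication range, so each agent's policy conditions on information shared by its reachable neighbours.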

References

Brody, Shaked.; Alon, Uri.; Yahav, Eran. (2021). How attentive are graph attention networks?, arXiv preprint arXiv:2105.14491, 2021.

Canese, Lorenzo.; Cardarilli, Gian Carlo.; Di Nunzio, Luca.; Fazzolari, Rocco.; Giardino, Daniele.; Re, Marco.; Spanò, Sergio. (2021). Multi-agent reinforcement learning: A review of challenges and applications, Applied Sciences, 11(11), 4948, 2021.

https://doi.org/10.3390/app11114948

Gronauer, Sven.; Diepold, Klaus. (2022). Multi-agent deep reinforcement learning: a survey, Artificial Intelligence Review, 1-49, 2022.

Gupta, Lav.; Jain, Raj.; Vaszkun, Gabor. (2015). Survey of important issues in UAV communication networks, IEEE Communications Surveys & Tutorials, 18(2), 2015.

https://doi.org/10.1109/COMST.2015.2495297

Haarnoja, Tuomas.; Zhou, Aurick.; Abbeel, Pieter.; Levine, Sergey. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor, International conference on machine learning, 1861-1870, 2018.

Hamilton, Will.; Ying, Zhitao.; Leskovec, Jure. (2017). Inductive representation learning on large graphs, Advances in neural information processing systems, 30, 2017.

Hart, Patrick.; Knoll, Alois. (2020). Graph neural networks and reinforcement learning for behavior generation in semantic environments, 2020 IEEE Intelligent Vehicles Symposium (IV), 1589-1594, 2020.

https://doi.org/10.1109/IV47402.2020.9304738

Jiang, Jiechuan.; Dun, Chen.; Huang, Tiejun.; Lu, Zongqing. (2018). Graph convolutional reinforcement learning, arXiv preprint arXiv:1810.09202, 2018.

Jiang, Zhiling.; Chen, Yining.; Song, Guanghua.; Yang, Bowei.; Jiang, Xiaohong. (2023). Cooperative planning of multi-UAV logistics delivery by multi-graph reinforcement learning, International Conference on Computer Application and Information Security (ICCAIS 2022), 12609, 129-137, 2023.

https://doi.org/10.1117/12.2671868

Kipf, Thomas N.; Welling, Max. (2016). Semi-supervised classification with graph convolutional networks, arXiv preprint arXiv:1609.02907, 2016.

Mnih, Volodymyr.; Kavukcuoglu, Koray.; Silver, David.; Graves, Alex.; Antonoglou, Ioannis.; Wierstra, Daan.; Riedmiller, Martin. (2013). Playing atari with deep reinforcement learning, arXiv preprint arXiv:1312.5602, 2013.

Mnih, Volodymyr.; Kavukcuoglu, Koray.; Silver, David.; Rusu, Andrei A.; Veness, Joel.; Bellemare, Marc G.; Graves, Alex.; Riedmiller, Martin.; Fidjeland, Andreas K.; Ostrovski, Georg. (2015). Human-level control through deep reinforcement learning, Nature, 518(7540), 529-533, 2015.

https://doi.org/10.1038/nature14236

Moradi, Mehrdad.; Sundaresan, Karthikeyan.; Chai, Eugene.; Rangarajan, Sampath.; Mao, Z Morley. (2018). SkyCore: Moving core to the edge for untethered and reliable UAV-based LTE networks, Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, 35-49, 2018.

https://doi.org/10.1145/3351422.3351431

Oroojlooy, Afshin.; Hajinezhad, Davood. (2022). A review of cooperative multi-agent deep reinforcement learning, Applied Intelligence, 1-46, 2022.

Pan, Wei.; Liu, Cheng. (2023). A Graph-Based Soft Actor Critic Approach in Multi-Agent Reinforcement Learning, International Journal of Computers Communications & Control, 18(1), 2023.

https://doi.org/10.15837/ijccc.2023.1.5062

Ren, Yixiang.; Ye, Zhenhui.; Song, Guanghua.; Jiang, Xiaohong. (2022). Space-Air-Ground Integrated Mobile Crowdsensing for Partially Observable Data Collection by Multi-Scale Convolutional Graph Reinforcement Learning, Entropy, 24(5), 638, 2022.

https://doi.org/10.3390/e24050638

Ruan, Lang.; Wang, Jinlong.; Chen, Jin.; Xu, Yitao.; Yang, Yang.; Jiang, Han.; Zhang, Yuli.; Xu, Yuhua. (2018). Energy-efficient multi-UAV coverage deployment in UAV networks: A game-theoretic framework, China Communications, 15(10), 194-209, 2018.

https://doi.org/10.1109/CC.2018.8485481

Ryu, Heechang.; Shin, Hayong.; Park, Jinkyoo. (2020). Multi-agent actor-critic with hierarchical graph attention network, Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 7236-7243, 2020.

https://doi.org/10.1609/aaai.v34i05.6214

Schulman, John.; Wolski, Filip.; Dhariwal, Prafulla.; Radford, Alec.; Klimov, Oleg. (2017). Proximal policy optimization algorithms, arXiv preprint arXiv:1707.06347, 2017.

Veličković, Petar.; Cucurull, Guillem.; Casanova, Arantxa.; Romero, Adriana.; Lio, Pietro.; Bengio, Yoshua. (2017). Graph attention networks, arXiv preprint arXiv:1710.10903, 2017.

Wang, Enshu.; Liu, Bingyi.; Lin, Songrong.; Shen, Feng.; Bao, Tianyu.; Zhang, Jun.; Wang, Jianping.; Sadek, Adel W.; Qiao, Chunming. (2023). Double graph attention actor-critic framework for urban bus-pooling system, IEEE Transactions on Intelligent Transportation Systems, 2023.

https://doi.org/10.1109/TITS.2023.3238055

Wang, Yi.; Qiu, Dawei.; Wang, Yu.; Sun, Mingyang.; Strbac, Goran. (2023). Graph Learning-Based Voltage Regulation in Distribution Networks with Multi-Microgrids, IEEE Transactions on Power Systems, 2023.

https://doi.org/10.1109/TPWRS.2023.3242715

Watkins, Christopher JCH.; Dayan, Peter. (1992). Q-learning, Machine Learning, 8, 279-292, 1992.

https://doi.org/10.1023/A:1022676722315

Xu, Xiaohan.; Zhang, Peng.; He, Yongquan.; Chao, Chengpeng.; Yan, Chaoyang. (2022). Subgraph neighboring relations infomax for inductive link prediction on knowledge graphs, arXiv preprint arXiv:2208.00850, 2022.

https://doi.org/10.24963/ijcai.2022/325

You, Jiaxuan.; Liu, Bowen.; Ying, Zhitao.; Pande, Vijay.; Leskovec, Jure. (2018). Graph convolutional policy network for goal-directed molecular graph generation, Advances in neural information processing systems, 31, 2018.

Published

2023-10-30
