Citation: ZHANG Xiuyun, WANG Weilun, ZONG Qun, LIU Da. Cooperative exploration path planning for UAV swarm in unknown environment[J]. Chinese Journal of Engineering. DOI: 10.13374/j.issn2095-9389.2023.10.15.002

Cooperative exploration path planning for UAV swarm in unknown environment

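  • Summary: As UAV missions grow more complex and operating environments more varied, multi-UAV swarm systems have attracted wide attention in China and abroad, and UAV path planning has become an active research topic. Traditional path planning algorithms generally require prior map information, which is hard to obtain in scenarios with unknown environments such as search and rescue. To address this, this paper proposes a reinforcement learning-based cooperative exploration path planning method for UAV swarms in unknown environments. First, taking into account the characteristics of the swarm cooperative exploration task and constraints such as dynamics, collision avoidance, and obstacle avoidance, a game model and evaluation criteria for UAV swarm cooperative exploration are established based on the Markov decision process. Second, a reinforcement learning-based cooperative exploration method for UAV swarms is proposed: a dual-network architecture consisting of a policy (actor) network and a critic network is built, and randomly generated maps are used to improve generalization to unknown environments. During exploration, each UAV continuously collects map information and adjusts its own policy using the environment and the information shared by the other UAVs; cooperative swarm exploration in unknown environments is achieved through iterative training. Finally, a virtual simulation platform for UAV swarm cooperative exploration is built on Unity, and comparative experiments against a non-cooperative single-agent algorithm verify that the proposed algorithm has advantages in task success rate, task completion efficiency, and episode reward.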
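The map sharing described above, in which each UAV collects map information and uses what the others have shared, can be illustrated with a minimal sketch. The occupancy-grid representation, the cell codes `UNKNOWN`/`FREE`/`OBSTACLE`, and the helper names `merge_maps` and `explored_fraction` are assumptions made for illustration; the paper's actual map format and fusion rule are not given here.

```python
# Hypothetical cell codes for an occupancy grid (not taken from the paper).
UNKNOWN, FREE, OBSTACLE = -1, 0, 1


def merge_maps(local_maps):
    """Fuse per-UAV occupancy grids of equal size into one shared grid.

    A cell stays UNKNOWN only if no UAV has observed it. Because
    UNKNOWN < FREE < OBSTACLE, taking the maximum lets an OBSTACLE
    report override FREE, which errs on the side of caution.
    """
    rows, cols = len(local_maps[0]), len(local_maps[0][0])
    shared = [[UNKNOWN] * cols for _ in range(rows)]
    for grid in local_maps:
        for r in range(rows):
            for c in range(cols):
                shared[r][c] = max(shared[r][c], grid[r][c])
    return shared


def explored_fraction(grid):
    """Fraction of cells that at least one UAV has observed."""
    known = sum(cell != UNKNOWN for row in grid for cell in row)
    return known / (len(grid) * len(grid[0]))


# Two UAVs that have each observed a different cell of a 2x2 map.
uav_a = [[FREE, UNKNOWN], [UNKNOWN, UNKNOWN]]
uav_b = [[UNKNOWN, OBSTACLE], [UNKNOWN, UNKNOWN]]
shared_map = merge_maps([uav_a, uav_b])
```

In this toy case the shared map knows two of the four cells, so each UAV can base its next action on more of the environment than it observed alone.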

     

    Abstract: As missions grow more complex and operating environments more diverse, a single unmanned aerial vehicle (UAV) can no longer meet practical mission requirements, and multi-UAV systems show great potential in applications such as search and rescue. In a search and rescue mission, the UAVs obtain the location of the target to be rescued and then plan an obstacle-avoiding path to that target point. Traditional path planning algorithms require prior knowledge of the obstacle distribution in the map, which can be hard to obtain in real-world missions. To remove this reliance on prior map information, this paper proposes a reinforcement learning-based approach to collaborative exploration by multiple UAVs in unknown environments. First, considering the characteristics of collaborative exploration tasks and the constraints on the UAV swarm, a Markov decision process (MDP) is used to establish a game model and task objectives for the swarm. The UAVs must satisfy dynamics and obstacle avoidance constraints during the mission, with the objective of maximizing the search and rescue success rate. Second, a reinforcement learning-based method for collaborative exploration by multiple UAVs is proposed. The multi-agent soft actor-critic (MASAC) algorithm is used to iteratively train the UAVs' collaborative exploration strategies. The actor network generates UAV actions, while the critic network evaluates the quality of these strategies. To improve the algorithm's generalization, training is conducted on randomly generated maps. To keep UAVs from being blocked by concave obstacles, a breadth-first search algorithm computes rewards from the path distance between each UAV and the target rather than the geometric (straight-line) distance. During exploration, each UAV continuously collects map information and shares it with all other UAVs. Each UAV makes its own action decisions based on the environment and the information from the other agents, and the mission is considered successful when multiple UAVs hover above the target. Finally, a virtual simulation platform for algorithm validation is developed with the Unity game engine. The proposed algorithm is implemented in PyTorch, and two-way interaction between the Unity environment and the Python algorithm is achieved through the ML-Agents framework. Comparative experiments on this platform compare the proposed method with a non-cooperative single-agent SAC algorithm. The proposed method shows advantages in task success rate, task completion efficiency, and episode reward, validating its feasibility and effectiveness.
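The abstract's point about concave obstacles can be made concrete with a small sketch of a breadth-first search (BFS) path distance on a grid. This is an illustration under stated assumptions, not the authors' implementation: the 4-connected grid, the example map `GRID`, the function names, and the linear reward shape with its `scale` parameter are all hypothetical.

```python
from collections import deque
import math


def bfs_path_distance(grid, start, goal):
    """Number of 4-connected steps from start to goal, or None if unreachable.

    grid[r][c] == 1 marks an obstacle cell; 0 marks a free cell.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in visited):
                visited.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None


def path_distance_reward(grid, uav, target, scale=0.1):
    """Assumed reward shape: the longer the obstacle-aware path, the lower the reward."""
    d = bfs_path_distance(grid, uav, target)
    if d is None:
        d = len(grid) * len(grid[0])  # treat unreachable as a worst-case distance
    return -scale * d


# A concave (cup-shaped) obstacle that opens to the left.
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
uav, target = (2, 2), (2, 4)
straight_line = math.dist(uav, target)         # 2.0 cells through the wall
path_steps = bfs_path_distance(GRID, uav, target)  # 10 steps around the obstacle
```

Here the UAV sits inside the cup only two cells from the target in a straight line, yet the shortest feasible path is ten steps. A reward based on straight-line distance would favor staying pressed against the wall, whereas the path-distance reward increases as the UAV backs out of the pocket and moves around the obstacle.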

     
