A Security-Aware Collaborative Decision Optimization Algorithm for Multi-UAV Systems
Author:
Affiliation:

1.School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China; 2.School of Computer Science and Engineering, Southeast University, Nanjing 211189, China

Fund Project:

National Natural Science Foundation of China (No.62576100).

    Abstract:

    Collaborative decision-making for multi-UAV systems in dynamic environments faces the dual challenge of security and robustness. Traditional approaches design safety mechanisms separately from the decision algorithm and therefore struggle to keep the system reliable under anomalies. To address this, we propose adaptive security control with adversarial-resilient endogenous strategy (ASC-ARES), a collaborative policy optimization framework grounded in the “shift-left security” and “security by design” principles. Through state modeling and reward shaping, it systematically embeds multi-layer security constraints, including topology control, physical safety, and energy management, into the deep reinforcement learning (DRL) decision process, so that function and security are designed as one. First, ASC-ARES extends the deep deterministic policy gradient (DDPG) algorithm to hybrid action spaces, using a dual-head policy network to jointly optimize three-dimensional continuous attitude and discrete yaw actions. Second, it integrates a centroid-guided biconnectivity control algorithm that gives the collaborative policy proactive awareness of network connectivity. Third, it constructs a mean opinion score (MOS)-driven multi-objective adaptive reward mechanism that jointly optimizes quality of experience (QoE), network biconnectivity, collision avoidance, and energy efficiency. Experimental results show that ASC-ARES converges well and remains stable, with an MOS fluctuation rate of 0.36% and a biconnectivity success rate of 99.98%. In adversarial experiments with the fast gradient sign method (FGSM), projected gradient descent (PGD), and strong noise interference (?=2.0), the system reconstructs its topology and recovers its state, with an average performance restoration rate exceeding 80% after the interference is removed. Ablation studies confirm the role of each security component: the topology control module improves service quality by 59%, and the repulsion mechanism reduces collision risk by 85%. These findings present ASC-ARES as a collaborative decision scheme that balances performance optimization with built-in security for resource-constrained multi-agent systems.
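
    The biconnectivity property reported in the abstract can be illustrated with a short check. The sketch below is not the paper's centroid-guided control algorithm; it only tests whether a UAV communication graph is biconnected, that is, connected and still connected after any single UAV drops out. The disc communication model, the function names, and the sample coordinates are all assumptions made for illustration.

```python
from collections import deque
from math import dist


def build_links(positions, comm_range):
    """Adjacency sets: two UAVs are linked when within comm_range (assumed disc model)."""
    n = len(positions)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(positions[i], positions[j]) <= comm_range:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def is_connected(adj, skip=None):
    """BFS connectivity test over all nodes except `skip`."""
    nodes = [v for v in adj if v != skip]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != skip and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)


def is_biconnected(adj):
    """True when the graph is connected and survives the loss of any one node."""
    if len(adj) < 3:
        # Assumption: swarms of fewer than three UAVs are treated as not biconnected.
        return False
    return is_connected(adj) and all(is_connected(adj, skip=v) for v in adj)


# Hypothetical formations (x, y, z in metres) with an assumed 12 m link range.
square = [(0, 0, 10), (10, 0, 10), (10, 10, 10), (0, 10, 10)]
chain = [(0, 0, 10), (10, 0, 10), (20, 0, 10), (30, 0, 10)]
print(is_biconnected(build_links(square, comm_range=12)))  # True: a 4-cycle
print(is_biconnected(build_links(chain, comm_range=12)))   # False: inner nodes are cut vertices
```

    Removing each node in turn costs O(n(n + m)) time, which is acceptable for small swarms; a single depth-first search for articulation points would be the usual choice at larger scale.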

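Cite this article

李轶哲, 谢晨宇, 刘书鸣, 万子恒, 魏鑫锬, 董璐. A Security-Aware Collaborative Decision Optimization Algorithm for Multi-UAV Systems[J]. Journal of Data Acquisition and Processing, 2026, (1): 66-88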

History
  • Received: 2025-11-15
  • Last revised: 2026-01-09
  • Accepted:
  • Published online: 2026-02-13