Path Planning Algorithm for Mobile Robots Based on Q-Learning Optimized by the Sparrow Search Algorithm
Author: XU Yanglei, WANG Yongxiong
Affiliation:

School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

    Abstract:

    To address slow convergence, high parameter sensitivity, and low computational efficiency in robot path planning within dynamic unknown environments, an algorithm named SSA-Qlearning is proposed that integrates the Sparrow Search Algorithm (SSA) with Q-Learning. The method optimizes the learning rate and decay factor of Q-Learning through SSA's cooperative mechanism of discoverers, followers, and scouts, and designs a dynamic weight adjustment strategy that adaptively searches the parameter space, eliminating the bias of the phase-based optimization used in traditional Q-Learning. By introducing an environmental dynamics factor to quantify how strongly the environment changes, the algorithm achieves a dynamic balance between exploration and safety while retaining the lightweight character of Q-Learning and avoiding the high computational cost of the Double Deep Q-Network (DDQN). Experimental results show that SSA-Qlearning significantly improves the path success rate in 5×5, 10×10, and 15×15 dynamic grid environments, with training times of only 8.07%, 3.4%, and 3.03% of DDQN's, respectively, delivering lightweight reinforcement learning performance close to that of DDQN.
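    The core idea in the abstract is that SSA's three roles search the (learning rate, decay factor) space of a tabular Q-Learning agent, treating a full training run as a black-box fitness evaluation. The sketch below is a minimal, hypothetical illustration of that loop, not the paper's implementation: it assumes a static 5×5 grid world (the paper uses dynamic grids with an environmental dynamics factor and a dynamic weight strategy, neither modeled here), a fitness defined as mean steps-to-goal, simplified SSA operators from the common SSA formulation, and illustrative names and constants throughout (`run_qlearning`, `ssa_tune`, the population split, the safety threshold).

```python
import numpy as np

SIZE, GOAL = 5, (4, 4)                        # illustrative static grid; the paper uses dynamic grids
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def run_qlearning(alpha, gamma, episodes=200, seed=0):
    """Assumed fitness for SSA: train tabular Q-Learning with learning rate
    `alpha` and decay factor `gamma`, return mean steps-to-goal over the
    last 50 episodes (lower is better)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((SIZE, SIZE, len(ACTIONS)))
    steps_log = []
    for ep in range(episodes):
        s, steps = (0, 0), 0
        eps = max(0.05, 1.0 - ep / episodes)  # linearly decaying epsilon-greedy
        while s != GOAL and steps < 100:
            a = int(rng.integers(4)) if rng.random() < eps else int(np.argmax(Q[s]))
            nxt = (min(SIZE - 1, max(0, s[0] + ACTIONS[a][0])),
                   min(SIZE - 1, max(0, s[1] + ACTIONS[a][1])))
            r = 10.0 if nxt == GOAL else -1.0
            # standard Q-Learning update: Q(s,a) += alpha*(r + gamma*max Q(s') - Q(s,a))
            Q[s][a] += alpha * (r + gamma * np.max(Q[nxt]) - Q[s][a])
            s, steps = nxt, steps + 1
        steps_log.append(steps)
    return float(np.mean(steps_log[-50:]))

def ssa_tune(n=10, iters=15, seed=1):
    """Simplified SSA over (alpha, gamma): roughly 20% producers, the rest
    followers, plus two scouts per iteration."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array([0.01, 0.50]), np.array([1.0, 0.999])
    X = lo + rng.random((n, 2)) * (hi - lo)   # positions = candidate (alpha, gamma)
    fit = np.array([run_qlearning(*x) for x in X])
    for t in range(iters):
        order = np.argsort(fit)               # ascending fitness: best first
        X, fit = X[order].copy(), fit[order]
        n_prod = max(2, n // 5)
        R2, ST = rng.random(), 0.8            # alarm value vs. safety threshold
        for i in range(n_prod):               # producers: refine or flee
            if R2 < ST:
                X[i] *= np.exp(-(i + 1) / (rng.random() * iters + 1e-9))
            else:
                X[i] += rng.normal(size=2)
        for i in range(n_prod, n):            # followers track the best producer
            if i > n // 2:
                X[i] = rng.normal() * np.exp((X[-1] - X[i]) / (i + 1) ** 2)
            else:
                X[i] = X[0] + np.abs(X[i] - X[0]) * rng.choice([-1.0, 1.0], size=2)
        for i in rng.choice(n, size=2, replace=False):  # scouts jump toward the best
            X[i] = X[0] + rng.normal() * np.abs(X[i] - X[0])
        X = np.clip(X, lo, hi)
        fit = np.array([run_qlearning(*x) for x in X])
    best = int(np.argmin(fit))
    return X[best], fit[best]

if __name__ == "__main__":
    (alpha, gamma), score = ssa_tune()
    print(f"best alpha={alpha:.3f}, gamma={gamma:.3f}, mean steps={score:.1f}")
```

    Under these assumptions the outer SSA loop needs only a handful of short tabular training runs per iteration, which illustrates why the approach stays lightweight compared with training a DDQN.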

Cite this article:

XU Yanglei, WANG Yongxiong. Path Planning Algorithm for Mobile Robots Based on Q-Learning Optimized by the Sparrow Search Algorithm[J]. 数据采集与处理 (Journal of Data Acquisition and Processing),,():

History
  • Online publication date: 2025-12-23