Abstract: To address the slow convergence, high parameter sensitivity, and low computational efficiency of robot path planning in dynamic unknown environments, a novel algorithm named SSA-Qlearning is proposed that integrates the Sparrow Search Algorithm (SSA) with Quality-learning (Q-Learning). The method uses the collaborative mechanism among discoverers, followers, and scouts in SSA to optimize the learning rate and decay factor of Q-Learning, and it adds a dynamic weight adjustment strategy that adaptively explores the parameter space, thereby eliminating the bias introduced by the phase-based parameter optimization of traditional Q-Learning. A dynamic environmental factor quantifies how strongly the environment changes and is used to balance exploration against safety. The algorithm retains the lightweight character of Q-Learning and avoids the high computational cost of the Double Deep Q-Network (DDQN). Experimental results show that SSA-Qlearning significantly improves the path success rate in 5×5, 10×10, and 15×15 dynamic grid environments while requiring only 8.07%, 3.4%, and 3.03% of the training time of DDQN, respectively, so that it approaches the performance of DDQN at a small fraction of the training cost.
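
For orientation, the tabular Q-Learning update whose two parameters the abstract refers to can be sketched as follows. This is the standard textbook rule, not the authors' implementation. The names `q_update`, `alpha`, and `gamma` are illustrative, and mapping `gamma` to the "decay factor" of the abstract is an assumption. The SSA search over these parameters is not shown.

```python
def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply one standard tabular Q-learning update in place.

    Q      : list of lists, Q[state][action] (hypothetical table layout)
    alpha  : learning rate, assumed to be one of the SSA-tuned parameters
    gamma  : discount factor, assumed to be the abstract's "decay factor"
    Returns the temporal-difference error of this step.
    """
    td_target = r + gamma * max(Q[s_next])  # best estimated value of the next state
    td_error = td_target - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


# Toy table for a 5x5 grid with 4 actions per cell (sizes chosen for illustration).
Q = [[0.0] * 4 for _ in range(25)]
first_error = q_update(Q, s=0, a=1, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
```

A larger `alpha` makes each observed reward overwrite the old estimate faster, and a larger `gamma` gives more weight to future rewards, which is one reason why convergence is sensitive to both values.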