Optimization method of robot navigation state space dimension reduction control policy based on policy gradient learning

Gao, Yunfeng; Li, Jianan; Pan, Bingbing

doi:10.65102/is2026095

Research article

Ingegneria Sismica

Volume 43 Issue 1
Pages: 1-23

Author(s): Yunfeng Gao¹, Jianan Li¹, Bingbing Pan¹

¹Instrument and Electronics School of North University of China 030051, Taiyuan, China

Published: 30/04/2026

Cite

Gao, Yunfeng., Li, Jianan., and Pan, Bingbing. “Optimization method of robot navigation state space dimension reduction control policy based on policy gradient learning.” Ingegneria Sismica Volume 43 Issue 1: 1-23, doi:10.65102/is2026095.

https://doi.org/10.65102/is2026095

Abstract

This paper proposes a control policy optimization method with state space dimension reduction, based on policy gradient learning, for continuous control tasks in autonomous robot navigation in complex environments. The method compresses and encodes the position, speed, heading angle, obstacle distance, and target association information of the original navigation state into a low-dimensional state representation. This representation is trained jointly with the policy network to weaken the interference of redundant features on action search and to improve the stability and convergence efficiency of the continuous control output. Experimental results in simulation and on a real robot platform show that, compared with PPO, DDPG, and a variant without the dimension reduction module, the proposed method reaches a navigation success rate of 96.7%, reduces the average path length to 18.6 m, keeps the decision delay at 0.041 s, and shows a training reward that stabilizes after 420 rounds. On the real platform, the robot reached the task target point in 19 of 20 test rounds. Ablation experiments further show that the state space dimension reduction module contributes significantly to control smoothness, performance in complex scenes, and the stability of dynamic obstacle avoidance, providing more stable policy search boundaries and more efficient online deployment for robot navigation tasks.
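The joint training of a state encoder and a policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dimensions, the single-layer tanh encoder, the linear Gaussian policy with fixed noise `SIGMA`, and the plain REINFORCE update are all assumptions chosen for brevity, and every name below is hypothetical.

```python
import numpy as np

# Hypothetical sizes: raw navigation state, compressed state, control output.
RAW_DIM, LATENT_DIM, ACTION_DIM = 12, 4, 2
SIGMA = 0.2  # fixed exploration noise of the Gaussian policy (assumption)


def init_params(rng):
    """Small random weights for the encoder and the policy head."""
    return {
        "W_enc": rng.normal(0.0, 0.1, (LATENT_DIM, RAW_DIM)),
        "W_pi": rng.normal(0.0, 0.1, (ACTION_DIM, LATENT_DIM)),
    }


def forward(params, state):
    """Encode the raw state, then map the latent state to the action mean."""
    z = np.tanh(params["W_enc"] @ state)  # low-dimensional representation
    mu = params["W_pi"] @ z               # mean of the continuous action
    return z, mu


def log_prob(params, state, action):
    """Log-density of `action` under the Gaussian policy N(mu, SIGMA^2 I)."""
    _, mu = forward(params, state)
    diff = action - mu
    norm = ACTION_DIM * np.log(SIGMA * np.sqrt(2.0 * np.pi))
    return -0.5 * np.sum(diff ** 2) / SIGMA ** 2 - norm


def grad_log_prob(params, state, action):
    """Gradient of log_prob with respect to both weight matrices."""
    z, mu = forward(params, state)
    g_mu = (action - mu) / SIGMA ** 2     # d log pi / d mu
    g_z = params["W_pi"].T @ g_mu         # backpropagate into the latent state
    g_pre = g_z * (1.0 - z ** 2)          # derivative of tanh
    return {"W_enc": np.outer(g_pre, state), "W_pi": np.outer(g_mu, z)}


def reinforce_step(params, episode, lr=1e-3, gamma=0.99):
    """One REINFORCE update over an episode of (state, action, reward) tuples.

    Encoder and policy weights receive the same return-weighted gradient,
    so the compressed representation is shaped by the navigation reward.
    """
    grads = {k: np.zeros_like(v) for k, v in params.items()}
    ret = 0.0
    for state, action, reward in reversed(episode):
        ret = reward + gamma * ret        # discounted return from this step
        g = grad_log_prob(params, state, action)
        for k in grads:
            grads[k] += ret * g[k]
    return {k: params[k] + lr * grads[k] for k in params}
```

Because the gradient flows through `W_pi` into `W_enc`, the encoder is not trained as a separate reconstruction step; it only keeps the features that help the policy earn reward, which is one plausible reading of how joint training suppresses redundant state features.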

Summary: This article proposes a control strategy optimization method with state space dimension reduction for robot navigation, based on policy gradient learning. Experiments show that the navigation success rate of the method reaches 96.7%, the average path length is reduced to 18.6 m, the decision delay is limited to 0.041 s, and stable convergence is reached after 420 iterations.

Keywords
Policy gradient learning; Robot navigation; State space dimension reduction; Control strategy optimization

Open Access