• Research on maglev control with improved DDPG

    Magnetic levitation control algorithm based on improved DDPG

    • Abstract: To address the problem that some conventional maglev control algorithms depend on precise models and adapt poorly, an improved deep deterministic policy gradient (IDDPG) control method based on reinforcement learning is proposed. First, a mathematical model of the electromagnetic suspension system is built and its dynamic characteristics are analyzed. Second, to remedy the shortcomings of the conventional DDPG algorithm in electromagnetic suspension control, a segmented inverse-proportional reward function is designed to improve steady-state accuracy and response speed, and the DDPG control workflow is analyzed and optimized to meet practical deployment requirements. Finally, through simulations and experiments, the effects of current-loop tracking, the reward function, the training step size, and model variations on control performance are compared and analyzed. The results show that the IDDPG controller with the segmented inverse-proportional reward function reduces steady-state error and overshoot while markedly improving response speed, and that the optimized control workflow is suitable for deployment on real systems. Moreover, across different models with identical parameters the steady-state error stays below 5%, yielding essentially consistent control performance, far better than the 31% of sliding mode control (SMC) and the 12% of proportional–integral–derivative (PID) control, which verifies the good adaptability of IDDPG without reliance on a precise model. In disturbance-rejection experiments, IDDPG reduces overshoot by 51% and shortens settling time by 49% compared with PID, demonstrating stronger disturbance rejection.

       

      Abstract:
      This study proposes an improved deep deterministic policy gradient (IDDPG) controller for electromagnetic suspension systems to overcome the limitations of conventional maglev control strategies, particularly their dependence on precise mathematical models and challenges in real-world deployment. Leveraging reinforcement learning, the IDDPG approach achieves robust, model-free performance while meeting the stringent real-time requirements of magnetic suspension.
      The system model is derived from electromagnetic force balance and Newtonian mechanics, yielding nonlinear coupled equations of coil current and air-gap displacement. These equations are linearized around the operating equilibrium to simplify controller design. Building on this foundation, the deep deterministic policy gradient (DDPG) algorithm is examined as a model-free actor–critic reinforcement learning method for continuous control. Recognizing its limitations in steady-state accuracy and transient response, we introduce a segmented inverse-proportional reward function that emphasizes small air-gap errors, accelerating convergence and improving response speed. To address hardware constraints, training is optimized by integrating network update latency and action–state delay into a unified control cycle, ensuring stable learning while reducing iteration time and execution delay on embedded platforms. The IDDPG controller is validated through simulations and hardware-in-the-loop experiments on a test rig replicating the suspension apparatus. Comparative studies with sliding mode control (SMC) and proportional–integral–derivative (PID) schemes demonstrate superior performance: steady-state error is reduced below 5% (vs. 31% with SMC and 12% with PID). Under parameter variations and disturbances, the controller maintains consistent performance with fixed hyperparameters, underscoring its robustness and generalization capability. Disturbance rejection tests further show that, compared with conventional PID control, IDDPG reduces overshoot by 51% and shortens settling time by 49%, yielding more stable levitation and lower mechanical stress. In summary, the IDDPG framework significantly improves control performance for electromagnetic suspension systems and expands the applicability of reinforcement learning in nonlinear control.
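The modeling step summarized above follows the standard single-magnet suspension derivation. As a hedged sketch (the paper's exact symbols and values are not given here, so the magnet constant k, coil current i, and air gap x are assumed names), a common textbook form of the force balance and its linearization is:

```latex
% Nonlinear dynamics of a single-magnet suspension (assumed standard form):
% m - levitated mass, g - gravity, k - magnet constant,
% i - coil current, x - air gap
\[
  m\ddot{x} = mg - k\left(\frac{i}{x}\right)^{2}
\]
% At the operating equilibrium (i_0, x_0): mg = k\,(i_0/x_0)^2.
% First-order expansion for small deviations \Delta i, \Delta x:
\[
  m\,\Delta\ddot{x} = \frac{2k i_0^{2}}{x_0^{3}}\,\Delta x
                    - \frac{2k i_0}{x_0^{2}}\,\Delta i
\]
```

The positive coefficient on Δx makes the open-loop plant unstable, which is why active control of the coil current is required in the first place.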
By combining targeted reward function design, workflow optimization, and experimental validation, this work demonstrates a practical pathway toward deploying model-free, learning-based controllers in maglev and other precision suspension platforms.
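The segmented inverse-proportional reward described above can be illustrated with a minimal sketch. All thresholds, gains, and the function name below are illustrative assumptions, not the paper's actual design: the key idea is only that rewards grow inverse-proportionally as the air-gap error shrinks, while large errors fall into a flat penalty segment.

```python
def segmented_inverse_reward(error, e_switch=0.5e-3, k=1e-4, penalty=-1.0):
    """Hypothetical segmented inverse-proportional reward (illustrative only).

    error    : air-gap tracking error in meters
    e_switch : assumed boundary between the fine and coarse segments
    k        : assumed gain of the inverse-proportional branch
    penalty  : assumed flat penalty for large deviations
    """
    e = abs(error)
    if e < e_switch:
        # Fine segment: reward rises steeply as the error approaches zero,
        # sharpening the learning signal near the setpoint.
        return k / (e + 1e-6)  # small epsilon avoids division by zero
    # Coarse segment: a constant penalty discourages large deviations
    # without flooding training with huge negative values.
    return penalty
```

The steep gradient near zero error is what pushes the learned policy toward high steady-state accuracy, while the flat outer segment keeps early (badly erring) episodes numerically well behaved.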

       
