Energy management for electric vehicles with hybrid energy storage based on adversarial imitation reinforcement learning
Imitation reinforcement learning energy management for electric vehicles with hybrid energy storage system
| Authors | Weirong Liu · Pengfei Yao · Yue Wu · Lijun Duan · Heng Li · Jun Peng |
| Journal | Applied Energy |
| Publication date | January 2025 |
| Volume/Issue | Volume 378 |
| Technical category | Energy storage system technology |
| Technical tags | Energy storage systems, reinforcement learning |
| Relevance score | ★★★★★ 5.0 / 5.0 |
| Keywords | Adversarial imitation reinforcement learning is proposed for power allocation. |
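Chinese Abstract (translated)
Deep reinforcement learning has become a promising approach to energy management in electric vehicles. However, it relies on extensive trial-and-error training to reach near-optimal performance. To address this, the paper proposes an adversarial imitation reinforcement learning energy management strategy for electric vehicles with a hybrid energy storage system, with the goal of minimizing the cost of battery capacity loss. First, dynamic programming is used to generate expert knowledge under a range of standard driving conditions to guide the reinforcement learning exploration; this expert knowledge is represented as an optimal power allocation mapping. Second, in the early imitation stage, an adversarial network drives the agent's actions to quickly approach the optimal power allocation mapping. Third, a dynamic imitation weight is designed from the output of the discriminator in the adversarial network, so that the agent gradually transitions to exploring a near-optimal power allocation strategy on its own under online driving conditions. The results show that, compared with conventional reinforcement learning, the proposed strategy speeds up training by 42.60% and raises the reward by 15.79%. Across different test driving cycles, the method further reduces the battery capacity loss cost by 5.1%–12.4%.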
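To make the idea of an expert "optimal power allocation mapping" concrete, here is a minimal sketch of backward dynamic programming on a toy battery/supercapacitor split. It is not the paper's model. The integer energy grid, the function name `dp_power_split`, and the squared battery power used as a stand-in for capacity-loss cost are all assumptions chosen for illustration.

```python
def dp_power_split(demand, n_levels, p_sc_max):
    """Backward DP over a discretized supercapacitor energy level.

    Toy assumptions (not from the paper):
      - demand[t] is the power request at step t, in integer units
      - the supercapacitor stores 0..n_levels energy units
      - p_sc > 0 discharges the supercapacitor, p_sc < 0 charges it
      - stage cost is battery power squared, a proxy for capacity loss
    Returns (value, policy): value[e] is the optimal cost from t = 0 at
    energy level e, and policy[t][e] is the best supercapacitor power.
    """
    inf = float("inf")
    value = [0.0] * (n_levels + 1)  # terminal cost-to-go is zero
    policy = []
    for t in range(len(demand) - 1, -1, -1):
        new_value = [inf] * (n_levels + 1)
        best = [0] * (n_levels + 1)
        for e in range(n_levels + 1):
            for p_sc in range(-p_sc_max, p_sc_max + 1):
                e_next = e - p_sc
                if not 0 <= e_next <= n_levels:
                    continue  # would over-discharge or over-charge
                p_batt = demand[t] - p_sc  # battery covers the rest
                cost = p_batt ** 2 + value[e_next]
                if cost < new_value[e]:
                    new_value[e] = cost
                    best[e] = p_sc
        value = new_value
        policy.append(best)
    policy.reverse()  # index policy by time step
    return value, policy


# Two-step cycle: idle, then a demand peak of 4 units.
value, policy = dp_power_split(demand=[0, 4], n_levels=2, p_sc_max=2)
print(value[0], policy[0][0], policy[1][2])
```

In this toy case the table tells an empty supercapacitor to pre-charge during the idle step and discharge during the peak, which lowers the peak battery power. A table like `policy` is the kind of state-to-action mapping an imitation stage could use as expert demonstrations.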
English Abstract
Deep reinforcement learning has become a promising method for the energy management of electric vehicles. However, it relies on a large amount of trial-and-error training to reach near-optimal performance. An adversarial imitation reinforcement learning energy management strategy is proposed for electric vehicles with a hybrid energy storage system to minimize the cost of battery capacity loss. First, the reinforcement learning exploration is guided by expert knowledge, which is generated by dynamic programming under various standard driving conditions. The expert knowledge is represented as the optimal power allocation mapping. Second, in the early imitation stage, the action of the reinforcement learning agent rapidly approaches the optimal power allocation mapping through the use of adversarial networks. Third, a dynamic imitation weight is developed from the output of the discriminator in the adversarial networks, so that the agent transitions to exploring the near-optimal power allocation on its own under online driving conditions. Results demonstrate that the proposed strategy accelerates training by 42.60% while increasing the reward by 15.79% compared with traditional reinforcement learning. Under different test driving cycles, the proposed method further reduces the battery capacity loss cost by 5.1%–12.4%.
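The abstract does not give the formula for the dynamic imitation weight, so the following is only a sketch of one way such a weight could be derived from a discriminator output. The linear schedule, the names `imitation_weight` and `blended_reward`, and the log-based imitation reward are assumptions for illustration, not the authors' design. It also assumes the common convention that the discriminator outputs the probability that a (state, action) pair came from the expert.

```python
import math


def imitation_weight(d_out: float, w_max: float = 1.0) -> float:
    """Hypothetical schedule for the imitation weight.

    d_out is the discriminator's probability that the agent's action
    came from the expert. Near 0, the agent is easy to tell apart from
    the expert, so imitation dominates. At 0.5 or above, the agent is
    indistinguishable from the expert, so the weight drops to zero and
    the agent relies on its own environment reward.
    """
    d = min(max(d_out, 0.0), 1.0)  # clamp to a valid probability
    return w_max * max(0.0, 1.0 - 2.0 * d)


def blended_reward(r_env: float, d_out: float, eps: float = 1e-8) -> float:
    """Mix the environment reward with an assumed imitation reward."""
    w = imitation_weight(d_out)
    r_imit = math.log(max(d_out, eps))  # higher when the agent looks expert-like
    return (1.0 - w) * r_env + w * r_imit


# The weight decays as the discriminator is fooled more often.
for d in (0.0, 0.25, 0.5):
    print(d, imitation_weight(d), blended_reward(-2.0, d))
```

Tying the weight to the discriminator rather than to a fixed episode count means the handover from imitation to self-exploration follows how closely the agent already matches the expert. That is the general behavior the abstract describes, though the actual functional form in the paper may differ.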
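SunView In-Depth Analysis
This adversarial imitation reinforcement learning strategy is highly relevant to Sungrow's hybrid energy storage systems. It could be applied to battery-supercapacitor hybrid storage optimization in the ST-series PCS; in the paper, reinforcement learning guided by expert knowledge speeds up training by 42.6% and reduces the battery capacity loss cost by 5.1%–12.4%. The technique could be integrated into the iSolarCloud platform for power allocation that adapts to online operating conditions, extending the battery life of PowerTitan energy storage systems. It is also applicable to coordinated energy storage control at charging stations, improving the life-cycle economics and the intelligent operation and maintenance of Sungrow's energy storage products.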