AdapSafe2: Prior-Free Safe-Certified Reinforcement Learning for Multi-Area Frequency Control
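| Authors | Xu Wan · Mingyang Sun |
| Journal | IEEE Transactions on Power Systems |
| Publication date | October 2024 |
| Technical category | Energy storage system technology |
| Technical tags | Energy storage systems, reinforcement learning |
| Relevance score | ★★★★★ 5.0 / 5.0 |
| Keywords | Safe reinforcement learning, power system frequency control, AdapSafe2 method, non-stationary environments, safety constraints |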
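Chinese Abstract (translated)
With high penetration of renewable energy, safe reinforcement learning (RL) has been widely used for power system frequency control. However, existing methods still struggle to adapt to non-stationary environments and to satisfy high-dimensional, time-varying safety constraints. This paper proposes AdapSafe2, a multi-area frequency control method that requires no prior knowledge and provides safety guarantees. A meta-environment learning algorithm adaptively tracks changes in system parameters, and a meta-RL framework achieves model-free adaptive control. A safety-critic network and a safety compensator, both based on the control barrier function, dynamically identify high-risk areas and compensate only those areas, which improves solution efficiency under high-dimensional constraints. Simulations on 2-area and 3-area low-inertia systems verify the method's superior performance under dynamic safety constraints.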
English Abstract
Safe Reinforcement Learning (RL) has been widely investigated for power system frequency control under high penetration of renewable energy resources. Nevertheless, existing safe RL-based frequency control methods still face two fundamental challenges to achieving safety guarantees: (1) operating in non-stationary environments without prior knowledge of the system parameters and (2) simultaneously satisfying high-dimensional and time-varying safety constraints in multi-area cases. To this end, this paper proposes AdapSafe2, a prior-free reinforcement learning-based frequency control method with guaranteed safety for multi-area power systems. To tackle Challenge (1), a meta-based environmental learning algorithm is developed to automatically capture and rapidly adapt to non-stationary system parameters without relying on a predefined nominal model. Furthermore, a meta-RL framework is established to achieve a self-adaptive frequency control strategy without prior knowledge. For Challenge (2), a novel safety-critic network and a safe-certified compensator based on the control barrier function are designed to identify time-varying safety constraints. Leveraging risk assessments from the safety-critic network, the compensator performs dynamic safety compensation only for areas at risk, thereby making the high-dimensional safety-constrained problem more efficient to solve. Numerical simulations conducted on 2-Area and 3-Area wind-aggregated low-inertia power systems demonstrate that the proposed AdapSafe2 can outperform state-of-the-art approaches while effectively satisfying the dynamic safety constraints.
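The idea of compensating only the areas flagged as risky can be illustrated with a toy sketch. This is not the paper's algorithm: the scalar per-area model `x' = a*x + b*u`, the `Area` fields, the `risk_score` heuristic (a stand-in for the learned safety-critic network), and the `gamma` and `risk_threshold` values are all assumptions made for illustration. With one scalar input per area, the discrete-time control barrier function condition `h(x') >= (1 - gamma) * h(x)` reduces to an interval on `u`, so the minimal correction is a simple clip.

```python
from dataclasses import dataclass


@dataclass
class Area:
    a: float      # assumed state coefficient of the toy model x' = a*x + b*u
    b: float      # assumed control gain (taken to be positive)
    f_max: float  # assumed bound on the absolute frequency deviation


def risk_score(area: Area, x: float) -> float:
    """Hypothetical stand-in for a safety critic: share of the margin used."""
    return abs(x) / area.f_max


def cbf_interval(area: Area, x: float, gamma: float) -> tuple[float, float]:
    """Controls satisfying h(x') >= (1 - gamma) * h(x) for both barriers
    h_plus = f_max - x and h_minus = f_max + x."""
    hi_next = gamma * area.f_max + (1.0 - gamma) * x   # upper bound on x'
    lo_next = -gamma * area.f_max + (1.0 - gamma) * x  # lower bound on x'
    return (lo_next - area.a * x) / area.b, (hi_next - area.a * x) / area.b


def compensate(areas, states, actions, gamma=0.5, risk_threshold=0.6):
    """Clip the RL action into the CBF interval, but only for risky areas."""
    safe_actions = []
    for area, x, u in zip(areas, states, actions):
        if risk_score(area, x) < risk_threshold:
            safe_actions.append(u)  # low risk: leave the RL action untouched
            continue
        u_lo, u_hi = cbf_interval(area, x, gamma)
        safe_actions.append(min(max(u, u_lo), u_hi))
    return safe_actions


# Two hypothetical areas: the first is far from its bound, the second is close.
areas = [Area(a=1.0, b=0.1, f_max=0.5), Area(a=1.0, b=0.1, f_max=0.5)]
states = [0.1, 0.45]
rl_actions = [1.0, 1.0]
safe = compensate(areas, states, rl_actions)
next_states = [ar.a * x + ar.b * u for ar, x, u in zip(areas, states, safe)]
```

In this sketch the first area's action passes through unchanged, while the second area's action is reduced so that its next state stays inside the bound. Skipping low-risk areas saves work but, in this toy form, gives no guarantee for them; the formal safety guarantee claimed in the abstract is not reproduced here.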
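SunView In-Depth Analysis
This prior-free safe reinforcement learning technique has significant application value for the frequency regulation functions of Sungrow's PowerTitan energy storage system and ST-series energy storage converters. AdapSafe2's meta-learning adaptive framework can enhance the dynamic response of energy storage systems in scenarios with high renewable penetration, and its control barrier function safety mechanism can ensure safe operation under multi-dimensional constraints such as SOC and power. The technique can be combined with Sungrow's existing VSG (virtual synchronous generator) control and GFM (grid-forming) control to improve primary frequency regulation performance in interconnected multi-area grids. It is particularly suited to the intelligent dispatch module of the iSolarCloud platform, where it would enable model-free adaptive coordinated control of energy storage clusters in non-stationary grid environments, reduce dependence on accurate system models, and improve the safety and economy of large-scale storage participation in grid frequency regulation.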