
Risk-Based Dispatch of Power Systems Incorporating Spatiotemporal Correlation Based on the Robust Soft Actor-Critic Algorithm

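Authors: Jianbing Feng · Zhouyang Ren · Wenyuan Li
Journal: IEEE Transactions on Power Systems
Publication date: November 2024
Technical category: Wind power converter technology
Technical tags: Energy storage systems, reinforcement learning
Relevance score: ★★★★★ 5.0 / 5.0
Keywords: safe deep reinforcement learning; spatiotemporally correlated dispatch method; robust soft actor-critic algorithm; robust constrained Markov decision process; conditional value-at-risk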

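Abstract (translated from Chinese)

Building on safe deep reinforcement learning (SDRL), this paper proposes a risk-based dispatch method that accounts for spatiotemporal correlation (SC-RD), jointly modeling the temporal correlation of violation risks and the spatial correlation of wind power uncertainty. To solve it, the authors design a new robust soft actor-critic (R-SAC) algorithm that handles the nonlinear, nonconvex, integral-form SC-RD model online, without approximations or assumptions about the uncertainty distribution. A robust constrained Markov decision process (R-CMDP) is constructed in which the violation risk serves as the agent's exploration cost, and the conditional value-at-risk (CVaR) of that cost is the risk indicator for safe exploration. A second-order central moment evaluation module estimates the CVaR efficiently, and accelerated primal-dual optimization is incorporated to achieve maximum-entropy adaptive learning. Simulations on the IEEE-39, IEEE-118, and South Carolina 500-bus systems validate the proposed model and method.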

English Abstract

Based on safe deep reinforcement learning (SDRL), this paper presents a risk-based dispatch method that incorporates spatiotemporal correlation (SC-RD). The SC-RD model considers both the temporal correlation of violation risks and the spatial correlation of wind power uncertainties. A novel robust soft actor-critic (R-SAC) algorithm based on SDRL is presented to solve the SC-RD model efficiently. The algorithm enables online decision-making that copes with the nonlinearity, nonconvexity, and integral form of the SC-RD model without any approximations or assumptions about the uncertainty distribution. Within the R-SAC, a robust constrained Markov decision process (R-CMDP) for the SC-RD is established to address the critical bottleneck of SDRL in handling constraints. In the R-CMDP, the violation risks are treated as the exploratory cost of the agent, and the conditional value-at-risk (CVaR) of that cost is used as the risk indicator for safe exploration within the feasible region of the SC-RD. A second-order central moment evaluation module is presented to estimate the CVaR efficiently. An accelerated primal-dual optimization approach is integrated into the R-SAC to drive the R-CMDP efficiently toward maximum-entropy adaptive learning. The effectiveness of the proposed model and solution method is validated on modified IEEE-39, IEEE-118, and South Carolina 500-bus test systems.
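To make the CVaR and primal-dual ideas concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not spell out the second-order central moment module or the accelerated primal-dual scheme. The sketch assumes (a) a plain sample-based CVaR, (b) a Gaussian approximation that turns a cost mean and variance (the second central moment) into a closed-form CVaR, and (c) an ordinary, non-accelerated projected dual-ascent step on a Lagrange multiplier. All function names, the `risk_budget` parameter, and the sample costs are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pvariance


def empirical_cvar(costs, alpha):
    """Sample CVaR: mean of the worst (1 - alpha) fraction of costs."""
    ordered = sorted(costs)
    tail = max(1, math.ceil((1.0 - alpha) * len(ordered)))  # tail sample count
    return fmean(ordered[-tail:])


def gaussian_cvar(mean, variance, alpha):
    """CVaR of a cost ASSUMED to be normally distributed.

    Uses mean + std * pdf(z_alpha) / (1 - alpha), where z_alpha is the
    alpha-quantile of the standard normal distribution.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(alpha)
    return mean + math.sqrt(variance) * std_normal.pdf(z) / (1.0 - alpha)


def dual_ascent_step(multiplier, cvar, risk_budget, step_size):
    """Plain projected dual ascent (an assumption, not the accelerated variant).

    The multiplier grows while CVaR exceeds the budget and is clipped at zero.
    """
    return max(0.0, multiplier + step_size * (cvar - risk_budget))


# Hypothetical per-step violation costs collected by an agent.
costs = [0.0, 0.1, 0.0, 0.4, 0.2, 1.3, 0.0, 0.6]
mu, var = fmean(costs), pvariance(costs)  # first moment, second central moment
print(empirical_cvar(costs, 0.75), gaussian_cvar(mu, var, 0.75))
```

The moment-based form only needs a running mean and variance of the cost, which is one plausible reason a second-moment estimator is cheaper than sorting samples; whether the paper's module uses this particular closed form is not stated in the abstract.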

SunView In-Depth Analysis

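The spatiotemporally correlated risk-based dispatch method proposed in this study has significant application value for Sungrow's energy storage and wind power products. The R-SAC algorithm could optimize dispatch strategies for ST-series energy storage converters and improve the operating stability of PowerTitan large-scale energy storage systems in wind power scenarios. Specifically: (1) it can be applied to dispatch optimization in the EMS of energy storage plants, leading to better-justified storage capacity sizing; (2) it can be integrated into the iSolarCloud platform to enable intelligent, coordinated dispatch of storage and wind power; (3) its CVaR-based risk assessment can help improve the economics and safety of energy storage systems. The technique offers fresh ideas for refining the intelligent dispatch algorithms in Sungrow's energy storage products and could raise the operating returns of large-scale storage systems.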