Robust deep reinforcement learning for inverter-based volt-var control in partially observable distribution networks
| Authors | Qiong Liu · Ye Guo · Tong Xu |
| Journal | Applied Energy |
| Publication date | January 2025 |
| Volume/Issue | Volume 399 |
| Technical category | Energy storage system technology |
| Technical tags | Energy storage systems, reinforcement learning |
| Relevance score | ★★★★★ 5.0 / 5.0 |
| Keywords | We propose a conservative critic to estimate the uncertainty of the state-action value function based on quantile regression technology. |
Abstract
Inverter-based Volt-Var control plays a vital role in regulating voltage and minimizing power loss in active distribution networks (ADNs). However, a key challenge in applying deep reinforcement learning (DRL) to this task is the limited deployment of measurement devices in ADNs, which leads to partially observable states and unknown rewards. To address these problems, this paper proposes a robust DRL approach with a conservative critic and a surrogate reward. The conservative critic uses quantile regression to estimate a conservative state-action value function from the partially observable state, which helps to train a robust policy. The surrogate rewards for power loss and voltage violation are designed so that they can be calculated from the limited measurements. The proposed approach optimizes the power loss of the whole network and the voltage profile of buses with measurable voltages, while indirectly improving the voltage profile of the other buses. Extensive simulations verify the effectiveness of the robust DRL approach under different limited-measurement conditions, even when only the active power injection of the root bus and less than 10% of bus voltages are measurable.
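To make the idea of a conservative critic more concrete, the sketch below shows quantile regression in its simplest form: a constant predictor fitted with the pinball loss. It is not the paper's implementation, which the abstract does not spell out. The function names, the quantile level `tau = 0.25`, and the two sample lists are all hypothetical. The samples stand in for value targets that the same partial observation could produce under different unobserved network states.

```python
from statistics import mean
from typing import Sequence


def pinball_loss(pred: float, target: float, tau: float) -> float:
    """Quantile-regression (pinball) loss of one prediction at level tau."""
    diff = target - pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def conservative_value(samples: Sequence[float], tau: float = 0.25) -> float:
    """Return the sample that minimizes the mean pinball loss.

    For a constant predictor this is an empirical tau-quantile, so a low
    tau yields a pessimistic (conservative) value estimate.
    """
    def mean_loss(q: float) -> float:
        return sum(pinball_loss(q, y, tau) for y in samples) / len(samples)

    # A minimizer of the summed pinball loss always lies on a sample point.
    return min(samples, key=mean_loss)


# Hypothetical value targets for two candidate actions (illustrative only).
safe_action = [-1.2, -1.1, -1.1, -1.0, -1.0, -1.0, -1.0, -0.9, -0.9, -0.8]
risky_action = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -2.0, -3.0, -3.0]

for name, targets in (("safe", safe_action), ("risky", risky_action)):
    print(name, "mean:", mean(targets), "conservative:", conservative_value(targets))
```

In this toy data the risky action has the higher mean, but the 0.25-quantile ranks the safe action higher. This is the kind of pessimism a conservative critic is meant to introduce. In a DRL setting the constant predictor would be replaced by a neural network trained on the same loss with stochastic gradient descent.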
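SunView In-Depth Analysis
This robust deep reinforcement learning method has significant application value for Volt-Var control in Sungrow's ST-series energy storage converters and SG-series PV inverters. With its conservative critic and surrogate reward mechanism, it can regulate voltage and optimize network losses when distribution-network measurements are limited (less than 10% of bus voltages measurable, plus the active power injection at the root bus), which matches the constraints of real engineering deployments. The technique can strengthen the adaptive control capability of PowerTitan energy storage systems in partially observable environments, improve the robustness of intelligent operation and maintenance decisions on the iSolarCloud platform, offer a new approach to model-free optimal control of distributed PV-plus-storage systems, and reduce the cost of depending on network-wide communication and measurement equipment.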
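The surrogate-reward idea can be sketched in a few lines. The source does not give the paper's formulas, so the function below is only one plausible construction under stated assumptions. It assumes active loads and generation are unaffected by reactive power control, so a lower active power injection at the root bus corresponds to lower network loss. It also penalizes voltage violations only at buses that have measurements. The name `surrogate_reward`, the 0.95 to 1.05 p.u. voltage band, and the penalty weight are hypothetical.

```python
from typing import Sequence


def surrogate_reward(
    root_p_mw: float,
    measured_voltages_pu: Sequence[float],
    v_min: float = 0.95,   # assumed lower voltage limit (p.u.)
    v_max: float = 1.05,   # assumed upper voltage limit (p.u.)
    penalty: float = 10.0, # assumed weight on voltage violations
) -> float:
    """Reward computed only from quantities that are actually measured."""
    # Power-loss surrogate: lower root-bus active injection is better.
    loss_term = -root_p_mw
    # Voltage-violation surrogate: total excursion outside [v_min, v_max]
    # summed over the measured buses only.
    violation = sum(
        max(0.0, v - v_max) + max(0.0, v_min - v)
        for v in measured_voltages_pu
    )
    return loss_term - penalty * violation


# Two measured buses inside the band: only the loss term contributes.
print(surrogate_reward(2.0, [1.00, 1.02]))
# One bus above and one below the band: the reward drops.
print(surrogate_reward(2.0, [1.06, 0.94]))
```

Unmeasured buses do not appear in this reward at all. Any improvement there would be indirect, which is consistent with how the abstract describes the effect on buses without voltage measurements.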