Deep Reinforcement Learning Based Multi-Timescale Volt/Var Control in Distribution Networks Considering Network Reconfiguration

Original article · IEEE Xplore DOI: 10.1109/TSE.2025.11017695

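Authors	Hexiang Peng · Kai Liao · Jianwei Yang · Bo Pang · Zhengyou He
Journal	IEEE Transactions on Sustainable Energy
Publication date	May 2025
Technical category	Photovoltaic power generation technology
Technical tags	Energy storage systems · Deep learning · Reinforcement learning
Relevance score	★★★★★ 5.0 / 5.0
Keywords	Multi-timescale Volt/Var control · Bi-level data-driven method · Continuous and discrete device control · Reinforcement learning algorithms · Robustness to topological changes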

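Chinese Abstract (translated)

To address the multi-timescale Volt/Var control (VVC) problem posed by devices with different response characteristics in distribution networks, this paper proposes a novel bi-level, data-driven multi-timescale VVC method. The method coordinates short-timescale control of continuous devices, such as photovoltaics, with long-timescale control of discrete devices, such as capacitor banks, and of network reconfiguration, and formulates the problem as a bi-level partially observable Markov decision process (POMDP). The inner level uses the TD3 algorithm to control continuous variables, while the outer level uses the DDQN algorithm to handle discrete actions and network reconfiguration. Collaborative training is achieved by unifying the reward signal and passing inner-level actions to the outer level as state. A graph neural network (GNN) is introduced to identify representative topologies, which shrinks the reconfiguration space and suppresses over-exploration. Simulations on the IEEE 33-bus and PG&E 69-bus systems verify the method's superior VVC performance and its robustness to topological changes.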

English Abstract

Coordinating Volt/Var control (VVC) across multiple timescales in distribution networks is challenging due to the diverse response characteristics of control devices. This paper proposes a novel bi-level data-driven multi-timescale VVC method to achieve coordinated control. The method integrates short-timescale control of continuous devices, such as photovoltaics, with longer-timescale control of discrete devices, including capacitor banks, and network reconfiguration. The VVC problem is formulated as a bi-level partially observable Markov decision process (POMDP). Inner-level control employs the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm for continuous devices, while outer-level control uses the Deep Double Q-Network (DDQN) algorithm for discrete devices and network reconfiguration. Collaborative training is achieved by aligning reward signals and providing inner-level agent actions as state information to outer-level agents. To mitigate over-exploration caused by network reconfiguration, graph neural networks (GNNs) are utilized to identify representative topologies, simplifying the reconfiguration space. The proposed method is validated on the IEEE 33-bus and PG&E 69-bus systems, demonstrating superior VVC performance and enhanced robustness to topological variations.
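The two-timescale interaction described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: `inner_policy` and `outer_policy` are hypothetical placeholders for a trained TD3 actor and a DDQN agent, `toy_step` is an invented one-variable dynamic, and the ratio of four inner steps per outer step is arbitrary. The only ideas taken from the abstract are that the outer agent acts less often than the inner agent, that the inner agent's actions are passed to the outer agent as state, and that both levels share an aligned reward (here, assumed to be the sum of the inner rewards).

```python
import random

def inner_policy(obs):
    """Placeholder for a TD3 actor: continuous action clipped to [-1, 1]."""
    return max(-1.0, min(1.0, -20.0 * obs["voltage_dev"]))

def outer_policy(state, n_actions, rng):
    """Placeholder for a DDQN agent: picks a discrete action index at random."""
    return rng.randrange(n_actions)

def toy_step(voltage_dev, q_action, discrete_action, rng):
    """Hypothetical dynamics: both actions and a small disturbance shift the deviation."""
    disturbance = rng.uniform(-0.002, 0.002)
    return voltage_dev + 0.01 * q_action + 0.005 * (discrete_action - 1) + disturbance

def run_episode(n_outer_steps=3, inner_per_outer=4, n_discrete_actions=3, seed=0):
    rng = random.Random(seed)
    voltage_dev = 0.05  # assumed initial voltage deviation in p.u.
    last_inner_actions = (0.0,) * inner_per_outer
    log = []
    for _outer in range(n_outer_steps):
        # Outer (slow) level: its state includes the latest inner-level actions.
        outer_state = {"voltage_dev": voltage_dev, "inner_actions": last_inner_actions}
        discrete_action = outer_policy(outer_state, n_discrete_actions, rng)
        inner_actions, inner_rewards = [], []
        for _inner in range(inner_per_outer):
            # Inner (fast) level: acts every step while the discrete action is held.
            obs = {"voltage_dev": voltage_dev, "discrete_action": discrete_action}
            q_action = inner_policy(obs)
            voltage_dev = toy_step(voltage_dev, q_action, discrete_action, rng)
            inner_actions.append(q_action)
            inner_rewards.append(-abs(voltage_dev))
        last_inner_actions = tuple(inner_actions)
        log.append({
            "outer_state": outer_state,
            "discrete_action": discrete_action,
            "inner_actions": last_inner_actions,
            "inner_rewards": inner_rewards,
            # Assumed reward alignment: outer reward is the sum of inner rewards.
            "outer_reward": sum(inner_rewards),
        })
    return log

if __name__ == "__main__":
    for entry in run_episode():
        print(entry["discrete_action"], round(entry["outer_reward"], 4))
```

In a real setup the placeholders would be replaced by learned networks and a power-flow environment; the loop structure is the part this sketch is meant to show.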

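SunView In-Depth Analysis

This multi-timescale Volt/Var control technique has significant application value for Sungrow's distribution-side products. In SG-series PV inverters, the TD3 algorithm could provide fast, continuous reactive power regulation and improve the coordination between the existing MPPT algorithm and reactive power control. In ST-series energy storage converters and the PowerTitan large-scale energy storage system, the DDQN algorithm could coordinate storage charging and discharging with the discrete control decisions of capacitor banks. The method's bi-level POMDP architecture offers a framework for Sungrow to develop an intelligent coordinated controller for distribution networks, which could be integrated into the iSolarCloud cloud platform for coordinated multi-device optimization. In particular, the idea of using GNNs to identify representative topologies could make products more adaptable to network reconfiguration scenarios and strengthen Sungrow's competitiveness in integrated source-grid-load-storage solutions.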
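One practical detail behind continuous reactive power regulation is that an inverter's reactive headroom depends on how much active power it is currently producing. The sketch below is an assumption for illustration only: it is neither the paper's formulation nor any vendor API, and the function names and ratings are hypothetical. It uses the standard apparent-power limit, Q_max = sqrt(S² − P²), to map a normalized actor output in [-1, 1] to a reactive power setpoint.

```python
import math

def reactive_power_limit(s_rated_kva, p_kw):
    """Reactive headroom (kvar) under the apparent-power limit S^2 >= P^2 + Q^2."""
    headroom_sq = s_rated_kva ** 2 - p_kw ** 2
    return math.sqrt(max(headroom_sq, 0.0))

def action_to_q_setpoint(action, s_rated_kva, p_kw):
    """Map a normalized action in [-1, 1] to a feasible reactive setpoint (kvar)."""
    clipped = max(-1.0, min(1.0, action))
    return clipped * reactive_power_limit(s_rated_kva, p_kw)

if __name__ == "__main__":
    # Hypothetical 100 kVA inverter producing 60 kW.
    print(reactive_power_limit(100.0, 60.0))       # 80.0
    print(action_to_q_setpoint(0.5, 100.0, 60.0))  # 40.0
```

Scaling the action by the current headroom keeps every setpoint feasible without curtailing active power, which is one simple way a continuous controller can coexist with MPPT.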