
Knowledge-Augmented Population-Based Deep Reinforcement Learning for Real-Time Network-Constrained Economic Dispatch of Large-Scale Power Grid

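Authors: Yixi Chen · Jizhong Zhu · Hanjiang Dong · Cong Zeng · Zhenning Pan
Journal: IEEE Transactions on Power Systems
Publication date: June 2025
Technical category: Energy storage system technology
Technical tags: Energy storage systems · SiC devices · Reinforcement learning
Relevance score: ★★★★★ 5.0 / 5.0
Keywords: Deep reinforcement learning · Real-time network-constrained economic dispatch · Population-based deep reinforcement learning · Sequential safe projection technique · Large-scale power grid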

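Abstract (translated from Chinese)

In recent years, deep reinforcement learning (DRL) has been widely applied to real-time network-constrained economic dispatch (NCED) because of its strengths in online look-ahead decision-making and in handling uncertainty. However, conventional DRL methods are limited in computational efficiency and parallelism, which makes them hard to apply to large-scale grid environments. To address this, the paper proposes a new knowledge-augmented population-based deep reinforcement learning (PDRL) method. PDRL explores by perturbing the agent's parameters to generate a population, then aggregates the individuals' results to construct a surrogate gradient for updating the model, which gives it efficient exploration and high parallelism. Drawing on physical knowledge of the power grid, a sequential safe projection (S2P) technique is proposed that significantly improves decision safety and reduces training difficulty. Simulations on 39-bus, 500-bus, and 2383-bus systems show that the proposed method outperforms current state-of-the-art DRL methods in solution optimality, decision safety, computational performance, and adaptability to large-scale grids.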

English Abstract

Deep reinforcement learning (DRL) for real-time network-constrained economic dispatch (NCED) has attracted a surge of interest in recent years because of its advantages in online look-ahead decision-making and its adaptability to uncertainties. However, conventional DRL methods struggle to adapt to large-scale grid environments because of inherent limitations in computational efficiency and parallelizability. To fill these gaps, this paper proposes a novel knowledge-augmented population-based deep reinforcement learning (PDRL) method for large-scale real-time NCED problems. PDRL perturbs the agent's parameters to generate a population for exploration and aggregates all individuals' results to construct a surrogate gradient for updates. This population-based parameter-space exploration paradigm provides efficient exploration and high parallelizability. Moreover, leveraging physical knowledge of the power grid, a novel sequential safe projection (S2P) technique is developed, which significantly enhances the safety of the agent's decisions and alleviates training difficulty, allowing better adaptation to large-scale power grids. Numerical simulations on 39-bus, 500-bus, and 2383-bus systems demonstrate that, compared with state-of-the-art (SOTA) DRL methods, the proposed method achieves better solution optimality, decision security, computational performance, and adaptability to large-scale power grids.
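
The population-based update described in the abstract (perturb the parameters, evaluate each individual, aggregate the results into a surrogate gradient) can be illustrated with a minimal evolution-strategies-style sketch. This is not the paper's implementation: the function names `population_update` and `toy_return`, the Gaussian perturbation, the mean-baseline aggregation, and all hyperparameter values are assumptions chosen for illustration, and the toy return function merely stands in for an episode return from a dispatch simulator.

```python
import numpy as np


def population_update(theta, return_fn, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One parameter-space exploration step (illustrative, hypothetical API).

    theta     : 1-D array of agent parameters
    return_fn : maps a parameter vector to a scalar return (higher is better)
    rng       : numpy.random.Generator used to draw the perturbations
    """
    # Generate the population by perturbing the current parameters.
    noise = rng.standard_normal((pop_size, theta.size))
    returns = np.array([return_fn(theta + sigma * eps) for eps in noise])

    # Subtract the population mean as a baseline to reduce variance.
    centered = returns - returns.mean()

    # Aggregate all individuals into a surrogate gradient estimate.
    surrogate_grad = noise.T @ centered / (pop_size * sigma)

    # Gradient ascent on the expected return.
    return theta + lr * surrogate_grad


# Toy stand-in for a dispatch episode return (assumption): the return is
# highest when the parameters match a hypothetical optimum.
TARGET = np.array([1.0, -2.0, 0.5])


def toy_return(params):
    return -float(np.sum((params - TARGET) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.zeros(3)
    for _ in range(200):
        theta = population_update(theta, toy_return, rng)
    print("distance to optimum:", np.linalg.norm(theta - TARGET))
```

Each individual's return is evaluated independently, so the list comprehension over the population is the part that could be spread across parallel workers, which is consistent with the parallelizability the abstract attributes to this family of methods.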

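SunView In-Depth Analysis

This knowledge-augmented population-based deep reinforcement learning technique is highly relevant to Sungrow's PowerTitan large-scale energy storage system and its iSolarCloud cloud platform. On the storage side, PDRL's high parallelizability and the sequential safe projection technique could optimize the real-time dispatch strategy of ST-series energy storage converters, so that large-scale storage plants make economically optimal charge and discharge decisions under grid constraints and become more grid-friendly. On the PV side, the method could be integrated into the intelligent dispatch module of the iSolarCloud platform to coordinate the power output of distributed SG inverter fleets and cope with the uncertainty of renewable generation. Compared with conventional optimization algorithms, PDRL's online look-ahead decision-making could significantly improve the dynamic response of Sungrow's grid-forming (GFM) energy storage systems, provide a millisecond-level real-time optimization scheme for coordinated source-grid-load-storage control, and strengthen the company's technical competitiveness in smart energy management.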