Scenario-Generalized Multi-Stage Dynamic Programming for Online Dispatch of Distribution Networks via Universal Value Function Learning

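Authors	Zhenning Pan · Yukun Deng · Tao Yu · Yufeng Wu · Junbin Chen · Yan Xu · Zhao Yang Dong
Journal	IEEE Transactions on Power Systems
Publication date	August 2025
Volume/Issue	Vol. 41, No. 1
Technical category	Intelligence and AI Applications
Technical tags	Reinforcement learning · Model predictive control (MPC) · Microgrid · Grid-connected inverter
Relevance score	★★★★ 4.0 / 5.0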

Chinese Abstract (translated)

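This paper proposes a scenario-generalized multi-stage dynamic programming (S-MSDP) method. By learning a universal model that maps scenario contexts to value functions, it gives online dispatch of distribution networks a zero-shot adaptation capability: out-of-distribution uncertainty can be handled without re-training, which improves dispatch optimality, generalization, and scalability.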
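
To make the idea of a context-conditioned value function concrete, the recursion below is one plausible way to write it. The notation is an assumption for illustration and is not taken from the paper: $x_t$ is the network state at stage $t$, $u_t$ the dispatch decision, $\xi_{t+1}$ the uncertainty revealed after the decision, $c$ the scenario context, $P_c$ the uncertainty distribution associated with $c$, $g_t$ the stage cost, and $f_t$ the state transition.

```latex
% Context-conditioned Bellman recursion (illustrative notation, not the paper's)
V_t(x_t; c) = \min_{u_t \in \mathcal{U}_t(x_t)}
  \Big\{ g_t(x_t, u_t)
  + \mathbb{E}_{\xi_{t+1} \sim P_c}
    \big[ V_{t+1}\big(f_t(x_t, u_t, \xi_{t+1});\, c\big) \big] \Big\},
\qquad V_{T+1}(\cdot\,; c) \equiv 0 .

% Online dispatch under an observed context c: reuse the learned V, no re-training
u_t^{*}(x_t; c) \in \arg\min_{u_t \in \mathcal{U}_t(x_t)}
  \Big\{ g_t(x_t, u_t)
  + \mathbb{E}_{\xi_{t+1} \sim P_c}
    \big[ V_{t+1}\big(f_t(x_t, u_t, \xi_{t+1});\, c\big) \big] \Big\}.
```

Under this reading, a conventional MSDP scheme learns $V_t(\cdot)$ for a single fixed $P_c$, while a universal value function treats $c$ as an additional input, so that a decision depends only on information available at stage $t$ and on the current context.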

English Abstract

This paper studies the online dispatch of distribution networks (DNs), which is formulated as a multi-stage dynamic programming (MSDP) problem to ensure the non-anticipativity of dispatch decisions. Existing approaches usually relegate expensive online optimization to offline learning (typically value function learning) using the given uncertainty distribution or training samples. However, practical DNs may encounter various scenarios where the distributions of uncertainty differ significantly. The optimality of these approaches may degrade substantially in out-of-distribution scenarios unless frequent re-training is conducted. To address this obstacle, this paper proposes a scenario-generalized MSDP (S-MSDP) scheme for online dispatch of DNs. Its main advantage is the ability to directly adapt to new scenarios with high optimality, without re-training or fine-tuning. S-MSDP extends MSDP by learning a universal value function that maps scenario contexts to the corresponding value functions, so that the optimal dispatch policies under different scenarios can be directly inferred by using the learned universal value function. To alleviate the computation and storage burdens brought by the large scenario space, a sparse and low-rank tensor approximation is introduced for universal value function learning. Numerical studies verify the optimality, generalization, and scalability of S-MSDP.
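
The following is a minimal sketch of the general idea, not the paper's algorithm. It assumes a toy single-storage problem with a deterministic price profile per scenario (the paper treats uncertainty distributions), a hypothetical two-parameter context (base price, swing amplitude), and a plain truncated SVD with a linear context map standing in for the sparse and low-rank tensor approximation. All function and variable names are illustrative.

```python
import numpy as np

def scenario_prices(context, T):
    """Hypothetical scenario model: context = (base price, swing amplitude)."""
    base, amp = context
    t = np.arange(T)
    return base + amp * np.sin(2 * np.pi * t / T)

def backward_dp(prices, levels, max_step=1):
    """Exact backward recursion for a toy storage unit under one price profile.

    V[t, s] is the minimum cost-to-go from stage t at storage level index s.
    """
    T, S = len(prices), len(levels)
    V = np.zeros((T + 1, S))  # terminal cost-to-go is zero
    for t in range(T - 1, -1, -1):
        for s in range(S):
            lo, hi = max(0, s - max_step), min(S - 1, s + max_step)
            # Energy bought (>0) or sold (<0) when moving to each reachable level
            energy = levels[lo:hi + 1] - levels[s]
            V[t, s] = np.min(prices[t] * energy + V[t + 1, lo:hi + 1])
    return V

def fit_universal_value(contexts, value_tables, rank):
    """Low-rank factorisation of stacked value tables plus an affine context map."""
    C = len(contexts)
    M = value_tables.reshape(C, -1)              # one flattened value table per scenario
    U, sing, Vt = np.linalg.svd(M, full_matrices=False)
    coeffs = U[:, :rank] * sing[:rank]           # scenario coefficients, shape (C, rank)
    basis = Vt[:rank]                            # shared value-function basis
    X = np.hstack([contexts, np.ones((C, 1))])   # affine features of the context
    W, *_ = np.linalg.lstsq(X, coeffs, rcond=None)
    return W, basis

def predict_value(context, W, basis, shape):
    """Infer the value table of an unseen context without re-solving the DP."""
    x = np.append(context, 1.0)
    return (x @ W @ basis).reshape(shape)

# Offline: solve the DP for sampled training scenarios and fit the universal model
levels = np.linspace(0.0, 4.0, 5)
T = 24
rng = np.random.default_rng(0)
train_contexts = np.column_stack([rng.uniform(20, 60, 30), rng.uniform(5, 20, 30)])
tables = np.stack([backward_dp(scenario_prices(c, T), levels) for c in train_contexts])
W, basis = fit_universal_value(train_contexts, tables, rank=3)

# Online: a context not in the training set, compared against the exact DP solution
new_context = np.array([45.0, 12.0])
approx = predict_value(new_context, W, basis, tables.shape[1:])
exact = backward_dp(scenario_prices(new_context, T), levels)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print(f"relative error on unseen context: {rel_err:.3f}")
```

Storing only `W` and `basis` instead of one value table per scenario is what keeps memory from growing with the number of scenarios in this sketch; how well the inferred table matches the exact one depends on the chosen rank and on how smoothly the value function varies with the context.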

SunView In-Depth Analysis

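This research is highly relevant to real-time coordinated dispatch across Sungrow's iSolarCloud intelligent O&M platform and the PowerTitan/ST-series energy storage PCS. Its universal value function framework could be embedded in iSolarCloud's AI dispatch engine to improve the adaptive decision-making of PV-plus-storage systems in unseen scenarios such as abrupt changes in load or renewable output. The recommendation is to integrate an S-MSDP algorithm module into PowerTitan cluster dispatch, which would strengthen robust online optimization in microgrid and commercial and industrial PV-plus-storage projects, reduce dependence on historical data, and shorten deployment cycles.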