SWAPP: Swarm precision policy optimization with dynamic action bound adjustment for energy management in smart cities
| Authors | Chia E. Tungo · Ben Niu · Hong Wang |
| Journal | Applied Energy |
| Publication date | January 2025 |
| Volume/Issue | Volume 377 |
| Technical category | Energy storage system technology |
| Technical tags | Energy storage systems |
| Relevance score | ★★★★★ 5.0 / 5.0 |
| Keywords | Designed a framework with a swarm decision-making algorithm for energy management. |
Abstract
Energy storage systems are increasingly essential for aligning renewable energy generation with consumption peaks, reducing costs, and lowering carbon emissions. Managing energy systems in smart cities is increasingly challenging due to rising energy demand and the complexity of integrating renewable sources, which makes effective control of storage systems crucial. Rule-based controllers (RBCs) offer predefined solutions but lack adaptability. Swarm and evolutionary algorithms provide cost-effective, stable, and scalable solutions for decision-making and optimization, but their effectiveness in data-rich scenarios remains underexplored. This study presents a decision framework and a decision-making algorithm named SWAPP, which uses energy usage data to learn an energy management policy for urban energy systems with swarm intelligence (SI) agents. The core components of SWAPP are an unsupervised K-Means classifier for tailored decision-making, a reward mechanism that promotes strategic performance assessment through delayed feedback, a two-phase policy learning approach that combines cognitive and social learning for adaptive decision refinement, and an action bound adjustment mechanism that improves the precision of policy learning. A sequential batch sampling approach is also introduced, which makes swarm training feasible for decision problems over long horizons. The swarm-optimized policies are tested in buildings with different energy profiles, each equipped with photovoltaic (PV) panels and battery storage. Performance is evaluated on electricity cost, carbon emissions, and grid stability. In the standardized CityLearn building energy management environment, SWAPP outperforms modern reinforcement learning (RL) methods and RBCs, demonstrating state-of-the-art performance.
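The abstract does not give SWAPP's update rules, so the following is only a minimal sketch of the general idea it names: a swarm update with a cognitive term (each particle's own best) and a social term (the swarm's best), with candidate actions clipped to a bound that narrows over time. It uses the textbook particle swarm velocity update as a stand-in. The function name `swarm_optimize`, the coefficients, the `shrink` schedule, and the toy cost function are all assumptions made for illustration, not the authors' method.

```python
import random

def swarm_optimize(cost, dim, lo=-1.0, hi=1.0, particles=20, iters=60,
                   w=0.6, c1=1.5, c2=1.5, shrink=0.97, seed=0):
    """PSO-style search with a shrinking action window (illustrative only)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]                 # per-particle (cognitive) memory
    best_cost = [cost(p) for p in pos]
    g = min(range(particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]   # swarm-wide (social) memory
    half = (hi - lo) / 2.0                         # half-width of the action window

    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])  # cognitive pull
                             + c2 * r2 * (g_pos[d] - pos[i][d]))       # social pull
                # Assumed bound rule: a window centred on the swarm best,
                # always kept inside the hard limits [lo, hi].
                low = max(lo, g_pos[d] - half)
                high = min(hi, g_pos[d] + half)
                pos[i][d] = min(high, max(low, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
        half *= shrink                             # narrow the window each iteration
    return g_pos, g_cost

# Toy objective (hypothetical): distance of a 3-step battery action plan
# from a target of 0.3 per step, with actions limited to [-1, 1].
toy_cost = lambda x: sum((v - 0.3) ** 2 for v in x)
plan, plan_cost = swarm_optimize(toy_cost, dim=3)
```

Narrowing the window trades exploration for precision: early iterations search the full action range, while later ones refine actions near the best policy found so far. Because the window is re-centred on the swarm best each step, it can still follow an improving solution.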
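SunView In-Depth Analysis
The SWAPP swarm intelligence optimization algorithm has significant application value for Sungrow's ST-series energy storage converters and PowerTitan systems. Its dynamic action bound adjustment mechanism can optimize the charge and discharge strategy of a storage system, improving battery life and economic performance. The K-Means classifier, combined with the delayed reward mechanism, can adapt to different building energy profiles and offers algorithmic ideas for intelligent energy management on the iSolarCloud platform. Compared with conventional rule-based control, the method performs well in lowering electricity costs, reducing carbon emissions, and strengthening grid stability. It can be applied to EMS optimization in integrated PV-plus-storage projects and is particularly suited to multi-objective coordinated control in urban microgrid and commercial and industrial storage scenarios.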
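As a rough illustration of how a K-Means step could group buildings by energy profile before a policy is tailored to each group, the sketch below runs plain Lloyd's k-means on a few made-up daily load shapes. The features, the values, and the choice of two clusters are hypothetical; the source does not state which inputs the SWAPP classifier uses.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on equal-length lists of floats."""
    rng = random.Random(seed)
    centers = [p[:] for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its centre.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical load shapes in kW: (night, morning, midday, evening).
profiles = [
    [0.2, 0.5, 0.3, 1.5], [0.3, 0.6, 0.2, 1.4],   # evening-peaking buildings
    [0.2, 1.0, 1.6, 0.4], [0.1, 1.1, 1.5, 0.3],   # midday-peaking buildings
]
labels, centers = kmeans(profiles, k=2)
```

Each resulting cluster label could then select which learned policy, or which set of action bounds, is applied to a building, which is one plausible reading of "tailored decision-making".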