Journal of Shanghai University (Natural Science Edition) ›› 2024, Vol. 30 ›› Issue (3): 451-465. doi: 10.12066/j.issn.1007-2861.2569

Non-stationary wind velocity simulation using deep reinforcement learning-based regulation and control

CAO Liyuan (曹黎媛), ZHANG Zhenyu (张震雨), LI Chunxiang (李春祥)

  1. School of Mechanics and Engineering Sciences, Shanghai University, Shanghai 200444, China
  • Online: 2024-06-30  Published: 2024-07-09
  • Corresponding author: CAO Liyuan (born 1991), female, Ph.D.; research interests: structural vibration control and structural wind engineering. E-mail: caoly@shu.edu.cn
  • Funding:
    National Natural Science Foundation of China (52108460)

Abstract: A novel hybrid simulation method, DDPG-GST, is proposed that combines the deep deterministic policy gradient (DDPG) algorithm with the generalized S-transform (GST). First, empirical mode decomposition (EMD) is used to decompose the original data into a non-stationary fluctuating wind speed component and a trend component. The GST then extracts the time-frequency characteristics of the non-stationary fluctuating component, from which the GST time-frequency power spectrum matrix is constructed. Next, Cholesky decomposition is applied to this matrix to generate simulated non-stationary fluctuating wind speeds. These simulated values are fed into the DDPG network for regulation and control, which produces the optimal simulated values. Finally, the simulated non-stationary fluctuating wind speeds are superposed with the trend component to obtain the simulated total wind speed time history. The results show that, compared with the GST simulation method, the DDPG-GST method more accurately preserves the energy characteristics of the non-stationary fluctuating wind speed in the time domain; the energy distribution of the GST coefficient amplitudes obtained by DDPG-GST in the time-frequency domain is closer to the target; and the average power spectrum of the DDPG-GST method is also closer to the target. Non-stationary wind speed simulation regulated by deep reinforcement learning is therefore a high-precision, data-driven simulation method.

Key words: non-stationary wind speed simulation, deep reinforcement learning, S-transform, regulation and control
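
The time-frequency extraction step named in the abstract can be illustrated with a minimal sketch of a discrete S-transform computed with FFTs and generalized by two window parameters. The abstract does not state which GST parametrization the authors use, so everything here is an assumption made for illustration: the function name `generalized_s_transform`, the parameters `lam` and `p`, and the choice of a Gaussian window whose width scales as 1/(lam * n**p) in frequency-index units. With `lam = p = 1` the sketch reduces to the standard S-transform.

```python
import numpy as np


def generalized_s_transform(x, lam=1.0, p=1.0):
    """Sketch of a discrete generalized S-transform (assumed parametrization).

    Returns a complex array of shape (N // 2 + 1, N): row n is frequency
    bin n (n / (N * dt) Hz for a sampling step dt), column j is time sample j.
    lam and p are hypothetical tuning parameters; lam = p = 1 gives the
    standard S-transform.
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    X = np.fft.fft(x)
    # Signed frequency-shift indices in FFT order: 0, 1, ..., -2, -1.
    m = np.fft.fftfreq(N) * N
    S = np.zeros((N // 2 + 1, N), dtype=complex)
    S[0, :] = x.mean()  # the zero-frequency row is the signal mean
    for n in range(1, N // 2 + 1):
        # Fourier transform of the Gaussian window for frequency bin n.
        gauss = np.exp(-2.0 * np.pi**2 * m**2 / (lam**2 * n ** (2.0 * p)))
        # Shift the spectrum by n bins, apply the window, go back to time.
        S[n, :] = np.fft.ifft(np.roll(X, -n) * gauss)
    return S
```

A useful sanity check for any such implementation is that summing each frequency row over time returns the ordinary Fourier coefficient of that bin, because the Gaussian window has unit weight at zero shift. The squared magnitude `np.abs(S) ** 2` is one plausible basis for a time-frequency power spectrum, although the exact normalization used in the paper is not given in the abstract.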

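
The Cholesky-decomposition and superposition steps in the abstract can be sketched with the classical spectral representation formula applied to a time-varying power spectrum matrix. This is not the authors' implementation: the abstract gives no formulas, so the function name `simulate_fluctuating_wind`, the array layout, the assumption of a real symmetric positive definite one-sided spectrum matrix, and all numbers in the usage lines are hypothetical. The DDPG regulation stage is not shown.

```python
import numpy as np


def simulate_fluctuating_wind(psd, omega, t, rng=None):
    """Sketch: spectral-representation simulation from a time-frequency PSD.

    psd   : array (n_freq, n_time, n_pts, n_pts), one-sided time-frequency
            power spectrum matrices, assumed real symmetric positive definite.
    omega : array (n_freq,), equally spaced circular frequencies (rad/s).
    t     : array (n_time,), time instants (s).
    Returns an array (n_pts, n_time) of zero-mean simulated fluctuations.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_freq, n_time, n_pts, _ = psd.shape
    d_omega = omega[1] - omega[0]
    # Lower-triangular factors H with H @ H.T == psd at every (omega, t).
    H = np.linalg.cholesky(psd)
    # Independent uniform random phases, one per (source m, frequency l).
    phi = rng.uniform(0.0, 2.0 * np.pi, size=(n_pts, n_freq))
    # cos(omega_l * t_k + phi_ml), indexed [l, k, m].
    carrier = np.cos(omega[:, None, None] * t[None, :, None] + phi.T[:, None, :])
    # x_j(t_k) = sqrt(2 d_omega) * sum_l sum_m H_jm(omega_l, t_k) * cos(...)
    return np.sqrt(2.0 * d_omega) * np.einsum("lkjm,lkm->jk", H, carrier)


# Toy usage with made-up numbers (not data from the paper).
omega = np.linspace(0.1, 6.0, 60)
t = np.linspace(0.0, 60.0, 121)
# Hypothetical auto-spectrum whose energy varies slowly in time.
auto = (1.0 + 0.5 * np.sin(2 * np.pi * t / 60.0))[None, :] / (1.0 + omega[:, None] ** 2)
coh = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed constant real coherence
psd = auto[:, :, None, None] * coh
fluct = simulate_fluctuating_wind(psd, omega, t, rng=np.random.default_rng(0))
trend = 10.0 + 0.02 * t   # hypothetical trend component
total = trend + fluct     # superpose trend and simulated fluctuation
```

With this scaling, the ensemble variance of each simulated component at time t equals the sum of its one-sided auto-spectrum over frequency times the frequency step, which is the property a regulation stage would presumably try to preserve more closely.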