Data-Driven Nonlinear Control via Reinforcement Learning: Application to a Cascade Tank Model

Published in 第69回システム制御情報学会研究発表講演会, 2025

K. Sato, A.D. Carnerero

In this paper, a data-driven controller based on Proximal Policy Optimization (PPO) is proposed for the control of nonlinear systems. PPO is one of the most popular reinforcement learning algorithms and offers several attractive properties, including the ability to handle continuous action spaces and an on-policy training scheme that contributes to stable and reliable policy updates. The target plant is a cascade tank model composed of two tanks, an upper and a lower tank connected in series, and simulations are conducted to verify whether the controller designed through reinforcement learning can drive the water level to the target value with high accuracy. First, in experiments using a linear model, a comparison with the analytically determined Linear Quadratic Regulator (LQR) is carried out, demonstrating that the control performance achieved through learning is close to the theoretical optimum. Next, applying PPO to the nonlinear cascade tank model confirms that a controller achieving high-accuracy tracking of the target water level can be learned even under variations in initial conditions and disturbances. The results of this paper suggest that data-driven control functions effectively even for nonlinear systems subject to noise.
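
As a rough illustration of the setup described above, the sketch below trains PPO on a hypothetical two-tank cascade environment in which the pump feeds the upper tank and the upper tank drains into the lower one. This is not the authors' code: the Torricelli-type square-root outflow model, the tank parameters, the quadratic tracking reward, and the use of gymnasium and stable-baselines3 are all illustrative assumptions introduced here, not taken from the paper.

```python
# Minimal sketch: PPO on a hypothetical cascade (two-tank) environment.
# All model parameters and reward choices below are assumptions for illustration.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO


class CascadeTankEnv(gym.Env):
    """Two tanks in series: the pump feeds the upper tank, which drains into the lower one."""

    def __init__(self, h_ref=0.5, dt=1.0, horizon=200):
        super().__init__()
        self.h_ref, self.dt, self.horizon = h_ref, dt, horizon
        # State: water levels (h1 upper, h2 lower); action: normalized pump input in [0, 1].
        self.observation_space = spaces.Box(0.0, 1.0, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32)
        # Illustrative physical constants (outlet/tank-area ratios, pump gain, gravity).
        self.a1, self.a2, self.k, self.g = 0.02, 0.02, 0.05, 9.81

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        # Randomized initial levels, mirroring the variation in initial conditions.
        self.h = self.np_random.uniform(0.1, 0.9, size=2).astype(np.float32)
        return self.h.copy(), {}

    def step(self, action):
        u = float(np.clip(action[0], 0.0, 1.0))
        h1, h2 = self.h
        # Torricelli-type outflow: level change is inflow minus a sqrt-law outflow.
        dh1 = -self.a1 * np.sqrt(2 * self.g * h1) + self.k * u
        dh2 = self.a1 * np.sqrt(2 * self.g * h1) - self.a2 * np.sqrt(2 * self.g * h2)
        self.h = np.clip(self.h + self.dt * np.array([dh1, dh2]), 0.0, 1.0).astype(np.float32)
        # Quadratic tracking cost on the lower-tank level, used as a negative reward.
        reward = float(-(self.h[1] - self.h_ref) ** 2)
        self.t += 1
        return self.h.copy(), reward, False, self.t >= self.horizon, {}


if __name__ == "__main__":
    env = CascadeTankEnv()
    model = PPO("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=100_000)
```

Under these assumptions, the learned policy plays the role of the data-driven controller: it maps the two measured water levels to a pump command, and the negative quadratic cost encourages convergence of the lower-tank level to the reference, analogous in spirit to the LQR cost used in the linear comparison.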