Reinforcement Learning Based Optimal Tracking Control Under Unmeasurable Disturbances with Application to HVAC Systems



Syed Ali Asad Rizvi, Amanda Pertzborn, Zongli Lin


This work presents the design of an optimal disturbance-rejecting tracking controller based on reinforcement learning. The problem involves finding the optimal control parameters that yield asymptotic output tracking in the presence of unmeasurable external disturbances when the system dynamics are unknown. Most existing reinforcement-learning-based disturbance rejection controllers require access to the disturbance channel for measurement and excitation, so that the disturbance does not bias the learning estimates. In this paper, we attempt to address this difficulty by combining a bias compensation mechanism with integral action. In particular, we present a new parametrization of the linear quadratic Q-function that incorporates the unknown disturbance as an additional bias term. Both policy-iteration-based and value-iteration-based Q-learning algorithms are presented to learn this Q-function. Disturbance rejection is achieved by augmenting the system dynamics with the integral of the tracking error. We show that, for unmeasurable disturbances that vary infrequently relative to the learning dynamics, the tracking error converges to zero asymptotically and the control parameters converge to the optimal ones without requiring knowledge of the system dynamics. An output feedback extension of the proposed approach is also presented, which removes the need to measure the internal state. The feasibility of the design is validated on a practical optimal control application: a heating, ventilating, and air conditioning (HVAC) zone controller.
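The integral-action idea mentioned in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: the paper learns the controller model-free through Q-learning, whereas this example assumes a known scalar plant and computes a linear quadratic gain by iterating the Riccati recursion. The plant parameters `a` and `b`, the weights, the disturbance values, and all function names are hypothetical and chosen only for illustration. What it shows is that once the state is augmented with the integral of the tracking error, a gain designed without any knowledge of a constant disturbance `d` still drives the tracking error to zero.

```python
import numpy as np

def integral_augmented_gain(a, b, q=1.0, r_weight=1.0, iters=5000):
    """LQ gain for the plant x+ = a*x + b*u augmented with an error integrator.

    Augmented state is [x, z] with z+ = z + (ref - x). The reference and the
    disturbance enter as constant inputs and do not affect the gain.
    Assumption: the model (a, b) is known here, unlike in the paper.
    """
    A = np.array([[a, 0.0], [-1.0, 1.0]])
    B = np.array([[b], [0.0]])
    Q = q * np.eye(2)
    R = np.array([[r_weight]])
    P = np.zeros((2, 2))
    K = np.zeros((1, 2))
    for _ in range(iters):
        # Riccati value iteration: K = (R + B'PB)^{-1} B'PA, P = Q + A'P(A - BK)
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def simulate(K, a, b, d, ref, steps=1000):
    """Run the closed loop with an unmeasured constant disturbance d.

    Returns the final tracking error ref - x.
    """
    x, z = 0.0, 0.0
    for _ in range(steps):
        u = -(K @ np.array([x, z])).item()  # state feedback on [x, z]
        e = ref - x                         # tracking error at this step
        x = a * x + b * u + d               # plant update; d is never measured
        z = z + e                           # integral of the tracking error
    return ref - x

if __name__ == "__main__":
    K = integral_augmented_gain(0.9, 0.1)
    for d in (0.0, 0.3, -0.5):
        print(f"d = {d:+.1f}, final tracking error = {simulate(K, 0.9, 0.1, d, ref=1.0):.2e}")
```

The same gain is reused for every disturbance value. At any equilibrium of the stable closed loop, the integrator state stops changing only when `ref - x` is zero, which is why the steady-state error vanishes regardless of `d`.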
IEEE Transactions on Neural Networks


Q-learning, optimal tracking, HVAC control


Rizvi, S., Pertzborn, A. and Lin, Z. (2021), Reinforcement Learning Based Optimal Tracking Control Under Unmeasurable Disturbances with Application to HVAC Systems, IEEE Transactions on Neural Networks, [online] (Accessed July 23, 2024)


Created June 18, 2021, Updated October 14, 2021