Instead of controlling (regulating) the system to the origin (zero state), we may want to control it to some nonzero reference state \mathbf x^\mathbf{ref}\neq 0. The immediate idea might be that the cost function then changes to \boxed{
J(\ldots) = \frac{1}{2} \sum_{k=0}^{N} \left((\bm x_{k}-{\color{blue}\mathbf x^\mathrm{ref}})^\top \mathbf Q \, (\bm x_{k}-{\color{blue}\mathbf x^\mathrm{ref}}) + \bm u_{k}^\top \mathbf R \bm u_{k}\right).}
\tag{1}
While this is perfectly reasonable on a finite horizon, a problem appears as soon as we extend the time horizon to infinity, that is, for the cost function
J(\ldots) = \frac{1}{2} \sum_{k=0}^{\infty} \left((\bm x_{k}-{\color{blue}\mathbf x^\mathrm{ref}})^\top \mathbf Q \, (\bm x_{k}-{\color{blue}\mathbf x^\mathrm{ref}}) + \bm u_{k}^\top \mathbf R \bm u_{k}\right).
Unless the nonzero reference state is an equilibrium, the control needed to keep the system at that state is nonzero, and the cost function is then unbounded.
Namely, for a general discrete-time linear system with a control input, described by the state equation
\bm x_{k+1} = \mathbf A \bm x_k + \mathbf B \bm u_k,
\tag{2} at an equilibrium (a steady state) \mathbf x^\mathrm{ss}, the following must be satisfied
\bm x^\mathrm{ss} = \mathbf A \bm x^\mathrm{ss} + \mathbf B \bm u^\mathrm{ss},
which we can rearrange as
(\mathbf A - \mathbf I) \bm x^\mathrm{ss} + \mathbf B \bm u^\mathrm{ss} = \mathbf 0.
\tag{3}
Fixing the value of the steady state vector \bm x^\mathrm{ss} to \mathbf x^\mathrm{ref}, we can obtain the corresponding steady state control \bm u^\mathrm{ss} by solving
\mathbf B \bm u^\mathrm{ss} = (\mathbf I - \mathbf A) \mathbf x^\mathrm{ref},
from which we can express the solutions \bm u^\mathrm{ss} as
\bm u^\mathrm{ss} = \underbrace{\mathbf B^\dagger (\mathbf I - \mathbf A)}_{\mathbf N_\mathrm{u}} \mathbf x^\mathrm{ref},
\tag{4} where \mathbf B^\dagger is the pseudo-inverse of \mathbf B. It is also useful to notice that the steady state control \bm u^\mathrm{ss} is a linear function of the reference steady state \mathbf x^\mathrm{ref}, here represented by the matrix coefficient \mathbf N_\mathrm{u}.
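To make Eq. 3 and Eq. 4 concrete, here is a minimal numerical sketch. The matrices and the reference below are hypothetical (not the example system used later), with the reference chosen so that it is actually sustainable as an equilibrium:

```julia
using LinearAlgebra

# Hypothetical discrete-time system (for illustration only)
A = [0.9 0.1; 0.0 0.8]
B = reshape([0.0, 1.0], 2, 1)

# Reference chosen so that (I - A)*x_ref lies in the range of B,
# i.e., the reference state can be held as an equilibrium
x_ref = [1.0, 1.0]

Nᵤ = pinv(B) * (I - A)   # Eq. 4: matrix mapping x_ref to u_ss
u_ss = Nᵤ * x_ref        # steady-state control holding the system at x_ref

# Check the equilibrium condition of Eq. 3: (A - I)x_ref + B u_ss = 0
residual = (A - I) * x_ref + B * u_ss
```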
It is worth emphasizing that unless the desired steady state \bm x^\mathrm{ref} is in the null space of (\mathbf A - \mathbf I), the corresponding steady-state control is nonzero. It then makes no sense to penalize the control itself; instead, we penalize the deviation of the control from the computed nonzero steady-state value. The new cost function is then \boxed{
J(\ldots) = \frac{1}{2} \sum_{k=0}^{\infty} \left((\bm x_{k}-{\color{blue}\mathbf x^\mathrm{ref}})^\top \mathbf Q \, (\bm x_{k}-{\color{blue}\mathbf x^\mathrm{ref}}) + (\bm u_{k}-{\color{red}\bm u^\mathrm{ss}})^\top \mathbf R \, (\bm u_{k}-{\color{red}\bm u^\mathrm{ss}})\right).}
\tag{5}
Now, subtracting the steady state equation
\bm x^\mathrm{ref} = \mathbf A \bm x^\mathrm{ref} + \mathbf B \bm u^\mathrm{ss}
from Eq. 2, we get
\bm x_{k+1} - \bm x^\mathrm{ref} = \mathbf A (\bm x_k - \bm x^\mathrm{ref}) + \mathbf B (\bm u_k - \bm u^\mathrm{ss}).
Introducing the notation
\delta \bm x_k = \bm x_k - \mathbf x^\mathrm{ref}, \quad \delta \bm u_k = \bm u_k - \bm u^\mathrm{ss},
we can rewrite the above equation as
\delta \bm x_{k+1} = \mathbf A \delta \bm x_k + \mathbf B \delta \bm u_k,
which is the same as the original state equation, but with the state and control replaced by their deviations from the reference values. Accordingly, the cost function can be rewritten as
J(\ldots) = \frac{1}{2} \sum_{k=0}^{\infty} \left(\delta \bm x_{k}^\top \mathbf Q \, \delta \bm x_{k} + \delta \bm u_{k}^\top \mathbf R \delta \bm u_{k}\right).
For this we solve the standard LQR problem. When writing down the optimal control law, we must remember to add the steady state control \bm u^\mathrm{ss} back to the optimal control deviation \delta \bm u_k: \boxed{
\bm u_k = -\mathbf K (\bm x_k - \mathbf x^\mathrm{ref}) + \bm u^\mathrm{ss},}
where \mathbf K is the optimal gain matrix for the original problem. In other words, the optimal control law for the tracking problem is the same as the optimal control law for the regulation problem, but with the state and control replaced by their deviations from the reference values, and with an additional feedforward term \bm u^\mathrm{ss} that accounts for the nonzero reference state.
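As a sketch, this tracking law can be simulated with plain linear algebra. Everything below is illustrative: the system matrices are hypothetical, and the gain K is a stabilizing gain picked by hand, standing in for the LQR gain:

```julia
using LinearAlgebra

A = [0.9 0.1; 0.0 0.8]
B = reshape([0.0, 1.0], 2, 1)
K = [0.2 0.5]  # hypothetical stabilizing gain (stand-in for the LQR gain)

x_ref = [1.0, 1.0]                # reference state, sustainable as an equilibrium here
u_ss = pinv(B) * (I - A) * x_ref  # feedforward steady-state control (Eq. 4)

# Closed-loop simulation of the tracking law u_k = -K (x_k - x_ref) + u_ss
function track(A, B, K, x_ref, u_ss; steps = 200)
    x = zeros(size(A, 1))
    for _ in 1:steps
        u = -K * (x - x_ref) + u_ss
        x = A * x + B * u
    end
    return x
end

x_final = track(A, B, K, x_ref, u_ss)  # converges to x_ref
```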
Recalling from Eq. 4 that \bm u^\mathrm{ss} is a linear function of \mathbf x^\mathrm{ref}, we can also visualize the structure of the controller as in Fig. 1.
Figure 1: Tracking a reference state over an infinite time horizon
Reference output tracking using LQR
If, instead of the reference state (vector) variable, the output variable is to be tracked, we need to make a minor modification to the above scheme. First, the model of the plant must also contain the output equation. For convenience, we assume no direct feedthrough term:
\bm y_k = \mathbf C \bm x_k,
but it is straightforward to extend the scheme to the case with direct feedthrough. The cost function must be modified to penalize the deviation of the output from the reference output \mathbf y^\mathrm{ref} instead of the deviation of the state from the reference state \mathbf x^\mathrm{ref}:
J(\ldots) = \frac{1}{2} \sum_{k=0}^{\infty} \left((\bm y_{k}-\mathbf y^\mathrm{ref})^\top \mathbf Q_\mathrm{y} \, (\bm y_{k}-\mathbf y^\mathrm{ref}) + (\bm u_{k}-\bm u^\mathrm{ss})^\top \mathbf R \, (\bm u_{k}-\bm u^\mathrm{ss})\right).
The reference output \mathbf y^\mathrm{ref} and the reference state \bm x^\mathrm{ref} are related by the output equation at steady state
\mathbf y^\mathrm{ref} = \mathbf C \bm x^\mathrm{ref}.
\tag{6}
Combined with the previously provided equation for the steady state control \bm u^\mathrm{ss}, we can write the two conditions compactly as the following system of linear equations
\begin{bmatrix}\mathbf A - \mathbf I & \mathbf B \\
\mathbf C & \mathbf 0 \end{bmatrix} \begin{bmatrix}\bm x^\mathrm{ref} \\ \bm u^\mathrm{ss} \end{bmatrix} = \begin{bmatrix}\mathbf 0 \\ \mathbf y^\mathrm{ref} \end{bmatrix}.
Using the output equation, we can rewrite the cost function as
J(\ldots)= \frac{1}{2} \sum_{k=0}^{\infty} \left((\bm x_{k}-\bm x^\mathrm{ref})^\top \underbrace{\mathbf C^\top\mathbf Q_\mathrm{y} \mathbf C}_{\mathbf Q}\, (\bm x_{k}-\bm x^\mathrm{ref}) + (\bm u_{k}-\bm u^\mathrm{ss})^\top \mathbf R (\bm u_{k}-\bm u^\mathrm{ss})\right).
From the steady-state output equation Eq. 6, we can solve for the reference state \bm x^\mathrm{ref} as
\bm x^\mathrm{ref} = \mathbf C^\dagger \mathbf y^\mathrm{ref},
where \mathbf C^\dagger is the pseudo-inverse of \mathbf C; we denote \mathbf N_\mathrm{x} \coloneqq \mathbf C^\dagger. Combining this with the fact that the steady-state control also depends linearly on the reference,
\bm u^\mathrm{ss} = \mathbf N_\mathrm{u} \mathbf y^\mathrm{ref},
we can write compactly
\begin{bmatrix}\mathbf A - \mathbf I & \mathbf B \\
\mathbf C & \mathbf 0 \end{bmatrix}\begin{bmatrix}\mathbf N_\mathrm{x} \\ \mathbf N_\mathrm{u}\end{bmatrix} = \begin{bmatrix}\mathbf 0 \\ \mathbf I \end{bmatrix},
from which we can solve for the matrices \mathbf N_\mathrm{x} and \mathbf N_\mathrm{u}, and then using these matrices we can find the reference state \bm x^\mathrm{ref} and the steady-state control \bm u^\mathrm{ss}, both corresponding to the provided reference value of the output \mathbf y^\mathrm{ref}.
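A minimal sketch of this joint solve, again with hypothetical matrices; in this single-input single-output case the block matrix is square and, here, invertible:

```julia
using LinearAlgebra

# Hypothetical discretized model: n = 2 states, m = 1 input, p = 1 output
A = [0.9 0.1; 0.0 0.8]
B = reshape([0.0, 1.0], 2, 1)
C = [1.0 0.0]
n, m, p = 2, 1, 1

# Solve [A - I  B; C  0] [Nx; Nu] = [0; I] for Nx and Nu jointly
M = [A - I B; C zeros(p, m)]
N = M \ [zeros(n, p); I(p)]
Nx = N[1:n, :]       # maps y_ref to the reference state x_ref
Nu = N[n+1:end, :]   # maps y_ref to the steady-state control u_ss

# Both steady-state conditions hold by construction
res_state  = (A - I) * Nx + B * Nu   # ≈ 0
res_output = C * Nx                  # ≈ I
```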
We can now visualize the overall control scheme using the block diagram in Fig. 2.
Figure 2: Tracking a reference output over an infinite time horizon
The control law is given by
\bm u_k = -\mathbf K (\bm x_k - \mathbf N_\mathrm{x}\mathbf y^\mathrm{ref}) + \mathbf N_\mathrm{u}\mathbf y^\mathrm{ref},
which can be expanded to
\bm u_k = -\mathbf K \bm x_k + \underbrace{(\mathbf K \mathbf N_\mathrm{x} + \mathbf N_\mathrm{u})}_{\mathbf N}\mathbf y^\mathrm{ref},
and this, in turn, can be represented using a block diagram in Fig. 3.
Figure 3: Alternative configuration for reference tracking
Before concluding, we should note that upon transforming the reference output tracking problem into the reference state tracking problem, which in turn is transformed into the regulation problem, all on the infinite time horizon, we must not forget to check the detectability (or, even better, observability) of the system modelled by (\mathbf A, \sqrt{\mathbf C^\top\mathbf Q_\mathrm{y}\mathbf C}). We illustrate this using the following example.
Example 1 (Reference output tracking on an infinite time horizon) We consider two masses interconnected by a spring and a damper; the first mass is also connected to a fixed wall through another spring and damper. The system is modelled by a fourth-order state-space model with one control input (the force acting on the first mass) and one output (the position of the second mass).
```julia
using ControlSystems
using LinearAlgebra
using Plots

# System parameters
m₁ = 1      # Mass 1
m₂ = 0.1    # Mass 2
b₁₂ = 0.03  # Damping coefficient between the two masses
k₁₂ = 0.09  # Spring stiffness between the two masses
b₀₁ = 0.3   # Damping coefficient between the first mass and the wall
k₀₁ = 0.5   # Spring stiffness between the first mass and the wall

# State vector: x = [y'; y; d'; d],
# where y is the position of the first mass, and d is the position of the second mass.
A = [-(b₁₂+b₀₁)/m₁ -(k₁₂+k₀₁)/m₁ b₁₂/m₁ k₁₂/m₁;
     1 0 0 0;
     b₁₂/m₂ k₁₂/m₂ -b₁₂/m₂ -k₁₂/m₂;
     0 0 1 0]
B = [1/m₁; 0; 0; 0]  # The force (the control input) acts on the first mass.
C = [0 0 0 1]        # We want to control the position of the second mass, d.
D = [0]
G = ss(A, B, C, D)

h = 0.1        # Sampling time
Gd = c2d(G, h)
A, B, C, D = ssdata(Gd)  # Overwriting the original matrices by those corresponding to the discretized system

N = [A-I B; C 0] \ [0; 0; 0; 0; 1]
Nₓ = N[1:4]
Nᵤ = N[5]

# Setting up the cost function
Q_y = 10.0
Q = C'*Q_y*C
rank(obsv(A, Q))  # Check the observability of the system with the new Q matrix. Should be 4.
R = 1.0

# Solving the LQR problem
K = lqr(Discrete, A, B, Q, R)

# Introduce the reference into the state feedback scheme
y_ref(t) = 1.0(t >= 1.5)      # Reference output (step function starting at t = 1.5)
N = K*Nₓ .+ Nᵤ                # Dot needed here because K*Nₓ is a 1x1 matrix, while Nᵤ is a scalar.
u(x, t) = -K*x .+ N*y_ref(t)  # Control law (u is a function of x and t)

# Simulate the closed-loop system
t = 0:h:15                  # Time vector
x0 = [0, 0.0, 0, 0.0]       # Initial condition
res = lsim(G, u, t, x0=x0)  # Simulation results
```
We explicitly show here that the system is observable with the new Q matrix:
```julia
rank(obsv(A, Q))  # The square root is not even needed
```

```
4
```
Finally, we plot the simulation results in Fig. 4.
Figure 4: Reference output tracking using LQR. Unit step in the reference value of the output occurs at time 1.5.
Reference output tracking using LQI
The acronym LQI stands for Linear Quadratic Integral control. The idea is to introduce an integral action into the LQR controller to eliminate steady-state errors in tracking problems. This is achieved by augmenting the state vector with an integral of the tracking error. Namely, we define a new component of the state vector that is updated according to
\bm x_{k+1}^\mathrm{I} = \bm x_{k}^\mathrm{I} + \underbrace{(\mathbf y^\mathrm{ref}-\mathbf C \bm x_k)}_{\bm e_k},
which can be combined with the original state equation to form an augmented state-space model
\begin{bmatrix}\bm x_{k+1} \\ \bm x_{k+1}^\mathrm{I}\end{bmatrix} = \underbrace{\begin{bmatrix}\mathbf A & \mathbf 0 \\ -\mathbf C & \mathbf I \end{bmatrix}}_{\mathbf A^\mathrm{aug}} \begin{bmatrix}\bm x_k \\ \bm x_k^\mathrm{I}\end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf B \\ \mathbf 0\end{bmatrix}}_{\mathbf B_\mathrm{u}^\mathrm{aug}} \bm u_k + \underbrace{\begin{bmatrix}\mathbf 0 \\ \mathbf I\end{bmatrix}}_{\mathbf B_\mathrm{y}^\mathrm{aug}} \mathbf y^\mathrm{ref}.
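The augmented matrices are straightforward to assemble from the block structure above; a small dimension-checked sketch with hypothetical A, B, C (n states, m inputs, p outputs):

```julia
using LinearAlgebra

# Hypothetical discretized model: n = 2 states, m = 1 input, p = 1 output
A = [0.9 0.1; 0.0 0.8]
B = reshape([0.0, 1.0], 2, 1)
C = [1.0 0.0]
n, m, p = 2, 1, 1

# Augmented dynamics: the extra state accumulates the tracking error e_k = y_ref - C x_k
A_aug  = [A zeros(n, p); -C I(p)]
Bu_aug = [B; zeros(p, m)]     # input matrix for the control u_k
By_aug = [zeros(n, p); I(p)]  # input matrix for the reference y_ref
```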
Figure 5: Tracking a reference output with integral action
The control law is given by \boxed{
\bm u_k = - \begin{bmatrix}\mathbf K & \mathbf K^\mathrm{I}\end{bmatrix} \begin{bmatrix}\bm x_k \\ \bm x_k^\mathrm{I}\end{bmatrix} + \mathbf K \mathbf N_\mathrm{x} \mathbf y^\mathrm{ref}.}
It is worth noting that the (discrete-time approximation of the) integrator serves much the same role as the feedforward term \bm u^\mathrm{ss} does in the scheme of Fig. 2 – it provides a constant offset to the control even when the regulation error is zero.
Example 2 (Reference output tracking using LQI) We consider the same system as in the previous example, but now we introduce an integral action to the controller.
```julia
using ControlSystems
using LinearAlgebra
using Plots

# System parameters
m₁ = 1      # Mass 1
m₂ = 0.1    # Mass 2
b₁₂ = 0.03  # Damping coefficient between the two masses
k₁₂ = 0.09  # Spring stiffness between the two masses
b₀₁ = 0.3   # Damping coefficient between the first mass and the wall
k₀₁ = 0.5   # Spring stiffness between the first mass and the wall

# State vector: x = [y'; y; d'; d],
# where y is the position of the first mass, and d is the position of the second mass.
A = [-(b₁₂+b₀₁)/m₁ -(k₁₂+k₀₁)/m₁ b₁₂/m₁ k₁₂/m₁;
     1 0 0 0;
     b₁₂/m₂ k₁₂/m₂ -b₁₂/m₂ -k₁₂/m₂;
     0 0 1 0]
B = [1/m₁; 0; 0; 0]  # The force (the control input) acts on the first mass.
C = [0 0 0 1]        # We want to control the position of the second mass, d.
D = [0]
G = ss(A, B, C, D)

h = 0.1        # Sampling time
Gd = c2d(G, h)
A, B, C, D = ssdata(Gd)  # Overwriting the original matrices by those corresponding to the discretized system

Nₓ = pinv(C)

# Building the augmented system for integral control
A_aug = [A zeros(4,1); -C 1]
B_u_aug = [B; zeros(1,1)]
B_y_aug = [zeros(4,1); 1]
C_aug = [C zeros(1,1)]
D_aug = [0 0]
G_aug = ss(A_aug, [B_u_aug B_y_aug], C_aug, D_aug, h)

# Weights for the LQR controller
Q_aug = 1.0*Matrix(I, 5, 5)
Q_aug[1,1] = 1
Q_aug[2,2] = 1
Q_aug[3,3] = 1
Q_aug[4,4] = 1
Q_aug[5,5] = 0.02  # Penalize the integral of the error as well
R = 1.0

# Design the LQR controller for the augmented system
K_aug = lqr(Discrete, A_aug, B_u_aug, Q_aug, R)
K = K_aug[:, 1:4]
Kᵢ = K_aug[:, 5:end]

y_ref(t) = t >= 1.5 ? 1.0 : 0.0  # Reference output (step function starting at t = 1.5)
u(x_aug, t) = (-K*x_aug[1:4] - Kᵢ*x_aug[5] + K*Nₓ*y_ref(t))[1]  # Without the [1] at the end, u would be a 1x1 matrix.
uy(x_aug, t) = [u(x_aug, t), y_ref(t)]  # First component: the control input; second: the output reference. Needed for simulating the augmented system.

# Simulate the closed-loop system
t = 0:h:15
x0 = [0, 0.0, 0, 0.0]  # Initial values of the original (physical) state variables.
xᵢ0 = [0.0]            # Initial value of the integral-of-the-tracking-error state variable.
y, t, x, uout = lsim(G_aug, uy, t, x0=vcat(x0, xᵢ0))
```