Finite-horizon LQR-optimal control as a QP

Here we specialize the general procedure from the previous section to the case of a linear system and a quadratic cost. We start by considering a simple regulation problem, wherein the goal is to bring the system either exactly or approximately to the zero final state, that is, \mathbf x^\text{ref}=\mathbf 0 and we require \bm x_N=\mathbf x^\text{ref} or \bm x_N\approx\mathbf x^\text{ref}, respectively. \begin{aligned} \operatorname*{minimize}_{\mathbf u_0,\ldots, \mathbf u_{N-1}, \mathbf x_{0},\ldots, \mathbf x_N} &\quad \frac{1}{2} \bm x_N^\top \mathbf S \bm x_N + \frac{1}{2} \sum_{k=0}^{N-1} \left(\bm x_k^\top \mathbf Q \bm x_k + \bm u_k^\top \mathbf R \bm u_k \right)\\ \text{subject to} &\quad \bm x_{k+1} = \mathbf A\bm x_k + \mathbf B\bm u_k,\quad k = 0, \ldots, N-1, \\ &\quad \bm x_0 = \mathbf x_0,\\ &\quad \bm x_N = \mathbf 0\; (\text{or}\, \bm x_N \approx \mathbf 0). \end{aligned}

Referring to the two options for the last constraint:

Tip

It is a standard dilemma in optimization, not only in optimal control: if we want some requirement satisfied, we can either enforce it strictly through a constraint, or seemingly relax it and attach a cost to be paid for not satisfying it.

Simultaneous (sparse) formulation

Below we rewrite the latter problem, that is, \bm x_N\approx\mathbf 0, in the “unrolled” form, where we stack the state and control variables into “long” vectors \bar{\bm x} and \bar{\bm u}. Doing the same for the former is straightforward. \begin{aligned} \operatorname*{minimize}_{\bar{\bm u},\bar{\bm x}} & \frac{1}{2}\left(\begin{bmatrix} \bm x_1^\top & \bm x_2^\top & \ldots & \bm x_N^\top \end{bmatrix} \underbrace{\begin{bmatrix}\mathbf Q & & & \\ & \mathbf Q & &\\ & &\ddots & \\ & & & \mathbf S \end{bmatrix}}_{\overline{\mathbf Q}} \underbrace{\begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \vdots \\ \bm x_N \end{bmatrix}}_{\bar{\bm x}}\right.\\ &\qquad +\left. \begin{bmatrix} \bm u_0^\top & \bm u_1^\top & \ldots & \bm u_{N-1}^\top \end{bmatrix} \underbrace{\begin{bmatrix}\mathbf R & & & \\ & \mathbf R & &\\ & &\ddots & \\ & & & \mathbf R \end{bmatrix}}_{\overline{\mathbf R}} \underbrace{\begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \vdots \\ \bm u_{N-1} \end{bmatrix}}_{\bar{\bm u}}\right) + \underbrace{\frac{1}{2}\mathbf x_0^\top \mathbf Q \mathbf x_0}_{\mathrm{constant}} \end{aligned} subject to \begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \bm x_3\\ \vdots \\ \bm x_N \end{bmatrix} = \underbrace{\begin{bmatrix}\mathbf 0 & & & &\\\mathbf A & \mathbf 0 & & &\\ &\mathbf A &\mathbf 0 & & \\ & & &\ddots & \\& & &\mathbf A & \mathbf 0 \end{bmatrix}}_{\overline{\mathbf A}} \begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \bm x_3\\ \vdots \\ \bm x_N \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf B & & & & \\ & \mathbf B & & & \\& &\mathbf B & \\ & & &\ddots \\ & & & & \mathbf B \end{bmatrix}}_{\overline{\mathbf B}}\begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \bm u_2\\\vdots \\ \bm u_{N-1} \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf A\\\mathbf 0\\\mathbf 0\\\vdots\\\mathbf 0\end{bmatrix}}_{\overline{\mathbf A}_0}\mathbf x_0.

Note that the last term in the cost function can be discarded because it is constant.

The terms with the \bar{\bm x} vector can be combined and we get \begin{bmatrix} \mathbf 0 \\ \mathbf 0 \\ \mathbf 0\\ \vdots \\ \mathbf 0 \end{bmatrix} = \underbrace{\begin{bmatrix}-\mathbf I & & & &\\\mathbf A & -\mathbf I & & &\\ &\mathbf A &-\mathbf I & & \\ & & &\ddots & \\& & &\mathbf A & -\mathbf I \end{bmatrix}}_{\overline{\mathbf A} - \mathbf I} \begin{bmatrix} \bm x_1 \\ \bm x_2 \\ \bm x_3\\ \vdots \\ \bm x_N \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf B & & & & \\ & \mathbf B & & & \\& &\mathbf B & \\ & & &\ddots \\ & & & & \mathbf B \end{bmatrix}}_{\overline{\mathbf B}}\begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \bm u_2\\\vdots \\ \bm u_{N-1} \end{bmatrix} + \underbrace{\begin{bmatrix}\mathbf A\\\mathbf 0\\\mathbf 0\\\vdots\\\mathbf 0\end{bmatrix}}_{\overline{\mathbf A}_0}\mathbf x_0. \tag{1}

Upon stacking the two “long” vectors into \bar{\bm z} we reformulate the optimization problem as \operatorname*{minimize}_{\bar{\bm z}\in\mathbb{R}^{N(n+m)}}\quad \frac{1}{2}\underbrace{\begin{bmatrix}\bar{\bm x}^\top &\bar{\bm u}^\top\end{bmatrix}}_{\bar{\bm z}^\top} \underbrace{\begin{bmatrix}\overline{\mathbf Q} & \\ & \overline{\mathbf R} \end{bmatrix}}_{\widetilde{\mathbf Q}}\underbrace{\begin{bmatrix}\bar{\bm x}\\\bar{\bm u}\end{bmatrix}}_{\bar{\bm z}} subject to \mathbf 0 = \underbrace{\begin{bmatrix}(\overline{\mathbf A}-\mathbf I) & \overline{\mathbf B}\end{bmatrix}}_{\widetilde{\mathbf A}}\underbrace{\begin{bmatrix}\bar{\bm x}\\\bar{\bm u}\end{bmatrix}}_{\bar{\bm z}} + \underbrace{\overline{\mathbf A}_0 \mathbf x_0}_{\tilde{\mathbf b}}.

To summarize, we have reformulated the optimal control problem as a linearly constrained quadratic program \boxed{ \begin{aligned} \underset{\bar{\bm z}\in\mathbb{R}^{N(n+m)}}{\text{minimize}} &\quad \frac{1}{2}\bar{\bm z}^\top \widetilde{\mathbf Q} \bar{\bm z}\\ \text{subject to} &\quad \widetilde{\mathbf A} \bar{\bm z} + \tilde{\bm b} = \mathbf 0. \end{aligned}}

Code
using LinearAlgebra, SparseArrays, BlockArrays, QDLDL

function direct_dlqr_simultaneous(A,B,x₀,Q,R,S,N)
    n = size(A,1)                                                               # Number of state variables.
    m = size(B,2)                                                               # Number of (control) input variables.
    Qbar = BlockArray(spzeros(N*n,N*n),repeat([n],N),repeat([n],N))
    for i=1:(N-1)
        Qbar[Block(i,i)] = Q
    end
    Qbar[Block(N,N)] = S
    Rbar = BlockArray(spzeros(N*m,N*m),repeat([m],N),repeat([m],N))
    for i=1:N
        Rbar[Block(i,i)] = R
    end
    Qtilde = blockdiag(sparse(Qbar),sparse(Rbar))                               # The matrix defining the quadratic cost.
    Bbar = BlockArray(spzeros(N*n,N*m),repeat([n],N),repeat([m],N))
    for i=1:N
        Bbar[Block(i,i)] = B
    end
    Abar = BlockArray(sparse(-1.0*I,n*N,n*N),repeat([n],N),repeat([n],N))       # The matrix (Ābar - I): -I on the diagonal, A below it.
    for i=2:N
        Abar[Block(i,(i-1))] = A
    end
    Atilde = sparse([Abar Bbar])                                                # The matrix defining the linear (affine) equation.
    A0bar = spzeros(n*N,n)
    A0bar[1:n,1:n] = A
    btilde = A0bar*x₀                                                           # The constant offset for the linear (affine) equation.
    K = [Qtilde Atilde'; Atilde spzeros(size(Atilde,1),size(Atilde,1))]         # Sparse KKT matrix.
    F = qdldl(K)                                                                # KKT matrix LDL factorization.
    k = [zeros(size(Atilde,2)); -btilde]                                        # Right-hand side of the KKT system.
    xtildeλ = solve(F,k)                                                        # Solving the KKT system using the factorization.
    xopt = reshape(xtildeλ[1:(n*N)],(n,:))
    uopt = reshape(xtildeλ[(n*N+1):((n+m)*N)],(m,:))
    return xopt,uopt
end

n = 2               # Number of state variables.
m = 1               # Number of (control) input variables. 
A = rand(n,n)       # State matrix.
B = rand(n,m)       # Input coupling matrix.
x₀ = [1.0, 3.0]     # Initial state.

N = 10              # Time horizon.

s = [1.0, 2.0]      
q = [1.0, 2.0]
r = [1.0]

S = diagm(0=>s)     # Matrix defining the terminal state cost.
Q = diagm(0=>q)     # Matrix defining the running state cost.
R = diagm(0=>r)     # Matrix defining the cost of control.

xopts,uopts = direct_dlqr_simultaneous(A,B,x₀,Q,R,S,N)

using Plots
p1 = plot(0:(N-1),uopts',marker=:diamond,label="u",linetype=:steppost)
xlabel!("k")
ylabel!("u")

p2 = plot(0:N,hcat(x₀,xopts)',marker=:diamond,label=["x1" "x2"],linetype=:steppost)
xlabel!("k")
ylabel!("x")

plot(p1,p2,layout=(2,1))

This constrained optimization problem can still be solved without invoking a numerical solver for quadratic programs (QP). We introduce a vector \boldsymbol\lambda of Lagrange multipliers to form the Lagrangian function \mathcal{L}(\bar{\bm z}, \boldsymbol \lambda) = \frac{1}{2}\bar{\bm z}^\top \widetilde{\mathbf Q} \bar{\bm z} + \boldsymbol\lambda^\top(\widetilde{\mathbf A} \bar{\bm z} + \tilde{\mathbf b}), whose gradients with respect to \bar{\bm z} and \boldsymbol\lambda are \begin{aligned} \nabla_{\bar{\bm{z}}} \mathcal{L}(\bar{\bm z}, \boldsymbol\lambda) &= \widetilde{\mathbf Q}\bar{\bm z} + \widetilde{\mathbf A}^\top\boldsymbol\lambda,\\ \nabla_{\boldsymbol{\lambda}} \mathcal{L}(\bar{\bm z}, \boldsymbol\lambda) &=\widetilde{\mathbf A} \bar{\bm z} + \tilde{\mathbf b}. \end{aligned}

Requiring that the overall gradient vanishes leads to the following KKT set of linear equations \begin{bmatrix} \widetilde{\mathbf Q} & \widetilde{\mathbf A}^\top\\ \widetilde{\mathbf A} & \mathbf 0 \end{bmatrix} \begin{bmatrix} \bar{\bm z}\\\boldsymbol\lambda \end{bmatrix} = \begin{bmatrix} \mathbf 0\\ -\tilde{\mathbf b} \end{bmatrix}.

Solving this can be accomplished by a general solver for linear systems, or by a solver tailored to symmetric indefinite systems based on an LDL factorization (for example, ldl in Matlab, or the QDLDL factorization used in the code above).

Adding constraints on controls and states

When solving a real optimal control problem, we may want to impose inequality constraints on \bm u_k due to saturation of actuators. We may want to add constraints on \bm x_k as well, reflecting some performance specifications. In both cases, the KKT system above would have to be augmented with the inequalities, and we resort to an existing fine-tuned numerical solver for quadratic programming (QP) instead.

Sequential (dense) formulation

We can express \bar{\bm x} as a function of \bar{\bm u} and \mathbf x_0. This can be done in a straightforward way using (Equation 1), namely, \bar{\bm x} = (\mathbf I-\overline{\mathbf A})^{-1}\overline{\mathbf B} \bar{\bm u} + (\mathbf I-\overline{\mathbf A})^{-1} \overline{\mathbf A}_0 \mathbf x_0; the inverse always exists because \mathbf I-\overline{\mathbf A} is block lower triangular with identity blocks on its diagonal.

However, instead of solving the sets of equations, we can do this substitution in a more insightful way. Write down the state equation for several discrete times \begin{aligned} \bm x_1 &= \mathbf A\mathbf x_0 + \mathbf B\bm u_0\\ \bm x_2 &= \mathbf A\bm x_1 + \mathbf B\bm u_1\\ &= \mathbf A(\mathbf A\mathbf x_0 + \mathbf B\bm u_0)+ \mathbf B\bm u_1\\ &= \mathbf A^2\mathbf x_0 + \mathbf A\mathbf B\bm u_0 + \mathbf B\bm u_1\\ &\vdots\\ \bm x_k &= \mathbf A^k\mathbf x_0 + \mathbf A^{k-1}\mathbf B\bm u_0 +\mathbf A^{k-2}\mathbf B\bm u_1 +\ldots + \mathbf B\bm u_{k-1}. \end{aligned}

Rewriting into matrix-vector form (and extending the time k up to the final time N) \begin{bmatrix} \bm x_1\\\bm x_2\\\vdots\\\bm x_N \end{bmatrix} = \underbrace{ \begin{bmatrix} \mathbf B & & & \\ \mathbf A\mathbf B & \mathbf B & & \\ \vdots & & \ddots &\\ \mathbf A^{N-1}\mathbf B & \mathbf A^{N-2}\mathbf B & & \mathbf B \end{bmatrix}}_{\widehat{\mathbf C}} \begin{bmatrix} \bm u_0\\\bm u_1\\\vdots\\\bm u_{N-1} \end{bmatrix} + \underbrace{ \begin{bmatrix} \mathbf A\\\mathbf A^2\\\vdots\\\mathbf A^N \end{bmatrix}}_{\widehat{\mathbf A}}\mathbf x_0

For convenience, we record this compact relation between \bar{\bm x} on one side and \bar{\bm u} and \mathbf x_0 on the other: \bar{\bm x} = \widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0. \tag{2}

We can now substitute this into the original cost, which then becomes independent of \bar{\bm x}, which we reflect by using a new name \tilde J \begin{aligned} \tilde J(\bar{\bm u};\mathbf x_0) &= \frac{1}{2}(\widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0)^\top\overline{\mathbf Q} (\widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0) + \frac{1}{2}\bar{\bm u}^\top\overline{\mathbf R} \bar{\bm u} + \frac{1}{2}\mathbf x_0^\top\mathbf Q\mathbf x_0\\ &= \frac{1}{2}\bar{\bm u}^\top\widehat{\mathbf C}^\top \overline{\mathbf Q} \widehat{\mathbf C} \bar{\bm u} + \mathbf x_0^\top\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf C} \bar{\bm u} + \frac{1}{2} \mathbf x_0^\top\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf A} \mathbf x_0 + \frac{1}{2}\bar{\bm u}^\top\overline{\mathbf R} \bar{\bm u} + \frac{1}{2}\mathbf x_0^\top\mathbf Q\mathbf x_0\\ &= \frac{1}{2}\bar{\bm u}^\top(\widehat{\mathbf C}^\top \overline{\mathbf Q} \widehat{\mathbf C} + \overline{\mathbf R})\bar{\bm u} + \mathbf x_0^\top\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf C} \bar{\bm u} + \frac{1}{2} \mathbf x_0^\top(\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf A} + \mathbf Q)\mathbf x_0. \end{aligned}

The last term (the one independent of \bar{\bm u}) has no impact on the optimal \bar{\bm u} and can therefore be discarded. Such a minor modification hardly justifies a new name for the cost function, so we keep writing it as \tilde J(\bar{\bm u};\mathbf x_0) = \frac{1}{2}\bar{\bm u}^\top\underbrace{(\widehat{\mathbf C}^\top \overline{\mathbf Q} \widehat{\mathbf C} + \overline{\mathbf R})}_{\mathbf H}\bar{\bm u} + \mathbf x_0^\top\underbrace{\widehat{\mathbf A}^\top \overline{\mathbf Q} \widehat{\mathbf C}}_{\mathbf F^\top} \bar{\bm u}.

This cost is a function of \bar{\bm u} only; the initial state \mathbf x_0 is regarded as a fixed parameter. Its gradient is \nabla \tilde J = \mathbf H\bar{\bm u}+\mathbf F\mathbf x_0.

Setting it to zero leads to the following linear system of equations \mathbf H\bar{\bm u}=-\mathbf F\mathbf x_0 that needs to be solved for \bar{\bm u}. Formally, we write the solution as \bar{\bm u} = -\mathbf H^{-1} \mathbf F \mathbf x_0.

Note

Solving linear equations by direct computation of the matrix inverse is not a recommended practice. Use dedicated solvers for linear equations instead. For example, in Matlab use the backslash operator, which invokes the most suitable solver.
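To make the sequential formulation concrete, below is a minimal Julia sketch (not part of the original development) that reuses A, B, Q, R, S, N, n, m, and x₀ from the simultaneous example above. It builds \widehat{\mathbf C} and \widehat{\mathbf A} directly from their definitions and solves \mathbf H\bar{\bm u}=-\mathbf F\mathbf x_0 with the backslash operator rather than computing \mathbf H^{-1}.

Code
using LinearAlgebra

Ĉ = zeros(N*n, N*m)                      # Block lower-triangular matrix Ĉ.
Â = zeros(N*n, n)                        # Stacked powers of A.
for i = 1:N
    Â[(i-1)*n .+ (1:n), :] = A^i
    for j = 1:i
        Ĉ[(i-1)*n .+ (1:n), (j-1)*m .+ (1:m)] = A^(i-j)*B
    end
end

Q̄ = kron(Matrix(1.0I, N, N), Q)          # Block diagonal with Q, ..., Q, then S.
Q̄[end-n+1:end, end-n+1:end] = S
R̄ = kron(Matrix(1.0I, N, N), R)          # Block diagonal with R, ..., R.

H = Ĉ'*Q̄*Ĉ + R̄                           # Hessian of the cost in ū.
F = Ĉ'*Q̄*Â                               # Linear-term matrix; note F = (Â'Q̄Ĉ)'.

ūopt = -(H \ (F*x₀))                     # Backslash instead of matrix inversion.
x̄opt = Ĉ*ūopt + Â*x₀                     # States recovered via Equation 2.

The matrices here are smaller than in the simultaneous formulation, since only the controls remain as decision variables, but they are dense and lose the sparsity that the KKT approach exploited.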

Adding the constraints on controls

Adding constraints on \bar{\bm u} is straightforward: instead of an unconstrained problem equivalent to a linear system, we obtain a quadratic program with inequality constraints \begin{aligned} \operatorname*{minimize}_{\bar{\bm u}} & \quad \frac{1}{2}\bar{\bm u}^\top \mathbf H \bar{\bm u} + \mathbf x_0^\top\mathbf F^\top \bar{\bm u}\\ \text{subject to} &\quad \bar{\bm u} \leq \bar{\mathbf u}_\mathrm{max},\\ &\quad \bar{\bm u} \geq \bar{\mathbf u}_\mathrm{min}, \end{aligned} which we can rewrite more explicitly (in matrix-vector form) as \begin{aligned} \operatorname*{minimize}_{\bar{\bm u}} & \quad \frac{1}{2}\bar{\bm u}^\top \mathbf H \bar{\bm u} + \mathbf x_0^\top\mathbf F^\top \bar{\bm u}\\ \text{subject to} & \begin{bmatrix} \mathbf{I} & & & \\ & \mathbf{I} & & \\ & & \ddots & \\ & & & \mathbf{I} \\ -\mathbf{I} & & & \\ & -\mathbf{I} & & \\ & & \ddots & \\ & & & -\mathbf{I} \end{bmatrix} \begin{bmatrix} \bm u_0 \\ \bm u_1 \\ \vdots \\ \bm u_{N-1} \end{bmatrix} \leq \begin{bmatrix} \mathbf u_\mathrm{max} \\ \mathbf u_\mathrm{max} \\ \vdots \\ \mathbf u_\mathrm{max}\\ -\mathbf u_\mathrm{min} \\ -\mathbf u_\mathrm{min} \\ \vdots \\ -\mathbf u_\mathrm{min} \end{bmatrix}. \end{aligned}
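To illustrate, here is a hypothetical sketch of this input-constrained QP using JuMP with the OSQP solver (an arbitrary choice of packages; any QP solver would do). It assumes \mathbf H and \mathbf F are available from the sequential-formulation sketch above; the scalar bounds u_min and u_max are made-up example values.

Code
using JuMP, OSQP

u_min, u_max = -1.0, 1.0                 # Made-up box bounds on each input.
model = Model(OSQP.Optimizer)
set_silent(model)
@variable(model, u_min <= ū[1:N*m] <= u_max)
@objective(model, Min, 0.5*ū'*H*ū + (F*x₀)'*ū)
optimize!(model)
ū_box = value.(ū)                        # Optimal control sequence under input bounds.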

Adding the constraints on states

We might feel a little bit uneasy about losing immediate access to \bar{\bm x}. But the game is not lost: we just need to express \bar{\bm x} as a function of \bar{\bm u} and \mathbf x_0 and impose the constraints on the result. Such an expression is already available, see (Equation 2). Therefore, lower and upper bounds on the state vector, \mathbf x_\mathrm{min} \leq \bm x_k \leq \mathbf x_\mathrm{max}, can be formulated as \bar{\mathbf x}_\mathrm{min} \leq \widehat{\mathbf C} \bar{\bm u} + \widehat{\mathbf A} \mathbf x_0 \leq \bar{\mathbf x}_\mathrm{max}, where the bars in \bar{\mathbf x}_\mathrm{min} and \bar{\mathbf x}_\mathrm{max} indicate that these vectors were obtained by stacking the corresponding vectors for all times k=1,\ldots,N.
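Continuing the hypothetical JuMP sketch from the previous section (and reusing \widehat{\mathbf C} and \widehat{\mathbf A} from the sequential-formulation code), the stacked state bounds can be appended to the model as linear inequality constraints; x_min and x_max are again made-up example values.

Code
x_min, x_max = [-5.0, -5.0], [5.0, 5.0]  # Made-up per-state bounds (n = 2 here).
x̄_min = repeat(x_min, N)                 # Stacked bounds for k = 1, ..., N.
x̄_max = repeat(x_max, N)
@constraint(model, x̄_min .<= Ĉ*ū .+ Â*x₀ .<= x̄_max)
optimize!(model)
ū_sc = value.(ū)                         # Controls satisfying input and state bounds.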
