The task of bringing the system from a given state to some given final state (either a single state or a set of states) can be formulated by setting
L = 1,
which turns the cost functional into
J = \int_{t_\mathrm{i}}^{t_\mathrm{f}}1\text{d}t = t_\mathrm{f}-t_\mathrm{i}.
Time-optimal control for a linear system
We restrict ourselves to an LTI system
\dot{\bm x} = \mathbf A\bm x + \mathbf B\bm u,\qquad t_\mathrm{i} = 0, \; \bm x(t_\mathrm{i}) = \mathbf x_0,
for which we set the desired final state as
\bm x(t_\mathrm{f}) = 0.
This only makes sense if we impose some bounds on the control. We assume the control to be bounded by
|u_i(t)| \leq 1\quad \forall i, \; \forall t.
The necessary conditions can be assembled immediately by forming the Hamiltonian
H = 1 + \boldsymbol\lambda^\top \,(\mathbf A\bm x+\mathbf B\bm u),
and substituting into the (control) Hamilton canonical equations
\begin{aligned}
\dot{\bm x} &= \nabla_{\boldsymbol\lambda}H = \mathbf A\bm x + \mathbf B\bm u,\\
\dot{\boldsymbol \lambda} &= -\nabla_{\bm x}H = -\mathbf A^\top \boldsymbol \lambda.
\end{aligned}
plus Pontryagin's statement about the minimization of H with respect to \bm u
H(t, \bm x^\star ,\bm u^\star ,\boldsymbol\lambda^\star ) \leq H(t, \bm x^\star ,\bm u, \boldsymbol\lambda^\star ), \; u_i(t)\in [-1,1]\; \forall i, \; \forall t.
Cancelling the identical terms on both sides we are left with
(\boldsymbol\lambda^\star )^\top \, \mathbf B\bm u^\star \leq (\boldsymbol\lambda^\star )^\top \, \mathbf B\bm u,\quad u_i\in [-1,1]\; \forall i, \; \forall t.
If this inequality is to hold for an arbitrary \bm u on the right-hand side (within the bounds), the only way to guarantee its validity is to set
\bm u^\star = - \text{\textbf{sgn}}\left( (\boldsymbol\lambda^\star )^\top \, \mathbf B\right),
where the signum function is applied elementwise. Clearly, the optimal control switches between the minimum and maximum values, which are -1 and 1. This is visualized in Fig. 1 for a scalar case (the \mathbf B matrix has only a single column).
Figure 1: Switching function and an optimal control derived from it.
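To make the elementwise rule concrete, here is a minimal Julia sketch; the costate vector and the \mathbf B matrix below are made-up illustrative values, not taken from any particular system.

```julia
# Elementwise bang-bang law u⋆ = -sgn(Bᵀλ⋆); the values below are illustrative only.
bang_bang(λ, B) = -sign.(B' * λ)

λ = [0.5, -2.0]           # An example costate vector.
B = [1.0 0.0; 0.0 1.0]    # An example input matrix (two inputs).

u = bang_bang(λ, B)       # Each component of u is +1 or -1 (unless Bᵀλ has a zero entry).
```

Note that `sign(0) == 0`, which is exactly the degenerate case of a vanishing switching function discussed next.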
As a matter of fact, to prove this claim rigorously, we should exclude the possibility that the argument of the signum function – the switching function – stays at zero for longer than an isolated time instant (although possibly repeatedly). Here we skip the details and refer the reader to [1] (see the normality conditions keyword).
Time-optimal control for a double integrator system
Let us analyze the situation for a double integrator. This corresponds to a system described by Newton's second law. For a normalized mass, the state space model is
\begin{bmatrix}
\dot y\\ \dot v
\end{bmatrix}
=
\begin{bmatrix}
0 & 1\\ 0 & 0
\end{bmatrix}
\begin{bmatrix}
y\\ v
\end{bmatrix}
+
\begin{bmatrix}
0\\1
\end{bmatrix}
u.
Since \mathbf B has a single column, namely [0, 1]^\top, the product (\boldsymbol\lambda^\star)^\top\mathbf B reduces to \lambda_2, hence the switching function is \lambda_2(t) and an optimal control is given by
u(t) = -\text{sgn} \lambda_2(t).
We do not know \lambda_2(t). In order to get it, we need to solve the costate equations. Fortunately, we can solve them independently of the state equations, since the two sets are decoupled. The costate equations are
\begin{bmatrix}
\dot \lambda_1\\ \dot \lambda_2
\end{bmatrix}
=
-
\begin{bmatrix}
0 & 0\\ 1 & 0
\end{bmatrix}
\begin{bmatrix}
\lambda_1\\ \lambda_2
\end{bmatrix}
,
from which it follows that
\lambda_1(t) = c_1
and
\lambda_2(t) = -c_1t+c_2
for some constants c_1 and c_2. To determine the constants, we finally have to bring the boundary conditions into the game. Since \bm x(t_\mathrm{f}) = 0 implies v(t_\mathrm{f}) = 0, the term \lambda_1 v vanishes at the final time and the condition H(t_\mathrm{f}) = 0 gives
\lambda_2(t_\mathrm{f})u(t_\mathrm{f}) = -1.
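The closed-form costate solution above can be sanity-checked by substituting it back into \dot{\boldsymbol\lambda} = -\mathbf A^\top\boldsymbol\lambda; the constants c_1 and c_2 below are arbitrary picks for illustration.

```julia
# Verify that λ₁(t) = c₁, λ₂(t) = -c₁t + c₂ solves the costate equation λ̇ = -Aᵀλ.
c₁, c₂ = 2.0, 3.0          # Arbitrary illustrative constants.
λ(t) = [c₁, -c₁*t + c₂]
dλ = [0.0, -c₁]            # Time derivative of λ(t), constant in t.
A = [0 1; 0 0]             # Double integrator system matrix.

for t in (0.0, 0.5, 1.0)
    @assert dλ ≈ -A' * λ(t)   # Holds at every time instant.
end
```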
We can now sketch possible profiles of the switching function. A few characteristic cases are shown in Fig. 2.
Figure 2: Possible evolutions of the costate \lambda_2 in time-optimal control
What we have learnt is that the costate \lambda_2 goes through zero at most once during the whole control interval. Therefore, there will be at most one switching of the control signal. This is a valuable observation.
We are approaching the final stage of the derivation. So far we have learnt that we only need to consider u(t)=1 and u(t)=-1. The state equations can be easily integrated to get
v(t) = v(0) + ut,\quad y(t) = y(0) + v(0)t + \frac{1}{2}ut^2.
To visualize this in the y-v plane, express t from the first equation and substitute it into the second:
u(y-y(0)) = v(0) (v-v(0))+ \frac{1}{2}(v-v(0))^2,
which is a family of parabolas parameterized by (y(0),v(0)). These are visualized in Fig. 3.
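The time-eliminated relation can also be verified numerically; the initial conditions and control value in this short sketch are arbitrary picks.

```julia
# Closed-form state evolution under a constant u (the values below are made-up).
y0, v0, u = 1.0, -0.5, 1.0
state(t) = (y0 + v0*t + u*t^2/2, v0 + u*t)   # (y(t), v(t))

# The relation u(y − y0) = v0(v − v0) + (v − v0)²/2 must hold along the whole trajectory:
for t in 0:0.5:3
    y, v = state(t)
    @assert u*(y - y0) ≈ v0*(v - v0) + (v - v0)^2/2
end
```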
```julia
using Plots

# Plotting setup.
plot(legend=false, xlabel="y", ylabel="v")

# Parabolas for u = 1.
u = 1
v0 = -3
v = -3:0.1:3
for y0 in -10:0.5:10
    y = y0 .+ v0 .* (v .- v0) .+ 1/2 .* (v .- v0).^2
    plot!(y, v, linewidth=0.5, color=:red)
end
v = -3:0.01:0
y = 1/2 .* v.^2
plot!(y, v, color=:red, linewidth=2)
annotate!(2.5, -1, text("u=1", 12, :red, halign=:right))

# Parabolas for u = -1.
u = -1
v0 = 3
v = -3:0.1:3
for y0 in -10:0.5:10
    y = -y0 .- v0 .* (v .- v0) .- 1/2 .* (v .- v0).^2
    plot!(y, v, linewidth=0.5, color=:blue)
end
v = 0:0.01:3
y = -1/2 .* v.^2
plot!(y, v, color=:blue, linewidth=2)
annotate!(-1.0, 1, text("u=-1", 12, :blue, halign=:right))
```
Figure 3: Two families of (samples of) state trajectories corresponding to the minimum and maximum control for the double integrator system. Highlighted (thick) is the switching curve (composed of two branches).
There is a distinguished curve in the figure, composed of two branches. It is special in that from every state on this curve the system is brought to the origin by the corresponding constant control (with no further switching). This curve, called the switching curve, can be expressed as
y = \left\{
\begin{array}{cl}
\frac{1}{2}v^2 & \text{if} \; v<0\\
-\frac{1}{2}v^2 & \text{if} \; v>0
\end{array}
\right.
or
y = - \frac{1}{2}v|v|.
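The equivalence of the piecewise expression and the compact formula is easy to check numerically; the sample velocities below are arbitrary.

```julia
# The compact formula y = −v|v|/2 reproduces both branches of the switching curve.
switching_curve(v) = -v*abs(v)/2

@assert switching_curve(-2.0) ≈ (1/2)*(-2.0)^2   # v < 0 branch: y = v²/2.
@assert switching_curve(2.0) ≈ -(1/2)*(2.0)^2    # v > 0 branch: y = −v²/2.
```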
The final step can be done by referring to the figure. Point a finger anywhere in the state plane and follow the state trajectory emanating from that point which reaches the origin with at most one switching. Clearly, the strategy is to set u such that it brings us to the switching curve (the thick one in the figure), switch the control, and then follow the trajectory to the origin. That is it. This control strategy can be written as \boxed{
u(t) =
\left\{
\begin{array}{cl}
-1 & \text{if } y(t)>-\frac{1}{2}v(t)|v(t)|\text{ or if } y(t)= -\frac{1}{2}v(t)|v(t)| \text{ and }v>0,\\
1 & \text{if } y(t) < -\frac{1}{2}v(t)|v(t)|\text{ or if } y(t)= -\frac{1}{2}v(t)|v(t)| \text{ and }v<0.
\end{array}
\right.}
Equivalently, we can write the condition in terms of the switching function s(y,v) defined by \boxed{
s(y,v) = y + \frac{1}{2}v|v|}
by checking if the function is positive or negative (or zero within some numerical tolerance, as we discuss later).
The code below implements this optimal control law.
```julia
using DifferentialEquations
using Plots

function time_optimal_bang_bang_control(x, umin, umax)
    y = x[1]                 # Position.
    v = x[2]                 # Velocity.
    s = y + 1/2*v*abs(v)     # Switching function; the switching curve is s(y,v) = 0.
    if s == 0                # If on the switching curve:
        return v >= 0 ? umin : umax
    elseif s > 0             # If above the curve:
        return umin
    else                     # If below the curve:
        return umax
    end
end

function simulate_time_optimal_double_integrator()
    A = [0 1; 0 0]
    B = [0, 1]
    umin, umax = (-1.0, 1.0)
    tspan = (0.0, 3.6)
    x₀ = [1.0, 1.0]
    f(x, p, t) = A*x + B*time_optimal_bang_bang_control(x, umin, umax)
    prob = ODEProblem(f, x₀, tspan)
    xopt = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8, dtmax=0.01)
    uopt = time_optimal_bang_bang_control.(xopt.u, umin, umax)  # Remember that the package uses `u` for the state variable.
    return xopt, uopt
end

xopt, uopt = simulate_time_optimal_double_integrator()
```
We plot the state trajectory in the state space in Fig. 4. Things look fine. We can see a compliance with the fact we have derived rigorously – there is (at most) one switching.
Figure 4: State trajectory of the double integrator system under time-optimal (bang-bang) feedback control in the state space. The trajectory starts at (1,1) and ends at the origin.
Really? We now plot the state and control trajectories as they evolve in time in Fig. 5.
Figure 5: Response of a double integrator with a time-optimal (bang-bang) feedback control to a nonzero initial state. Chattering phenomenon is visible in the control signal as the system approaches the origin.
Oops… this is actually not quite what the theory predicted. The control variable switches for the first time at about 2.2 s, but then it switches again close to 3.5 s, and after that it keeps switching very fast.
Consequently, the simulation slows down significantly and the solver may even appear to get stuck.
This behaviour is quite characteristic of naive implementations of bang-bang control – a phenomenon called chattering.
Any suggestions for handling it, however heuristically? First, note that the condition s == 0 is essentially useless for a floating-point number s – it will almost never be satisfied exactly. We can instead introduce a small insensitivity band around the switching curve. We can also implement a stopping criterion: once we are close enough to the origin, the control switches to zero (in practice it would perhaps be more appropriate to switch to some local controller such as an LQR). This is implemented in the code below.
```julia
function time_optimal_bang_bang_control(x, umin, umax; ϵₛ = 1e-3, ϵₓ = 5e-3)
    y = x[1]                 # Position.
    v = x[2]                 # Velocity.
    s = y + 1/2*v*abs(v)     # Switching function; the switching curve is s(y,v) = 0.
    if y^2 + v^2 <= ϵₓ^2     # If close enough to the origin, stop controlling:
        return 0.0
    elseif abs(s) <= ϵₛ      # If in the band around the switching curve:
        return v >= 0 ? umin : umax
    elseif s > ϵₛ            # If above the curve, considering also the band:
        return umin
    else                     # If below the curve, considering also the band:
        return umax
    end
end
```
The resulting state and control trajectories are shown in Fig. 6.
```julia
function simulate_time_optimal_double_integrator()
    A = [0 1; 0 0]
    B = [0, 1]
    umin, umax = (-1.0, 1.0)
    tspan = (0.0, 3.6)
    x₀ = [1.0, 1.0]
    ϵₛ = 1e-3
    ϵₓ = 5e-3
    f(x, p, t) = A*x + B*time_optimal_bang_bang_control(x, umin, umax; ϵₛ=ϵₛ, ϵₓ=ϵₓ)
    prob = ODEProblem(f, x₀, tspan)
    xopt = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8, dtmax=0.01)
    uopt = time_optimal_bang_bang_control.(xopt.u, umin, umax; ϵₛ=ϵₛ, ϵₓ=ϵₓ)  # Remember that the package uses `u` for the state variable.
    return xopt, uopt
end

xopt, uopt = simulate_time_optimal_double_integrator()

plot(xopt[1,:], xopt[2,:], linewidth=2, xaxis="Position", yaxis="Velocity", label="State trajectory")
p1 = plot(xopt, linewidth=2, xaxis="", yaxis="States", label=["y" "v"])
p2 = plot(xopt.t, uopt, linewidth=2, xaxis="Time", yaxis="Control", label="u")
plot(p1, p2, layout=(2,1))
```
Figure 6: State trajectory of the double integrator system under time-optimal (bang-bang) feedback control in the state space. The trajectory starts at (1,1) and ends at the origin. The chattering phenomenon is mitigated by introducing an insensitivity region around the switching curve and stopping the integration once the state is close enough to the origin.
Warning
Note that the video linked below, which covers the material of this lecture, uses the other convention for the control Hamiltonian, the one leading to maximization of the Hamiltonian. Other than that, the structure of the results is the same.