Optimal Controls 2

Contents:

Problem statement
Fixed time and Fixed end point
- Solution using Calculus of Variations
- Solution using Dynamic programming
Fixed time and Free end point
- Solution using Calculus of Variations
- Solution using Dynamic programming
Free time and Fixed end point
- Solution using Calculus of Variations
- Solution using Dynamic programming
End point lying on a manifold
Moving end point
End point moving on a manifold

This post discusses different solutions obtained (open loop v/s closed loop) by applying different theories (calculus of variations, dynamic programming) for solving optimal control problems of a simple linear dynamical system.

Problem statement

Consider the minimum energy problem of a linear system (later we will see this example for a double integrator and harmonic oscillator).

\[J(u) = \frac{1}{2}\int_{t_0}^{t_f} u^T(t) u(t) dt \tag{1}\]

subject to:

\[\begin{align} \dot {\bf x}(t) &= A{\bf x}(t) + Bu(t) \\ {\bf x}(t_0) &= x_0 \\ {\bf x}(t_f) &= {\bf x_f} \tag{2} \end{align}\]

Now, let’s look at deriving the solution to the above problem for various initial and final conditions using two different approaches for each type.

Let’s start with the simplest case.

1. Fixed time and Fixed end point

In this problem, \(\delta t = 0\) and \(\delta {\bf x}_f = 0\)

Calculus of Variations solution

From the previous post, using Hamiltonian, we have:

\[H := p^T(t)f(t, x, u) - \mathcal{L} = p^T(t)(Ax+Bu)-\frac{1}{2}u^Tu\]

From \(H_x = -\dot p(t)\), we get;

\[\dot p(t) = -A^Tp(t)\]

and hence;

\[p(t) = e^{A^T(t_f - t)}p(t_f)\]

From \(H_u = 0\), we get;

\[u^*(t) = B^Tp(t)\]

and hence;

\[x^*(t) = Ax + BB^Tp(t)\]

Solving the system dynamics, gives;

\[\begin{align} x(t) &= e^{A(t-t_0)}x(t_0) + \int_{t_0}^{t_f} e^{A(t-\tau)}Bu(\tau) d\tau \\ &= e^{A(t-t_0)}x(t_0) + {\Big [}\int_{t_0}^{t_f} e^{A(t-\tau)}BB^Te^{A(t-\tau)} d\tau \Big{]} p(t_f) \end{align}\]

Using the boundary condition \(x(t_f) = x_f \), we can obtain the expression for p(t_f) as follows;

\[\begin{align} x(t_f) &= e^{A(t_f-t_0)}x(t_0) + {\Big [}\int_{t_0}^{t_f} e^{A(t_f-\tau)}BB^Te^{A(t_f-\tau)} d\tau \Big{]} p(t_f) \\ &= e^{A(t_f-t_0)}x(t_0) + W_{t_{f}}(A, B)p(t_f) \end{align}\]

where \(W_{t_{f}}\) is Controllability Gramian. Assuming that the system is controllable (and hence \(W_{t_{f}}\) is invertible),

\[\begin{align} p(t_f) &= W_{t_{f}}(A, B)^{-1}\left(x(t_f) - e^{A(t_f-t_0)}x(t_0)\right) \end{align}\]

Therefore, the optimal inputs is given by;

\[\begin{align} u^*(t) &= B^Tp(t) \\ &= B^Te^{A^T(t_f-t)} W_{t_{f}}(A, B)^{-1}x(t_f) \end{align}\]

As you can see, this is an open-loop solution which is not practical in deploying into real world.

Dynamic programming solution

From the previous post, we have;

\[-\frac{\partial V(x, t)}{\partial t} = \underset{u}{\text{Inf}}\Big\{\mathcal{L} + <\frac{\partial V(x, t)}{\partial x}, f(t, x)>\Big\}\]

Note that we cannot blindly apply the above HJB equation to the problem stated earlier. The reason will become quite clear shortly. Let’s consider a slightly modified problem as follows;

\[J(u) = \frac{1}{2}({\bf x_f} - {\bf x}(t_f))^TQ_f({\bf x_f} - {\bf x}(t_f)) + \frac{1}{2}\int_{0}^{T} u^T(t) u(t) dt\]

subject to:

\[\begin{align} \dot {\bf x}(t) &= A{\bf x}(t) + Bu(t) \\ {\bf x}(t_0) &= x_0 \\ {\bf x}(t_f) &= x_f \end{align}\]

where we have added in addition a terminal state cost.

2. Fixed time and Free end point

Intuitively thinking when do we encounter a situation of free end point type? Say, we start from some point \(x(t_0) = x_0\) and we want to reach \(x(t_f) = 0 \). This is a very hard constraint that which would not be accomplished exactly. But for the practical applications, if we are very close to origin, then the task would be complete. Instead of defining a terminal hard constraint as \(x(t_f) = 0\), we could add some high penalty for not being at origin in our cost. This is a soft constraint and is formulated as follows.

Let’s modify the objective function by adding terminal state cost so that HJB equations can be solved without shift of origin; \(J(u) = \frac{1}{2}{\bf x}^T_fQ_f{\bf x}_f + \frac{1}{2}\int_{0}^{T} u^T(t) u(t) dt\)

subject to:

\[\begin{align} \dot {\bf x}(t) &= A{\bf x}(t) + Bu(t) \\ {\bf x}(t_0) &= x_0 \\ \end{align}\]

where \(Q_f\) is a symmetric positive-definite matrix and we want to reach origin at \(t_f\). In this problem, \(\delta t = 0\) and \(\delta {\bf x}_f \neq 0\)

Calculus of Variations solution

Dynamic programming solution

3. Free time and Fixed end point

In this problem, \(\delta t \neq 0\) and \(\delta {\bf x}_f = 0\)

Calculus of Variations solution

Dynamic programming solution

4. End point lying on a manifold

In this problem, \(x(t_f) \in S\) where \(S(x_1, \dots x_n) = 0\) is a manifold.

5. Moving end point

In this problem, \(x(t_f) = \gamma (t_f)\) where \(\gamma(t)\) is a parameterized curve.

Optimal Controls 2

Problem statement

1. Fixed time and Fixed end point

Calculus of Variations solution

Dynamic programming solution

2. Fixed time and Free end point

Calculus of Variations solution

Dynamic programming solution

3. Free time and Fixed end point

Calculus of Variations solution

Dynamic programming solution

4. End point lying on a manifold

5. Moving end point

6. End point moving on a manifold