Skip to content

8 Optimal Control with Differential-Algebraic Equations

So far we have regarded optimal control problems based on model dynamics in their simplest explicit-ODE form:

x˙(t)=f(x(t),α(t)).

This form of model for dynamic systems tend to arise naturally from the first-principle modelling approaches standardly taught and used by engineers. As a result, most continuous dynamic systems are described via explicit ODEs. It is a common but less widespread knowledge that for a large number of applications, building a dynamic model in the form of explicit ODEs can be significantly more involved and yield dramatically more complex model equations than via alternative model forms. Before laying down some theory, let us start with a simple illustrative example that we will use throughout this chapter.

Example 8.1

Consider a mass m attached to a fixed point via a rigid link of length L for which one wants to develop a dynamic model. A classic modelling approach is to describe the mass via two angles (azimuth and elevation of the mass), which yields an explicit ODE. The alternative model construction we will consider here describes the system via the cartesian coordinates pR3 of the mass in a fixed, inertial reference frame E positioned at the attachment point of the mass, see Figure 8.1.

example1

Figure 8.1: Illustration of the example considered in this chapter. The system is described via the cartesian position of the mass pR3 in the fixed frame E. The mass is subject to the gravity force mgE3 and to a force zp from the rod, which ensures that the mass remains at a distance L from its attachment point. Here the scalar z is a variable in the dynamics that scales this force adequately.

The rod maintains the mass at a distance L of its attachment point by applying a force on the mass along its axis, i.e. having the support vector p. We will then describe the force of the rod as Frod =zp, where zR is a variable that adjusts the force magnitude to maintain the mass on a sphere of radius L, i.e. such that the condition ppL2=0 holds at all time. The model of the system can then takes a very simple form:

(8.1)mp¨=αzp+mgE3,12(ppL2)=0.

where E3=[001]. One can readily observe here that the model equation (8.1) is not a simple explicit ODE. Indeed, while the scalar variable z is intrinsically part of the model, its time derivative does not appear in the model equation. Hence, variable z is of a different nature than variable p. A variable that is intrinsic to the model equation (i.e. excluding possibly time-varying parameters and inputs) but that is not time differentiated in the model equation is labelled an algebraic state. A differential equation holding such variables is called a Differential Algebraic Equation (DAE).

Following up on this example, we will now provide a more formal view on the concept of Differential-Algebraic Equations.

8.1 What are DAEs ?

Let us consider a differential equation in a very generic form:

(8.2)f(x˙(t),x(t),α(t))=0.

Such a differential equation is labelled implicit as the state derivative x˙(t) is not provided via an explicit function of the state x(t) and input α(t), but implicitly as the solution of (8.2).

The Implicit Function Theorem guarantees that x˙(t) can be seen as a locally unique and continuously differentiable function of x(t) and α(t) if the Jacobian of f with respect to x˙(t), i.e. fx˙, is full rank. Under this condition, one is guaranteed that for a given state x(t) and input α(t), the state time derivative x˙(t) can be computed, either explicitly or numerically e.g. via a Newton iteration. Then (8.2) is an Ordinary Differential Equation, since x˙(t) can be computed at every time instant, and the model can in principle be treated as a classic explicit ODE.

Formally, Differential-Algebraic Equations are equations in the form (8.2) for which the above rank condition fails. Let us formalise that in the following definition.

Definition DAE

f(x˙(t),x(t),α(t))=0 is a DAE if fx˙ is rank deficient.

It is admittedly not straightforward to relate Definition above to the earlier example (8.1). Before making this relationship clear, let us illustrate Definition above on a simple example.

Example 8.2

Let us consider the following implicit differential equation, having the form (8.2):

(8.3)f(x˙,x,α)=[x1x˙1+1x˙1x2+2α]=0,

then the Jacobian of f with respect to the state derivatives x˙ reads as:

fx˙=[10x20]

and is rank-deficient, entailing that (8.3) is, by Definition above, a DAE.

Alternatively, one can also simply observe that x˙2 does not appear time-differentiated in (8.3), such that one can assess by simple inspection that it is a DAE.

In order to gain some further intuition in this example, consider solving the first equation in (8.3) for x˙1, giving

x˙1=x1+1

Upon inserting the expression for x˙1 in the second equation, one can then write (8.3) as

x˙1=x1+1,0=(x1+1)x2+2.

We observe here that the second equation is in fact purely algebraic, such that the model can be written as a mixture of an explicit differential equation and of an algebraic equation. This form of DAE is actually the most commonly used in practice. It is referred to as a semi-explicit DAE.

The above example can mislead one to believe that DAEs are fairly simple objects. To dispel that impression, let us provide a simple example of a DAE that possess fairly exotic properties.

Example 8.3

Let us consider the following differential equation

x˙1+x1α=0,(x1x2)x˙2+x1x2=0,

having the Jacobian

fx˙=[100x1x2]

which is rank-deficient for x1=x2. Hence for the initial conditions:

x1(0)=x2(0)

our equation is a DAE and its solution obeys:

x˙1=αx10=x2x1

otherwise it is an ODE. The fact that some differential equation can switch between being DAEs and ODEs betrays the fact that DAEs are not necessarily simple to handle and analyse. However, in the context of numerical optimal control, simple DAEs are typically favoured.

About DAE

As observed before, DAEs often simply arise from the fact that some states in the state vector x do not appear time-differentiated in the model equations, yielding a column of zeros in the Jacobian fx˙, as e.g. in example (8.2). In such a case, it is very useful to make an explicit distinction in the implicit differential equation (8.2) between the differential variables, i.e. the variables whose time derivative appear in f, typically labelled x, and the algebraic variables, i.e. the variables whose time derivative do not appear in f, typically labelled z. One can then rewrite (8.2) as:

(8.4)f(x˙,z,x,α)=0.

A DAE in the form (8.4) is called a fully-implicit DAE. The application of definition of DAE to (8.4) must then be understood in the sense that

(8.5)det(fx˙fz˙)=det(fx˙0)=0

is always rank deficient. The differential equation (8.4) is therefore always a DAE.

As mentioned in example 8.2, a common form of DAE often used in practice is the so-called semi-explicit form. It consists in explicitly splitting the DAE between an explicit differential equation and an implicit algebraic one. It reads as:

x˙=f(x,z,α),0=g(x,z,α).

The semi-explicit form is the most commonly used form of DAEs in optimal control. We turn next to a very important notion in the world of Differential-Algebraic Equations, both in theory and in practice.

8.2 Differential Index of DAEs

Before introducing the notion of differential index for DAE, it will be useful to take a brief and early tour into the problem of solving DAEs. Consider the semi-explicit DAE:

(8.6a)x˙=f(x,z,α),(8.6b)0=g(x,z,α),

and suppose that one can construct (possibly via a numerical algorithm such as a Newton iteration) a function ξ(x,α) such that:

g(x,ξ(x,α),α)=0,x,α

That is, function ξ(x,α) delivers the algebraic state z for any differential state x and input α. One can then proceed with eliminating the algebraic state z in (8.6), such that the DAE reads as:

(8.7a)x˙=f(x,ξ(x,α),α),(8.7b)z=ξ(x,α).

One can observe that (8.7a) is then an explicit ODE, and can therefore be handled via any classical numerical integration method. Moreover, (8.7b) provides the algebraic states explicitly. When such an elimination of the algebraic states is possible, one can consider the DAE (8.6) as "easy" to solve.

It is then natural to ask when such an elimination is possible. The Implicit Function Theorem (IFT) provides here a straightforward answer, namely the function ξ(x,α) exists (locally) if the Jacobian

(8.8)zg(x,z,α)

is full rank along the trajectories x,α,z of the system. The full-rankness of the Jacobian (8.8) additionally guarantees that the Newton iteration:

(8.9)zzg(x,z,α)z1g(x,z,α)

converges locally to the solution z of (8.6b). In that sense, (8.9) can be seen as a numerical procedure for constructing the implicit function ξ(x,α).

These notions easily extend to fully-implicit DAEs in the distinct form (8.4). More specifically, suppose that there exists two functions ξx˙(x,α) and ξz(x,α) that satisfy the fully implicit DAE (8.4), i.e.

f(ξx˙(x,α),ξz(x,α),x,α)=0,x,α

Then one can rewrite (8.4) as:

(8.10a)x˙=ξx˙(x,α)(8.10b)z=ξz(x,α).

Similarly to (8.7), one can treat (8.10a) as a simple ODE, while (8.10b) delivers the algebraic states z explicitly. The existence of functions ξx˙(x,α) and ξz(x,α) can then again be guaranteed by invoking the IFT, namely if for (8.4) the Jacobian matrix

(8.11)[fx˙fz]

is full rank, then functions ξx˙(x,α) and ξz(x,α) exist locally. The attentive reader will want to observe the important distinction between (8.5) which always hold for (8.4), and (8.11) whose full-rankness guarantees the local existence of the implicit functions ξx˙(x,α) and ξz(x,α).

Let us consider two examples to illustrate these notions.

Example 8.4

Consider again the fully-implicit DAE of Example 8.2, i.e.:

f(x˙,z,x,α)=[xx˙+1x˙z+2]=0

We observe that

[fx˙fz]=[10zx˙]

is full rank whenever x˙0, such that the implicit functions ξx˙(x,α) and ξz(x,α) are guaranteed to exist when x˙0. In this simple case, they can actually be computed explicitly. Indeed, we observe that:

x˙=ξx˙(x,α)=x+1,z=ξz(x,α)=2x+1

solve f(x˙,z,x,α) whenever x˙=x+10.

This simple example needs to be pitted against a more problematic one.

Example 8.5

Consider the fully-implicit DAE:

f(x˙,z,x,α)=[x˙1zx˙2x1x2α]=0

We observe that:

[fx˙fz]=[101010000]

is always rank-deficient, such that the differential state x˙ and algebraic state z cannot be uniquely obtained (even numerically) from solving f(x˙,z,x,α)=0 alone.

differential index

The topic of this section is the notion of differential index of DAEs. As we will see next, the loose idea of "easy" DAEs presented above is directly related to it. Let us introduce now the notion of differential index for DAEs.

Definition: differential index of a DAE

The differential index of a DAE is the number of times it must be time-differentiated before an explicit ODE is obtained.

For the specific case of a semi-explicit DAE, the above definition also reads as follows.

Definition: differential index of the semi-explicit DAE

The differential index of the semi-explicit DAE (8.6) is the number of times its algebraic part (8.6b) must be time-differentiated before an explicit ODE is obtained.

In order to clarify these definitions, let us make a simple example.

Example 8.5

Let us calculate the differential index of the DAE proposed in Example 8.2, i.e.:

f(x˙,z,x)=[xx˙+1x˙z+2]=0

We then consider the time derivative of f, i.e.:

(8.12)f˙(x¨,x˙,x,z˙,z)=[x˙x¨x¨z+x˙z˙]=0.

For the sake of clarity, we label v=[xzx˙] and rewrite (8.12) in the equivalent form:

ζ(v˙,v)=[v˙1v3v3v˙3v˙3v2+v3v˙2]=0

The Jacobian

ζ(v˙,v)v˙=[1000010v3v2]

is then full rank, such that ζ is an ODE for v according to Definition of DAE. Since a single time-differentiation has converted the original DAE of this example into an ODE, we can conclude that the original DAE is of index 1.

Let us contrast this example with a DAE having a higher differential index.

Example 8.6

Let us calculate the differential index of our illustrative example (8.1). Using v=p˙, and defining the differential state vector

x=[pv]

one can easily verify that the DAE (8.1) can be written as a semi-explicit DAE:

(8.13a)p˙=v,(8.13b)v˙=m1αm1zp+gE3,(8.13c)0=12(ppL2)=g(x,z,α).

We observe that for a given z, (8.13a)(8.13b) are already ODEs in v and p. As per Definition of differential index of semi-explicit DAE, we need to differentiate the algebraic equation (8.13c) until (8.13) becomes an ODE. The two first time derivatives read as:

g˙(x,z,α)=pv=0, and g¨(x˙,x,z,α)=pv˙+vv=0

One can then use (8.13b) in g¨(x˙,x,z,α) to obtain

g¨(x˙,x,z,α)=p(αm1zp+gE3)+vv=0

As z now appears explicitly in g¨(x˙,x,z,α), an extra time-differentiation yields a differential equation from which z˙ can be computed if pp0. We observe that 3 time-differentiations of (8.13c) were necessary to turn (8.13) into an ODE. It follows that (8.13) is an index-3 DAE.

index 1 DAE

Now we ought to relate the notion of "easy" DAEs to the notion of differential index. More specifically, we shall see next that index-1 DAEs are "easy" DAEs in the sense detailed previously. This observation can be formally described in the following Lemma.

Lemma

For any fully-implicit index-1 DAE

f(x˙,z,x,α)=0

there exists implicit functions ξx˙(x,α) and ξz(x,α) that satisfy:

f(ξx˙(x,α),ξz(x,α),x,α)=0,x,α.

Proof: We observe that if f is of index 1 , then a single time-differentiation:

f˙=fx˙x¨+fzz˙+fxx˙+fαα˙=0

yields a pure ODE. For the sake of clarity, we label v=[x˙z] and write:

(8.14)f˙=[fx˙fz]v˙+fxx˙+fαα˙=0.

By assumption, (8.14) can be written as an explicit ODE, hence [fx˙fz] must be full rank, such that:

v˙=[fx˙fz]1(fxx˙+fαα˙)

holds on the DAE trajectories. The IFT then guarantees the existence of the implicit functions ξx˙(x,α) and ξz(x,α) in a neighborhood of the trajectories of the DAE.

A similar result can clearly be established for any index-1 semi-explicit DAEs on the existence of an implicit function ξz(x,α) that solves the algebraic equation, i.e. such that

g(x,ξz(x,α),α)=0,x,α

The crucial practical consequence of these observations is that index-1 DAEs can be in principle solved numerically (or sometimes even explicitly) without difficulties, as for any state and input x(t) and α(t), the state derivative x˙(t) and algebraic state z(t) can be computed, and the simulation of the dynamics performed. In practice, implicit integration methods are the most efficient approach to perform the simulations of index-1 DAEs, while DAEs of index higher than 1 require specially-tailored integrators.

A non-trivial but important point needs to be stressed here. DAEs of index higher than 1 , often labelled high-index DAEs, present a pitfall to uninformed users of numerical methods. Indeed, one deploying a classical implicit integration method on a high-index DAE may observe that the implicit integration method converges reliably and be mislead into believing that simulations of the DAE model can be reliably computed. In order to clarify this issue, let us consider the following example, based on a linear, high-index DAE.

Example 8.7

In this example, we are interested to observe the result of "naively" deploying a classical implicit integration scheme on a high-index DAE. We consider again the fully-implicit DAE of Example 8.5, i.e.

(8.15)f(x˙,z,x,α)=[x˙1zx˙2x1x2α]

The reader can easily verify that (8.15) is not an index-1 DAE. We observe that we can rewrite (8.15) in the linear form Ev˙=Av+Bα, where

v=[xz],E=[100010000],A=[001100010],B=[001].

We are interested now in naively deploying an implicit Euler scheme of step length h on this DAE, yielding the steps:

1hE(v+v(t))=Av++Bα(t+h)

where v+is an approximation of the state at time v(t+h), i.e. v+v(t+h). It can be verified that the true trajectories v(t+h) satisfy:

E[1h(v(t+h)v(t))+h2x¨(τ)]=Av(t+h)+Bα(t+h)

for some τ[t,t+h]. We can then consider the one-step integration error e+=v+v(t+h) given by:

e+=h22(EAh)1Ex¨(τ).

For DAE (8.15), matrix h22(EAh)1E reads as:

h22(EAh)1E=12[0h0000h10],

such that the integration error is of order O(1), i.e. much worse than the integration error expected from the implicit Euler method, which is of order O(h2).

This simple example reveals that, even though a classic implicit integration scheme deployed on high-index DAEs (8.15) can in some cases reliably deliver state trajectories, their lousy numerical accuracy typically makes them actually meaningless as simulations of the DAE model. The difficulty with DAE (8.15) stems from its index larger than 1 . These observations must be taken as a warning that while one can sometimes deploy a classical implicit integration scheme on a high-index DAE without observing notable numerical difficulties, the resulting trajectories are typically nonetheless senseless. Hence, good practice in numerical optimization dictates that the index of a DAE ought to be systematically checked before tackling it via classical integration methods.

Because index-1 DAEs are significantly easier to treat than high-index DAEs, it is common in numerical optimal control to avoid DAEs of index larger than 1. Unfortunately, the index of a DAE stems from the nature of the physical system it models and cannot be decided. However, a treatment of high-index DAEs generally allows one to ultimately treat them as index-1 DAEs. We will cover this question next.

8.3 Index reduction

As detailed in the previous Section, index-1 DAEs are simpler to treat numerically than high-index DAEs, as index-1 DAEs can be approached using standard implicit integration methods. This observation motivates the deployment of procedures for reducing the index of an arbitrary high-index DAE into an index-1 DAE, a procedure labelled index reduction. Index-reduction proceeds very similarly to the procedure leading to assess the index of a DAE, i.e. via time-differentiation of the DAE (or of some parts of the DAE) until a DAE of index 1 is obtained. In order to explain this further, let us detail it on our illustrative example (8.1).

Example 8.8

We consider again the semi-explicit DAE (8.13), i.e.

(8.16a)p˙=v,(8.16b)v˙=αm1zp+gE3,(8.16c)0=12(ppL2)=g(x,z,α),

which is in a semi-explicit form. Similarly to the index evaluation presented in Example 8.6, i.e. we consider the time-derivatives of the algebraic equation (8.16c)

g˙(x,z,α)=pv=0, and g¨(x˙,x,z,α)=pv˙+vv=0

One can then easily verify that the new DAE:

(8.17a)p˙=v,(8.17b)mv˙=αzp+mgE3,(8.17c)0=pv˙+vv=g¨(x˙,x,z,α),

is of index 1. Alternatively, it is useful to put (8.17) in a more implicit form:

(8.18a)p˙=v,(8.18b)[mpp0][v˙z]=[α+mgE3vv],

which shows unambiguously that the state derivatives v˙ and p˙ as well as the algebraic state z can be computed for any state v,p and input α as long as p0. This observation tells us without further investigation that (8.18) is an "easy" DAE, i.e. of index 1.

Index-reduction procedures can be fairly intricate to deploy on very complex models. For the sake of completeness, let us report here a recipe for performing the index-reduction on any semi-explicit DAE:

x˙=f(x,z,α)0=g(x,z,α)
  1. Check if the DAE system is index 1 (i.e. gz full rank). If yes, stop.
  2. Identify a subset of algebraic equations that can be solved for a subset of algebraic variables.
  3. Perform a time-differentiation on the remaining algebraic equations that contain (some of) the differential variables x. Terms x˙ will appear in these differentiated equations.
  4. Substitute the x˙ with their corresponding symbolic expressions f(x,z,α). This generates new algebraic equations.
  5. With this new DAE system, go to step 1.

Our discussion on index reduction would not complete if we omit the question of consistency conditions. To understand this issue, consider the indexreduced DAE developed in Example 8.8, which takes the form:

(8.19a)x˙=f(x,z,α)(8.19b)0=g¨(x˙,x,z,α)

One needs to observe here that while a solution to the original DAE (8.16) in the form

(8.20a)x˙=f(x,z,α)(8.20b)0=g(x,z,α)

is obviously also a solution for the index-reduced DAE (8.19), the converse is not necessarily true, i.e. a solution of the index-reduced DAE (8.19) is not necessarily a solution to the original DAE (8.20). To understand this statement, one simply ought to imagine a trajectory that is solution of (8.19a), and for which

(8.21)g(x(t),z(t),α(t))=g(x(0),z(0),α(0))+tg˙(x(0),z(0),α(0))=0

holds. This trajectory clearly satisfies (8.19b) but not (8.20b). Equation $(8.21) $additionally reveals that the issue is not related the DAEs themselves, but rather to the initial conditions chosen for the simulation of the DAEs. Indeed, simply selecting the initial conditions x(0) such that

(8.22)g(x(0),z(0),α(0))=0, and g˙(x(0),z(0),α(0))=0

ensures that the trajectories of the index-reduced DAE are solution of the original one. More generally, enforcing

(8.23)g(x(t0),z(t0),α(t0))=0, and g˙(x(t0),z(t0),α(t0))

at any time t0 on the trajectory guarantees a simulation run with the indexreduced DAE is a valid simulation of the original DAE. Conditions that guarantees the validity of the simulation performed on the index-reduced DAE, such as (8.23), are labelled consistency conditions.

In the context of optimal control based on an index-reduced DAE, consistency conditions are crucial when the trajectories of the differential states of system do not have (fully) prescribed initial or terminal values. In such a case, the consistency conditions must be adequately enforced within the optimal control problem. E.g. an optimal control problem involving our index-reduced DAE (8.18) having free initial or terminal states can e.g. be written as:

(8.24)minv,p,z,α0Tg(v,p,z,α)dt subject to p˙=v (Differential equ.), mv˙=αzp+mgE3 (Differential equ.), 0=pv˙+vv(Algebratic equ.), 0=p(t0)v(t0) (Consistency cond.), 0=p(t0)v˙(t0)+v(t0)v(t0) (Consistency cond.) 

for any t0[0,T].

It ought to be underlined here that imposing some constraints on the initial and/or terminal state trajectories in conjunction with imposing the consistency conditions must be done with great care in order to avoid generating a redundant set of constraints in the OCP. As a trivial example of this difficulty, imposing e.g. the initial states in (8.24) in addition to the consistency conditions with t0=0 would clearly over-constrain the initial state values p(0),v(0). This issue can become significantly more involved in less obvious scenarios, such as e.g. in periodic OCPs, where the initial and terminal states are free but must satisfy a periodicity constraint of the form x(0)=x(T). Handling the consistency conditions and the periodicity constraints together in the OCP without generating an over-constrained problem can then become fairly involved.

The consistency conditions can in principle be enforced at any time t0 in the time span considered by the OCP. However, in some cases the selection of the time t0 for imposing the consistency condition is not arbitrary. Indeed, one ought to observe that the combination of the index-reduced algebraic constraint and of the consistency conditions, i.e.

(8.25)g¨(x(t),z(t),α(t))=0(8.26)g˙(x(t0),z(t0),α(t0))=0(8.27)g(x(t0),z(t0),α(t0))=0

ensure mathematically that

g(x(t),z(t),α(t))=g(x(t0),z(t0),α(t0))+(tt0)g˙(x(t0),z(t0),α(t0))=0

holds at any time t. However, when the DAE dynamics are handled via numerical integration, numerical errors tend to accumulate over time such that g(x(t),z(t),α(t))=0 can be less accurately enforced at times that are distant from t0. From this observation one ought to conclude that if the solution to an OCP is e.g. more important at the beginning of the time span the OCP covers, say [0,T], then the consistency conditions ought to be enforced in the beginning of the time span, i.e. t0=0. This situation occurs in Nonlinear Model Predictive Control (NMPC), where the first control input q0 delivered by the OCP provides the control input to be deployed on the real system, such that the accuracy of the solution in the beginning of the time interval it covers is the more important than later in the horizon.

Conversely, if the OCP implements a Moving Horizon Estimation (MHE) scheme, then the differential state obtained at the very end of the time span covered by the OCP delivers a state estimation to e.g. an NMPC scheme. In such a case, the accuracy is most important at the very end of the time interval, such that the consistency conditions are best imposed at t0=T.

8.4 Direct Methods with Differential-Algebraic Equations

We will now turn to discussing the deployment of direct optimal control methods on OCPs involving DAEs. For the reasons detailed previously, we will focus on OCPs involving index-1 DAEs, possibly arising from an index-reduction of a high-index DAE.

8.4.1 Numerical Solution of Differential-Algebraic Equations

In this Section, we will briefly discuss the numerical solution of DAEs. As hinted above, index-1 DAEs are significantly simpler to treat numerically than high-index ones, and are therefore often preferred in optimal control. In this Section, we will focus in the index-1 case.

Though low-order methods generally offer a poor ratio between accuracy and computational complexity and higher-order integrators should be preferred, let use nonetheless start here with a simple m-step implicit Euler scheme for the sake of illustration. For e.g. a semi-explicit DAE

x˙=f(x,z,α),0=g(x,z,α),

the m-step implicit Euler scheme computes a numerical simulation x(tk+1,βk,qk) of the model dynamics over a time interval [tk,tk+1] from the initial state βk and the constant input qk via the following algorithm.

Algorithm 8.13 (Implicit Euler integrator). Input: initial value βk, input qk and times tk,tk+1 Set v=qk, and h=(tk+1tk)/mfor i=0 to m1 do Solve

x+=v+hf(x+,z+,qk)0=g(x+,z+,qk)

for x+,z+via a Newton iteration, set vx+. end forOutput: x(tk+1,βk,qk)=v

A similar approach can be deployed using any implicit integration method, such as an IRK4 integrator.

A fairly efficient and useful type of implicit integrator already introduced in Section 7.3 is the orthogonal collocation approach. Let us consider the building of the collocation equations for DAEs in a generic implicit form

(8.29)f(x˙,x,z,α)=0

on a time interval [tk,tk+1], with initial value βk and a constant input qk. The differential states are, as in the ODE case described via polynomials p(t,vk), with t[tk,tk+1] linearly parametrized in vk such that:

  • the polynomial interpolation meets the initial value, i.e.:
(8.30)p(tk,vk)=vk,0=βk
  • the DAE is satisfied in the collocation times tk,i for i=1,,d, i.e.:
(8.31)f(p˙k(tk,i,vk),pk(tk,i,vk)=vk,i,zk,i,qk)=0,i=1,,d

We can gather these requirements in the compact implicit equation:

(8.32)ck(vk,zk,qk,βk)=[vk,0βkf(p˙k(tk,1,vk),vk,1,zk,1,qk)f(p˙k(tk,d,vk),vk,d,zk,d,qk)]=0.

The same observations as for the semi-explicit case hold for the general case.

One ought to observe that the discretized algebraic states zk,i appear only for the indices i=1,,d in the collocation equations, while the discretized differential states vk,i appear for the indices i=0,,d. I.e. the discrete algebraic states zk have one DoF less that the discrete differential states vk. The extra DoF granted to the differential state is actually required in order to be able to meet the initial value βk of the differential state trajectories, while the initial value of algebraic state trajectories cannot be assigned as they are already defined implicitly by the DAE, subject to the imposed state initial value βk and input qk. This observation is most obvious in the semi-explicit case, where for a given state initial value βk and input qk, the initial value for the algebraic state is implicitly given by g(βk,z(tk),qk)=0.

For the sake of completeness, let us provide the algorithm for a collocation-based integrator for index-1 DAEs.

Algorithm 8.14 (Collocation-based integrator). Input: initial value βk input qk, initial guess vk,zk and times tk,tk+1 Solve

(8.33)ck(vk,zk,qk,βk)=0

for vk,zk via a Newton iteration Output: x(tk+1,βk,qk)=pk(tk+1,vk)

It is interesting to observe here that while the algorithm ought to receive an initial guess for the discrete algebraic states zk, it receives an initial value βk only for the differential state. It is also important to notice that the algebraic states zk can in principle be entirely hidden inside the integrator (even though they can be, of course, reported).

Sensitivities of the integrators

The computation of the sensitivities of an implicit integrator such as (8.14) can be done as detailed in Section 10.4(in referenced e-books). More specifically, if we label wk=[vkzk], the collocation equation (8.36) in algorithm 8.14 is typically solved using a (often full-step) Newton iteration:

(8.34)wk=wkck(wk,qk,βk)1wkck(wk,qk,βk)

The sensitivities are then provided at the solution ck(wk,qk,βk)=0 by:

(8.35a)wkqk=ck(wk,qk,βk)1wkck(wk,qk,βk)qk(8.35b)wkβk=ck(wk,qk,βk)1wkck(wk,qk,βk)βk

It is important to note here that a factorization of the Jacobian matrix ck(wk,qk,βk)wk is already computed for the Newton iterations (8.34) and the last factorization can be readily reused at the end of the iteration to form the sensitivities (8.35). The computational complexity of obtaining the sensitivities consists then only of the computation of the matrices ck(wk,qk,βk)βk and ck(wk,qk,βk)qk and the matrix products in (8.35). A collocation-based integrator with sensitivities then reads as:

Algorithm 8.15 (Collocation-based integrator with Sensitivities). Input: initial value βk input qk, initial guess vk,zk and times tk,tk+1 Solve

(8.36)ck(vk,zk,qk,βk)=0

for vk,zk via a Newton iteration Compute (8.35) Form:

(8.37)x(tk+1,βk,qk)βk=pk(tk+1,vk)vkvkwkwkβk(8.38)x(tk+1,βk,qk)qk=pk(tk+1,vk)vkvkwkwkqk

Output: x(tk+1,βk,qk)=pk(tk+1,vk), and x(tk+1,βk,qk)βk,x(tk+1,βk,qk)qk where vkwk=[I0] is constant.

We can now turn to the deployment of Multiple-Shooting on DAE-based optimal control problems.

8.4.2 Direct Multiple-Shooting with Differential-Algebraic Equations

diagram

Figure 8.4.1: The diagram

In the context of Multiple-Shooting for DAE-based optimal control problems, the implicit numerical integration schemes detailed above are interacting with the NLP solver tackling the NLP resulting from the Multiple-Shooting discretization. We illustrate this interaction in Figure above. The NLP solver is then responsible for closing the shooting gaps, i.e. enforcing the continuity conditions:

x(tk+1,βk,qk)βk+1=0

for k=0,,N1, and solving the set of algebraic equations that capture the conditions of optimality. It is interesting to observe here that the overall process can be then construed as a "two-level Newton scheme", where the upper level solve the KKT conditions (e.g. using the relaxed KKT obtained in the primal-dual interior-point approach, or using SQP iterations) and the lower level solves the equations underlying the numerical integration (e.g. (8.36)). The NLP solver passes the discrete states and inputs βk,qk, which become inputs to the numerical integration algorithms, while the numerical integration algorithms report to the NLP solver the end states of the simulations x(tk+1,βk,qk) and their sensitivities.

One ought to observe here that the algebraic state dynamics can in principle be totally "hidden" inside the integrator scheme, and not reported at all to the NLP solver. In that sense, implicit integrators in general perform an elimination of the algebraic variables present in the dynamics, and hide their existence to the NLP solver.

Another crucial observation to make here is that no continuity condition nor initial condition is enforced on the algebraic state trajectories z(t). Indeed, for a given differential state trajectory x(t) and input profile α(t), the algebraic state trajectories z(t) are entirely defined via the DAE (e.g. by g(x,z,α)=0 in the semi-explicit case), such that an extra continuity condition imposed on z(t) would yield an over-constrained problem. As a matter of fact, if a discontinuous input parametrization is used (such as e.g. piecewise-constant), the algebraic state trajectories z(t) can be discontinuous at the time instants tk corresponding to the multiple-shooting time grid. E.g. in the semi-explicit case and if the algebraic equation g(x,z,α)=0 depends on α, at the time instants tk, a discontinuous input typically requires z(t) to also be discontinuous.

As mentioned previously, the integrator can hide the algebraic variables from the NLP solvers and keep them as purely internal. However, one may want to use these variables in the cost function of the OCP, or impose some inequality constraints on them. In such a case, the algebraic states ought to be reported to the NLP solver, where they are regarded as functions of the decision variables βk,qk.

As illustrated in Figure 8.4.1, Multiple-Shooting with implicit integrators can be viewed as a two-level Newton scheme, where algebraic conditions are solved at two different levels. A natural alternative to this setup is then clearly to introduce the algebraic conditions underlying the numerical integrators into the NLP, and leave them to be solved by the NLP solver. Doing so leads us back to the Direct Collocation scheme, which we revisit next in the context of DAE-based optimal control problems.

8.4.3 Direct Collocation with Differential-Algebraic Equations

We will focus now on the deployment of the Direct Collocation method on such DAE-based optimal control problems. The principle here is extremely similar to those detailed in Section 7.3. However, there are a few important additional specific details arising from the presence of algebraic states and equations that need to be properly covered here. Let us briefly recall here the core principles of the direct collocation method. As detailed earlier in Section 7.3 and briefly recalled in Section 8.4.1 above, the differential state trajectories are approximated on each time intervals [tk,tk+1] via the polynomials pk(t,vk) linearly parametrized by the set of variables vkRn(d+1). For an explicit ODE x˙=f(x,α), the collocation equations then enforce:

  • the continuity conditions of the differential states at the times tk for k=0,,N1
(8.39)pk(tk+1,vk)βk+1,0=0,
  • the state dynamics at the times tk,i for k=0,,N1 and i=1,,d
p˙k(tk,i,vk)=f(vk,i,qk)

Additional conditions are typically added as boundary conditions, e.g. v0,0x0=0 to enforce the initial condition of the state trajectories.

The extension of the collocation equations for a semi-explicit DAE

x˙=f(x,z,α)0=g(x,z,α)

follow the exact same philosophy, namely the collocation equations enforce:

  • the continuity conditions of the differential states via (8.39) at the times t0,,N1.
  • the state dynamics at the times tk,i for k=0,,N1 and i=1,,d via
(8.40a)p˙k(tk,i,vk)=f(vk,i,zk,i,qk)(8.40b)0=g(vk,i,zk,i,qk)

The collocation equations for a semi-implicit DAE therefore read as:

(8.41)ck(vk,zk,qk,vk+1)=[p˙k(tk,1,vk)f(vk,1,qk)g(vk,1,zk,1,qk)p˙k(tk,d,vk)f(vk,d,qk)g(vk,d,zk,d,qk)pk(tk+1,vk)vk+1,0]=0.

for k=0,,N1.

A few details ought to be properly stressed here.

  1. similarly to the observations made in Section 8.4.2, no continuity condition is enforced on the algebraic states, hence (8.39) applies to the differential state trajectories alone.
  2. one ought to observe that the discretized algebraic states zk,i appear only for the indices i=1,,d in the collocation equations, i.e. the discrete algebraic states have one DoF less that the discrete differential states vk,i which appear with the indices i=0,,d in the collocation equations. The extra DoF granted to the differential state is actually required in order to be able to impose the continuity of the differential state trajectories, while the algebraic state trajectories are not required to be continuous. When building the NLP arising from a discretization of a DAE-based OCP using direct collocation, one ought to make sure that the adequate number of discrete algebraic states and discrete differential states are declared to the NLP solver. Indeed, e.g. introducing by mistake the unnecessary extra variables zk,0 can create numerical difficulties in the solver, as these variables would be "free" in the NLP and their values not clearly fixed by the problem.

Building the collocation equations for DAEs in a generic implicit form

f(x˙,x,z,u)=0

is a natural generalization of the constraints used in the case of a semi-explicit DAE. In the general case, the collocation equations simply read as:

ck(vk,zk,qk,vk+1)=[f(p˙k(tk,1,vk),vk,1,zk,1,qk)f(p˙k(tk,d,vk),vk,d,zk,1,qk)pk(tk+1,vk)vk+1,0]=0

for k=0,,N1. The same observations as for the semi-explicit case hold for the general case.

Released under the MIT License.