Optimality-based dynamic allocation with nonlinear first-order redundant actuators

A scheme is proposed to induce optimal input allocation among dynamically redundant nonlinear actuators with first-order dynamics satisfying suitable regularity and stability assumptions. The allocation scheme is parametrized by a cost function associated with the most desirable actuator configuration, and guarantees convergence to the desired set point as well as to the minimum of the cost function. The overall scheme is also shown to reduce, in some special cases, to a nonlinear version of a PI type of control action.


Introduction
Redundant actuators typically characterize situations where the number of actuators available for control purposes is larger than the number of plant outputs to be regulated. This redundancy must be tackled by allocating the redundant inputs according to a given optimality criterion, which may be accomplished by way of a static or a dynamic control allocation scheme [10]. The optimality criterion may be motivated by the specific application and it may be characterized in terms of minimization of desired cost functions, such as energy consumption, risk of failure or safety considerations, to mention just a few.
Control allocation techniques arise from the legacy of mostly application-oriented solutions (e.g., in the aerospace [12] and underwater [6] fields). They have been a topic of intense theoretical research activity in recent years, leading to several important schemes, well surveyed in [10] and [11] and, among others, in [5,19] and references therein. Most existing allocation techniques address the problem of linear actuators and correspond to static solutions minimizing some cost function at each time instant (see, e.g., [11]), often captured by intuitive goals such as mid-ranging (see [8] and references therein).
Perhaps the first paper using allocator dynamics is [9], where a gradient-based law is proposed in the presence of static actuators and nonlinear costs to be optimized. Later, dynamics have been used in allocators only in [20,18] and their applications [18,4,3]. Follow-up derivations related to the linear case can be found in [7,16] where the allocation problem is cast using regulation theory. Besides these works, as easily understandable from the comments in [10, §2.2.6], not much work has been done within the dynamic input allocation context, to date.
In this paper we tackle nonlinear actuators: unlike [9], where static nonlinear actuators are considered, we address actuator dynamics described by strictly proper nonlinear differential equations satisfying some mild regularity conditions, which appear to be quite reasonable for a set of actuators (see Assumption 1 and Remark 2 for a detailed discussion of such conditions). We propose a static state feedback allocation law as an intermediate step toward our main contribution, namely a dynamic output feedback scheme that solves a setpoint regulation problem while minimizing a desired optimality criterion. The considered class of actuators encompasses, for instance, dynamically redundant actuators whose dynamics have been identified using Wiener-type models, as explained in greater detail in the numerical example of Section 5.
A preliminary version of this paper appeared in [14]. As compared to that preliminary work, we provide here a more accurate problem definition, which allows us to streamline the statement of the main theorems and to avoid additional conditions required in [14], and we give the proofs of our main results, which were missing in [14]. In addition, numerical simulations are carried out here in the presence of unmodeled dynamics, demonstrating the intrinsic robustness of the approach. The paper is organized as follows. The considered allocation problem is introduced in Section 2. Section 3 provides the solution, in terms of a static state feedback, to the problem in the case of full information, whereas the extension to a dynamic output feedback, i.e. when the actuator state is not available for feedback, is presented in Section 4. An application-motivated example is used to illustrate the performance of the proposed control law in Section 5. Finally, Section 6 contains all the technical derivations and the proofs of our main theorems.

Figure 1: The static state feedback scheme of Section 3.

Problem statement
As discussed in detail in, e.g., [5,10], the control architecture for over-actuated systems typically comprises three layers: a high-level motion control algorithm, a control allocation algorithm and a low-level control algorithm. These control layers, together with the plant, are interconnected according to a nested structure, possibly with the state of the plant being fed back to the high-level control algorithm. In this paper we focus on the control allocation task. As customary in input allocation problems, virtual controls τ ∈ R^{n_τ}, see [10], should be suitably assigned by an allocator governing n_a actuators, with n_a > n_τ (which comprises the redundancy), to reproduce commanded virtual controls τ_c ∈ R^{n_τ} requested by the high-level controller. We then consider a pool of n_a actuators obeying the first-order (possibly coupled) dynamics

ẋ_a = f(x_a) + g(x_a) u,    τ = h(x_a),    (1)

where x_a ∈ R^{n_a} is the actuator state, u ∈ R^{n_a} is the control input and τ ∈ R^{n_τ} is the resulting virtual control.

Remark 1. While more general settings with actuators having higher-order state realizations are feasible, in this work we focus on the setting (1), which is already challenging due to the nonlinear coupling established by h(·). In practical cases, dynamics (1) often arise when data-driven identified actuator models with a prescribed nonlinear structure are used in the allocation design, possibly with unmodeled dynamics. This is for example the case for the experiment described in Section 5.
Within our scheme, some robustness to unmodeled dynamics (such as, e.g., the fast electrical time constant of the actuators in Section 5) follows from intrinsic robustness of asymptotic stability under mild regularity conditions on the data of the control system (see, e.g., [17]). We make the following assumption on dynamics (1).
Assumption 1. The following holds: 1) the functions f(·) and g(·) are locally Lipschitz and h(·) is continuously differentiable; 2) the function g(·) is uniformly bounded from below, namely there exists a positive scalar g_m such that g_m ≤ min_i σ_i(g(x_a)) for all x_a ∈ R^{n_a}, where σ_i(g) denotes the i-th singular value of g; 3) the gradient ∇h(x_a) ∈ R^{n_a×n_τ} of h is full column rank for all x_a ∈ R^{n_a}, namely the matrix (∇h(x_a))^T ∇h(x_a) is nonsingular everywhere.
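Items 2 and 3 of Assumption 1 can be spot-checked numerically on sampled states before carrying out an allocation design. The sketch below uses a hypothetical two-actuator model (the functions `g` and `grad_h` are illustrative placeholders, not the models used later in the paper) and verifies the uniform lower bound on the singular values of g(·) and the full column rank of ∇h(·):

```python
import numpy as np

def check_assumption_1(g, grad_h, samples, g_m=1e-6):
    """Spot-check items 2 and 3 of Assumption 1 on a list of sampled states.

    g(x) -> (na, na) input matrix; grad_h(x) -> (na, ntau) gradient of h.
    """
    for x in samples:
        # Item 2: every singular value of g(x) bounded below by g_m > 0.
        if np.linalg.svd(g(x), compute_uv=False).min() < g_m:
            return False
        # Item 3: grad_h(x) full column rank, i.e. grad_h^T grad_h nonsingular.
        G = grad_h(x)
        if np.linalg.matrix_rank(G.T @ G) < G.shape[1]:
            return False
    return True

# Hypothetical example: two actuators (na = 2), one virtual control (ntau = 1).
g = lambda x: np.eye(2) * (1.0 + 0.1 * np.tanh(x[0]))    # singular values >= 0.9
grad_h = lambda x: np.array([[1.0], [0.5 + x[1] ** 2]])  # never the zero column

rng = np.random.default_rng(0)
samples = [rng.standard_normal(2) for _ in range(100)]
print(check_assumption_1(g, grad_h, samples))  # True
```

Such a sampled test is of course only a necessary check: it cannot certify the assumption over the whole state space.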
Remark 2. Assumption 1 is very mild if one keeps in mind that system (1) corresponds to actuator dynamics. Intuitively, its items convey the fact that actuator dynamics should be characterized by sufficiently regular (differentiable) functions and that no controllability loss should be possible over any operating range of the actuators. More specifically, item 1 conveys mild regularity assumptions to ensure existence and uniqueness of solutions and for the gradient of item 3 to be well defined. Item 2 reflects the fact that the external input u of the actuating system affects the actuator dynamics in a consistent way throughout the whole operating range. Finally, item 3 corresponds to the requirement that in any operating condition of the actuators, each virtual control in τ can be effectively changed by a variation of at least one of the actuators' states x_a.
In this paper we address the problem of designing a dynamic allocator for actuators (1), operating in feedback from the virtual control τ, with the goal of guaranteeing suitable regulation to a commanded virtual control τ_c ∈ R^{n_τ} while ensuring that, at steady state, the actuator state x_a minimizes a desired (possibly nonlinear) cost function, subject to the constraint that the (higher priority) virtual control assignment task is accomplished. While assuming availability of τ may not be reasonable for some applications, alternative feedback schemes relying on suitable estimates of τ could be envisioned. We believe that such schemes would closely emulate the output feedback solution proposed here and would not add much challenge to the underlying theory; therefore, due to lack of space, they are not pursued here, where the attention is focused primarily on dynamics (1). The issue may, for instance, be circumvented by augmenting the dynamics (1) with an observer that reconstructs τ from accessible information, provided the extended system satisfies Assumption 1. The above goals are formalized next.

Problem 1. Given actuators (1), a smooth cost function x_a → J(x_a), and a regulation performance scalar parameter γ_p > 0, design a controller in feedback from τ such that, for each commanded virtual control τ_c and some non-empty set Ω:

(i) stability: a suitable subset of the manifold where h(x_a) = τ_c is uniformly asymptotically stable;

(ii) setpoint regulation: the closed loop guarantees that:
1. the commanded virtual control is asymptotically tracked from any initial condition in Ω, i.e. lim_{t→+∞} τ(t) = τ_c;
2. the virtual control can be made to obey the linear first-order decentralized dynamics τ̇ = γ_p (τ_c − τ);

(iii) asymptotic optimality: for each τ_c such that the function x_a → J(x_a) restricted to the level set where h(x_a) = τ_c is strictly convex, the actuator state converges to

x_a* := argmin { J(x_a) : h(x_a) = τ_c },    (2)

which is well defined from strict convexity.

Moreover, if Ω coincides with the entire state space, then the controller is said to globally solve the problem.
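For concreteness, the steady-state target of item (iii) in problem (2) is an equality-constrained minimization. A minimal numerical sketch (with a hypothetical output map `h` and cost `J`, chosen only for illustration) is:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: two actuators, one virtual control.
h = lambda x: x[0] + x[1]                            # output map, tau = h(x_a)
J = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2  # strictly convex cost
tau_c = 1.0                                          # commanded virtual control

# x_a* = argmin J(x_a)  subject to  h(x_a) = tau_c, as in (2).
res = minimize(J, x0=np.zeros(2),
               constraints=[{"type": "eq", "fun": lambda x: h(x) - tau_c}])
print(np.round(res.x, 3))  # by symmetry the minimizer is [0.5, 0.5]
```

The allocator proposed in the sequel reaches this point dynamically, without solving the optimization problem online.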
The high-level control algorithm is assumed to be suitably designed, in the sense that it achieves the desired (asymptotic) stability properties, with some intrinsic robustness, within a control scheme similar to the one in Figure 1 but in the absence of a control allocation algorithm. The output of the control allocation algorithm, namely the virtual control τ, which should reproduce the commanded virtual control τ_c, is the input to the low-level controllers of effectors and actuators in the plant. The commanded virtual input must then be interpreted as a set-point value for the low-level actuators of the controlled plant, dictated by a desired steady-state reference for the plant. Moreover, item (ii.1) ensures that the DC gain of the virtual allocator is unitary, while the requirement in (ii.2) corresponds to ensuring that the high-level controller sees a virtual linear first-order decentralized dynamics whose speed can be arbitrarily assigned via the parameter γ_p. In other words, the allocator ensures that its nonlinear dynamics is invisible to the higher-level controllers and assigns a desirable bandwidth to the virtual actuator. Thus, the control allocator can be rendered transparent, from the I/O point of view, to the higher levels of the overall nested control architecture by selecting γ_p sufficiently large, hence preserving the asymptotic stability enforced by the higher-level control block. Finally, item (i) ensures a necessary asymptotic stability of the allocation scheme, and item (iii) ensures that, whenever appropriate, the operating conditions of the nonlinear actuators are optimal according to the cost J, subject to the higher priority regulation constraint.

Static state feedback solution
We discuss in this section a first solution to Problem 1 based on a static state-feedback scheme, under the assumption that the state x a of the actuators (1) is available for measurement. This strong assumption will be removed and replaced by an incremental stability assumption in the next section.
The proposed state feedback controller is given by

u = g(x_a)^{-1} ( −f(x_a) + u_y + u_J ),    (3)

which is well defined under Assumption 1 and comprises two input actions u_y and u_J. The first input u_y ensures that the regulation performance in Problem 1 is met by the control system. The second input u_J takes care of the optimality requirement (minimizing J). To suitably define u_y and u_J, we rely on the gradient ∇h(·) of the output map h(·) and on the following projection operator:

∇⊥h(x_a) := I − ∇h(x_a) ( (∇h(x_a))^T ∇h(x_a) )^{-1} (∇h(x_a))^T,    (4)

which is well defined under Assumption 1. For completeness, a few basic properties concerning the projection operator ∇⊥h(·) are recalled in Fact 1 reported in Section 6. Then, the two input actions in (3) are selected as

u_y = γ_p ∇h(x_a) ( (∇h(x_a))^T ∇h(x_a) )^{-1} (τ_c − τ),    u_J = −γ_J ∇⊥h(x_a) ∇J(x_a),    (5)

with γ_J > 0 a tunable optimization gain. A block diagram representation of the static state feedback controller (3), (5) is shown in Figure 1.
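The key algebraic property behind this construction is that the projection operator (4) annihilates ∇h, so the optimality action cannot perturb the virtual control, while the regulation action yields exactly τ̇ = γ_p (τ_c − τ). A minimal numerical sketch, with hypothetical gradient data and assuming the least-squares form of u_y and u_J used in this section:

```python
import numpy as np

def proj_perp(G):
    """Projection operator (4): I - G (G^T G)^{-1} G^T, for G = grad_h(x_a).
    It maps any vector into the kernel of G^T, so the optimality action
    filtered through it leaves the virtual control tau untouched."""
    return np.eye(G.shape[0]) - G @ np.linalg.solve(G.T @ G, G.T)

# Hypothetical gradient data: three actuators, two virtual controls.
G = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, 3.0]])            # grad_h(x_a), full column rank
gamma_p, gamma_J = 2.0, 0.5
err = np.array([0.3, -0.1])           # regulation error tau_c - tau
grad_J = np.array([1.0, -2.0, 0.5])   # gradient of the cost J at x_a

u_y = gamma_p * G @ np.linalg.solve(G.T @ G, err)  # regulation action
u_J = -gamma_J * proj_perp(G) @ grad_J             # optimality action

# With the cancellation of f and g in (3), x_a' = u_y + u_J, hence
# tau' = G^T (u_y + u_J) = gamma_p (tau_c - tau): u_J is invisible to tau.
print(np.allclose(G.T @ (u_y + u_J), gamma_p * err))  # True
```

The same check also confirms the defining properties of the projection: G^T ∇⊥h = 0 and ∇⊥h ∇⊥h = ∇⊥h.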
With this scheme in place, we can state the next result.

Plant
Contr ollerẋ  by this solution (namely for any initial condition of the plant). This is not the case for the output feedback solution discussed in the next section, where an observer will be employed and the initial conditions of the controller state (namely of the observer dynamics) leading to the exact exponential response of item (ii.2) of Problem 1 will be shown to be the ones corresponding to zero observation error.

Dynamic output feedback solution
The state feedback solution of Section 3 may not be applicable if the state x_a of the actuators (1) is not accessible. Moreover, steady-state errors may be experienced due to uncertainties in the model (1). For this reason, we develop here a dynamic output feedback solution essentially arising from the use of an open-loop observer for the actuator dynamics (1), whose state will be shown to provide a useful integral action on the control loop. For the output feedback solution to be feasible, we need to impose the following incremental stability assumption [1] on dynamics (1), which ensures that the open-loop observer guarantees asymptotic convergence to zero of the observation error. See [1] for a discussion of Lyapunov characterizations of this property.
Assumption 2. System (1) is incrementally stable, namely, denoting by ξ(t, x_0, u) the state response of (1) at time t from the initial condition x_0 under the input u, there exists a function β of class KL such that for any pair of initial conditions x_a(0), x_c(0) and any input u, we have, for all t ≥ 0,

|ξ(t, x_a(0), u) − ξ(t, x_c(0), u)| ≤ β(|x_a(0) − x_c(0)|, t).

Note that, as discussed in [1], whenever the state equation of the dynamics (1) is linear (namely f(x_a) = A_a x_a and g(x_a) = B_a for some matrices A_a, B_a), Assumption 2 is equivalent to the exponential stability of the linear dynamics. Keeping in mind that, in the context of Figures 1 and 2, (1) represents the dynamics of an actuator, while the plant dynamics is stabilized by an outer control loop, this is a reasonable requirement.
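For linear actuator state equations, Assumption 2 thus reduces to a Hurwitz test on A_a. A one-line numerical sketch:

```python
import numpy as np

def is_hurwitz(A):
    """True iff all eigenvalues of A lie in the open left half-plane, i.e.,
    for f(x_a) = A x_a, g(x_a) = B, the linear dynamics are exponentially
    stable and hence incrementally stable (Assumption 2)."""
    return bool(np.all(np.linalg.eigvals(A).real < 0))

print(is_hurwitz(np.array([[-1.0, 0.5], [0.0, -2.0]])))  # True
print(is_hurwitz(np.array([[0.1, 0.0], [0.0, -1.0]])))   # False
```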
In particular, the proposed dynamic control law, generalizing the static law (3), is chosen as

ẋ_c = f(x_c) + g(x_c) u,    (6a)
u = g(x_c)^{-1} ( −f(x_c) + u_y + u_J ),    (6b)
u_y = γ_p ∇h(x_c) ( (∇h(x_c))^T ∇h(x_c) )^{-1} (τ_c − τ),    (6c)
u_J = −γ_J ∇⊥h(x_c) ∇J(x_c),    (6d)

with x_c ∈ R^{n_a} denoting the internal state of the dynamic control law (an open-loop observer of the actuator state x_a), and where ∇⊥h(·) is defined in (4). A block diagram representation of the control system is represented in Figure 2.
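A minimal closed-loop sketch of this observer-based allocator on a toy model can illustrate the behavior. All data below is hypothetical (identity input matrix and a linear output map, chosen so the computations stay transparent); with zero initial observation error the open-loop observer tracks x_a exactly, τ obeys τ̇ = γ_p (τ_c − τ), and the state slides along the level line to the constrained minimizer of J:

```python
import numpy as np

# Toy plant: x_a' = -x_a + u, tau = G^T x_a (linear output map),
# cost J(x_a) = |x_a - [1, 1]|^2; all data hypothetical.
f = lambda x: -x
G = np.array([[1.0], [1.0]])                       # constant gradient of h
h = lambda x: G[:, 0] @ x
grad_J = lambda x: 2.0 * (x - np.array([1.0, 1.0]))
P = np.eye(2) - G @ np.linalg.solve(G.T @ G, G.T)  # projection (4)

gamma_p, gamma_J, tau_c, dt = 2.0, 1.0, 1.0, 1e-3
x_a = np.zeros(2)   # actuator state (not measured)
x_c = np.zeros(2)   # controller state: open-loop observer, x_c(0) = x_a(0)

for _ in range(20000):  # 20 s of explicit Euler integration
    tau = h(x_a)        # only tau is fed back
    u_y = gamma_p * G @ np.linalg.solve(G.T @ G, np.atleast_1d(tau_c - tau))
    u_J = -gamma_J * P @ grad_J(x_c)
    u = -f(x_c) + u_y + u_J           # control law (g = identity here)
    x_a = x_a + dt * (f(x_a) + u)     # plant
    x_c = x_c + dt * (f(x_c) + u)     # open-loop observer copy of the plant

print(np.round(x_a, 3))  # close to [0.5, 0.5], with h(x_a) close to tau_c
```

The final state [0.5, 0.5] is indeed the minimizer of J on the level line h(x_a) = 1, consistent with items (ii) and (iii) of Problem 1.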
When writing the controller dynamics in the form (6) and as shown in Figure 2, a peculiar feature of the controller is highlighted, corresponding to the fact that its action on τ can be interpreted as a nonlinear version of a PI control action.

In the presence of uncertainties or unmodeled dynamics, an open-loop approach to the observer design may not be the best choice in terms of robustness. As a matter of fact, the control scheme can be modified to include a closed-loop observer, trading the nonlinear PI structure described in Remark 4 for robustness to model uncertainties, as detailed in the following remark.
Remark 5. Suppose that we can find an output injection function ℓ(·, ·) and a positive definite, continuously differentiable function certifying a suitable decrease condition of the observation error for all x_c, x_a ∈ R^{n_a}, with u_y, u_J and u as in (6c), (6d) and (6), respectively. Then the control scheme proposed previously can be modified to admit an observer with output injection for the state x_a,

ẋ_c = f(x_c) + g(x_c) u + u_e,    u_e = ℓ(τ, h(x_c)).    (9)

The advantage in the use of the term u_e in (9) is that it may speed up the observer transient when the incremental dynamics is slow, and it may reduce possible estimation errors arising from uncertainties affecting the model (1). However, the drawback arising from the use of the output injection term is that the peculiar PI-like structure highlighted in Remark 4 is destroyed.

Hydrodynamic Dynamometer Application
In this section the performance of the proposed dynamic control law is validated in the case of output feedback for a hydrodynamic dynamometer; see [13] for a more detailed discussion of the actual experimental set-up. In internal combustion engine test benches a hydrodynamic dynamometer is typically employed to reproduce, at its shaft, values of the torque and, in some circumstances, of the speed that the engine would experience under the actual driving conditions. Since in hydrodynamic dynamometers the resulting torque is generated mainly by friction of the rotating shaft with the water contained in the brake itself, its value depends on the water fill level in the working compartment, which is affected by two valves governing the water inflow and outflow, see Figure 3, which is taken from [13]. As a consequence, the same torque may be achieved with different pairs of positions for the inlet and outlet valves. In [13] and [14] a data-driven model of the hydrodynamic dynamometer is described by means of a nonlinear Wiener-type model, whose static nonlinearity has been identified from input/output steady-state measurements and whose linear dynamics result from a subsequent transient analysis. The obtained model thus consists of a linear (asymptotically stable, hence satisfying Assumption 2) time-invariant system followed by a static nonlinear mapping.
The design objective consists in regulating the virtual control of the dynamometer, i.e. the torque, to a desired reference value while achieving at the same time some optimality criterion. The latter can be motivated by several specific goals such as maximizing the inflow of water in the internal chamber, operating the dynamometer in a neighborhood of a desired working point (e.g. mid-ranging [8]) or minimizing the temperature difference between inlet and outlet valves, just to mention a few. Finally, in order to put the application into the right perspective of Figure 1, the high level controller providing τ c is not discussed here and it is typically provided by a simple PI feedback control action.
The actuator is described by equations of the form

ẋ_a = A_a x_a + B_a û,    τ = h(x_a),    (10)

where x_a = [x_{a,1}, x_{a,2}]^T ∈ R^2, û = [û_1, û_2]^T ∈ R^2 and τ ∈ R is the virtual control. Deviating from the theory developed in the previous sections, so as to demonstrate the robustness of the proposed scheme, the inputs û are obtained as the output of an asymptotically stable filter modeling the sufficiently fast electrical dynamics neglected in (10), i.e.

dû/dt = −(1/T) û + (1/T) u,

where u ∈ R^2 denotes the control action to be designed.
Note that A a Hurwitz implies that the nominal dynamics, corresponding to (10) withû = u, satisfies both Assumptions 1 and 2 so that we may apply our construction.
In the following numerical simulations the time constant T is set equal to 50 ms. The inputs u represent the references for the inlet/outlet water valve positions, the states x_a denote the actual valve positions, and the output τ is the resulting torque. The matrices A_a and B_a of system (10) are taken from the identification described in [13]. The memoryless nonlinearity h(·) is assumed to be described by a parametric function of the form [13]

h(x_a) = θ_2 + θ_1 arctan( θ_3 + c^T x_a + (1/2) x_a^T M x_a ).    (11)

Physical constraints (maximal/minimal opening of the valves) limit the action provided by the control input u, which is consequently bounded, namely |u_i| ≤ 5/3, i = 1, 2. The virtual control τ is normalized so that 0 ≤ τ ≤ 1. Note that the presence of the input saturation, which is not invertible, represents a further deviation from the results presented in Sections 3 and 4. To circumvent this issue, following a nonlinear version of the "observer anti-windup" technique of [2], we feed the saturated input signal also to the open-loop observer. The gradient of h(·) follows straightforwardly from equation (11) and is given by

∇h(x_a) = θ_1 (c + M x_a) / ζ(x_a),

with ζ(x_a) := 1 + σ(x_a)^2 := 1 + ( θ_3 + c^T x_a + (1/2) x_a^T M x_a )^2. For the actuator dynamics described by (10), the main objective is the regulation of the virtual control τ to a commanded virtual control τ_c. In the following numerical simulations the cost function J(·) is defined as a weighted sum of two terms, namely

J(x_a) = w_1 J_1(x_a) + w_2 J_2(x_a),

with the weights w_1 = 20, w_2 = 1 specifying the relative importance of the two objectives. In particular, the first term J_1 ensures that the state x_a is steered to the point closest to the working point x_0 = [1, 1]^T on the level line such that τ = τ_c, whereas the second term J_2 guarantees that the state is not driven outside a safety region, the boundaries of which are imposed by the lower and upper bounds x̲ and x̄, respectively.
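The arctan-type nonlinearity (11) and its gradient are easy to code and to cross-check by finite differences; the parameter values below are placeholders (the identified θ, c, M of [13] are not reproduced here):

```python
import numpy as np

# Wiener nonlinearity (11) and its gradient, with placeholder parameters
# (the values of theta, c, M identified in [13] are not reproduced here).
theta1, theta2, theta3 = 0.3, 0.5, -0.2
c = np.array([1.0, -1.0])
M = np.array([[0.2, 0.1],
              [0.1, 0.4]])  # symmetric

sigma = lambda x: theta3 + c @ x + 0.5 * x @ M @ x
h = lambda x: theta2 + theta1 * np.arctan(sigma(x))
grad_h = lambda x: theta1 * (c + M @ x) / (1.0 + sigma(x) ** 2)

# Cross-check the closed-form gradient by central finite differences.
x = np.array([0.7, -0.3])
eps = 1e-6
fd = np.array([(h(x + eps * e) - h(x - eps * e)) / (2 * eps) for e in np.eye(2)])
print(np.allclose(fd, grad_h(x), atol=1e-6))  # True
```

The closed form uses d/dσ arctan(σ) = 1/(1 + σ²) and ∇σ(x_a) = c + M x_a (valid since M is symmetric), matching the expression for ζ(x_a) above.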
It is interesting to note that the steady-state value of the virtual control τ is reached quickly after a change in the desired value τ_c while, after convergence, the state x_a starts sliding along the corresponding level line, driven exclusively by the component u_J, defined in (6d) with γ_J = 0.2, towards the minimum of problem (2). Note that different behaviors can be induced by tuning the relative values of the parameters γ_p and γ_J. In this specific simulation the regulation task is favored (γ_p = 20) with respect to the optimality criterion, generating an aggressive control action, see also Figure 6. In fact, Figure 6 displays the inputs u to the actuator dynamics corresponding to the first step in the commanded virtual control τ_c, namely from 0.2 to 0.5 at time t = 2 s. It can be noted that both inputs are exploited up to their maximum allowable value (they are pushed into saturation by the controller) during the initial transient.

Proof of the main results
The aim of this section is to provide the proofs of the main results of the paper. The employed arguments rely upon the stability results for nonlinear systems of [15], stated there in the more general setting of hybrid systems. To this end, a few additional definitions are provided for completeness.
The interested reader is referred to [15] for more detailed insight and discussion.

Definition 1. A closed set X ⊂ R^n is 1) stable (1) if for each ε > 0 there exists δ > 0 such that x(0) ∈ (X + δB) implies |x(t)|_X ≤ ε for all t ≥ 0; 2) attractive if there exists δ > 0 such that (2)

x(0) ∈ (X + δB) ⇒ lim_{t→+∞} |x(t)|_X = 0;    (14)

3) (locally) asymptotically stable (AS) if it is stable and attractive; moreover, its basin of attraction B_X is the largest set of initial conditions from which all trajectories converge to X; 4) globally attractive if (14) holds for all δ > 0; 5) strongly forward invariant if all solutions starting in X remain in X for all times.

A compact set X_c ⊂ R^n is 1) uniformly attractive from a compact set K ⊂ R^n, K ⊃ X_c, if for each ε > 0 there exists T such that x(0) ∈ K ⇒ |x(t)|_{X_c} ≤ ε, ∀t ≥ T; 2) uniformly (locally) asymptotically stable (UAS) if it is stable and uniformly attractive from each compact subset of its basin of attraction B_X; 3) uniformly globally asymptotically stable (UGAS) if it is UAS with B_X = R^n.

Given a closed forward invariant set Y for ẋ = f(x), each one of the above properties holds relative to the set Y if it holds for initial conditions restricted to Y.

(1) Note that in the case when the set X is compact, the definition at item 1 coincides with the standard definition of (local) stability. Given sets X, Y, we denote X + Y = {z : z = x + y, x ∈ X, y ∈ Y}.
(2) We denote the distance |z|_M of a point z from the set M as |z|_M := inf_{w∈M} |z − w|.

Definition 2.
Given a continuous function f : R^n → R^n and a compact set K ⊂ R^n, solutions (or trajectories) of ẋ = f(x) are uniformly bounded from K if there exists ∆ > 0 such that x(0) ∈ K implies |x(t)| < ∆ for all t ≥ 0. Moreover, given a subset X of R^n, solutions (or trajectories) of ẋ = f(x) are uniformly bounded from X if they are uniformly bounded from each compact subset of X. If X = R^n then trajectories are uniformly globally bounded (UGB).

Proof of Theorems 1 and 2
The following fact recalls well-known properties of the projection operator ∇ ⊥ h(x a ) introduced in (4).