Optimal input design for multibody systems by using an extended adjoint approach

We present a method for optimizing inputs of multibody systems for a subsequently performed parameter identification. Herein, optimality with respect to identifiability is attained by maximizing the information content in measurements described by the Fisher information matrix. For solving the resulting optimization problem, the adjoint system of the sensitivity differential equation system is employed. The proposed approach combines these two well-established methods and can be applied to multibody systems in a systematic, automated manner. Furthermore, additional optimization goals can be added and used to find inputs satisfying, for example, end conditions or state constraints.


Introduction
The problem of optimal input design plays a key role when considering an experiment in order to perform a parameter identification. Poorly planned experiments can cause a waste of time and resources and yield little useful information. The linkage between the experiment and modeling world is called design of experiment (DOE). If the model knowledge is used for designing the experiments, then often the term model-based DOE can be found in the literature [5]. One of the first authors dealing with the topic of designing experiments was R.A. Fisher in his substantial work The Design of Experiments [4]. Although the importance and applicability onto the problem of optimal input design was known only to some extent, many recent papers refer to his work. Fisher stated that the basic problem of DOE is to decide which pattern of factor combination will best reveal the properties of the response B W. Steiner wolfgang.steiner@fh-wels.at 1 and how this response is influenced by these factors. The term optimal input design emerges from the work of Mehra [10,11], who worked on linear discrete-time systems. There, the most important requirement for designing an input was to generate a system output allowing one to determine system parameters featuring a minimum of variance. Morelli [12] developed a method for generating optimal input signals utilizing basic statistics, including the theory of maximum likelihood estimates for parameters. On this basis, Morelli [13] showed the practicability of the method, where system inputs for flight tests with an F-18 HARV (high alpha research vehicle) are determined. Recent publications include different fields of application, for example, Jauberthie et al. [7,8] considered a model of an aerodynamic problem. Therein the ideas of Morelli [12] are used to generate an optimal input for identifying aerodynamic parameters of an aircraft. An extensive work on optimal input design in the field of chemistry and also on DOE in general was done by Franceschini [5]. According to this work, a model-based DOE is characterized by: • The explicit use of the model equations (including any constraint) and current parameters to predict the "information content" of the next experiment (through the evaluation of some suitable objective function), and • the application of an optimization framework to find a numerical solution of the resulting problem.
Another application dealing with optimal input design is that of process control. Chianeh [3] investigated models of tank systems fed by a pump. The goal was to determine the flow exponent used in Bernoulli's law. Keesman [9] considered optimal input design for choosing between model structures or model discrimination. A further topic related to optimal input design is that of optimal sensor placement. In the work by Castro-Triguero [2], several methodologies for computing a minimal set of sensor locations were investigated in order to get the required information for health monitoring of bridge structures. The latter mentioned approaches cannot be applied directly to the model of mechanical systems. Therefore, in this paper, we show how the process of optimal input design can be systematically applied for mechanical systems. As the adjoint method provides outstanding performance in the field of optimal control, this method is used for computing the update direction during the optimal input iteration process. First, a proper performance measure or cost functional for determining optimal inputs is defined via statistical context. For further analysis, the system of sensitivity differential equations is derived, and also the adjoint system of the original system is extended by these new terms.

Theoretical background
For simplicity, the model equations investigated in this work are first-order ordinary differential equations (ODE), and therefore a set of minimal coordinates is used. Nevertheless, it is possible to formulate the entire process for more general differential algebraic equations and therefore for models with redundant coordinates. The model equations can be written aṡ where x(t) ∈ R Nx is the vector of state variables, u(t) ∈ R Nu is the vector of model inputs, and b ∈ R n is the vector of model parameters. In order to compare the simulation result with measurements, a vector of model outputs y(x) is defined such that it matches with measured outputsỹ(t) ∈ R m .

System sensitivity analysis
Analyzing a system reaction to small changes in the system parameters at a special time point t i results in the sensitivity matrix In case of investigating a multibody system, this analysis can be performed by forming the derivatives of the output vector y with respect to the system parameters b. Therefore, Eq. (1) is differentiated with respect to each parameter. In the following, the abbreviations x b j , y x , f b j , and f x are used instead of ∂x ∂b j , ∂y ∂x , ∂f ∂b j , and ∂f ∂x , respectively. Hence, the sensitivity equations readẋ where y b j (t i ) equals the j th column in the output sensitivity matrix S(t i ). In the case of more than one unknown parameter, the system of differential equations for the state variables and for the sensitivities can be written by ⎡ . .
The Jacobianf z of the extended system, which is required for further computations, reads

Maximization of the information content in experimental measurement data
When dealing with optimal input design, first of all, the term "optimality" has to be clarified. As Morelli defined in [12], optimal inputs minimize the parameter standard errors during model parameter estimation with a maximum likelihood estimator. In other words, the information contained in experimental measurement data has to be maximized. Hence, a proper measure, or cost functional, can be constructed by using a norm of the Fisher information matrix M (see [12]): Here, N s is the number of samples taken during the measurement, and R is the discrete noise covariance matrix, which is unknown prior to the optimization process. A very common way is to assume no correlation among the system outputs and moreover that the variance of all system outputs is equal. Using this assumption, R reduces to the identity matrix I.
In [1], several norms or optimality metrics are suggested for optimal input design. Investigating the determinant (D-optimality) or eigenvalues (E-optimality) of M in a cost functional does not allow us to apply straightforward variational calculus. Hence, the so called A-optimality is chosen, which incorporates the trace of M. Moreover, common optimization algorithms search for the minimum of a cost functional J (u). Therefore, the maximization of the information content leads to a cost functional using the negative trace of M.
For further derivations, the cost functional is defined as a continuous function. Instead of forming the sum in Eq. (6), the inner product of columns of S is integrated over time. The resulting cost functional to be minimized then reads

The adjoint method
Recalling that the cost functional in Eq. (7) uses the system sensitivities and hence outputs of the extended system of Eq. (4), optimal input design can be seen as the standard problem of optimal control for the extended system. In previous publications of the authors [14,15], the adjoint method is presented to be the most efficient way to solve such problems. Since the cost functional in Eq. (7) depends on the system sensitivities and therefore on the states z of the extended system in Eq. (4), the cost functional can be defined as follows: The problem is to find control variables u(t) that minimize this function. In order to provide a search direction for the optimization process, the variation of the cost functional with respect to the parameters has to be evaluated. The starting point of the adjoint method is to add the system equations in Eq. (4) to the integrand in Eq. (8). Hence, the extended cost functional reads Since the system equations are satisfied, the actual value of J does not depend on the selection of the functions p(t). Introducing the Hamiltonian Eq. (9) becomes For a given forward solution z(t) of the system equations (4) with control variables u(t) and fixed parameters b, the variation of u about δu results in variations of z(t) about δz(t). This again results in a variation of the functional J about δJ . Considering the first-order terms only, δJ is given by where H u and H z stand for the partial derivatives of H with respect to the vector of system inputs u and states of the extended system z, respectively. In order to avoid the computation of the variations of z, the last term of Eq. (12) is integrated by parts, and therefore the variation of the cost functional reads Herein, the variation δz(0) = 0 is already neglected as the initial conditions are prescribed independently from the actual choice of parameters. Now, in order to eliminate the term multiplied with δz, a system of adjoint equations for the adjoint variables p(t) can be formed.
The adjoint system readsṗ This set of equations may be solved backwards in time since there is only an initial condition at time t = t f . At this point, there is no constraint on the adjoint states at t = t f , and therefore they can be chosen arbitrarily. An option presented in [14] is to add a further term to the cost function, which allows us to consider end conditions for the system states z. This term is commonly denoted as a scrap-function. Using Eq. (14), the variation of J is given by Eq. (13): In order to achieve the largest possible decrease of δJ , δu(t) is chosen in the direction of H T u . Due to nonlinearities in the cost functional, this direction is only valid near the current system input u(t). Therefore, the update has to be done incrementally by using with small numbers κ. Finding a value for κ that minimizes J may be done by applying an optimization scheme such as the classical linesearch algorithm.

Considering model input constraints
In most cases, maximizing the information content in measurements leads to a maximization of the energy put into the system under consideration. Therefore, the system inputs to be optimized have to be constrained in a way that applicable optimization results are generated. One main difficulty is the direct influence of such constraints on the optimization process. They insert further nonlinearities and therefore affect the convergence negatively. The approach chosen in this work is to transform the input in such a way that the transition , the authors in [6] propose to use the function for constraining the input u i . The term s is introduced in order to correct the slope at u i = 0 to ψ i (0) = 1. Another possibility for constraining inputs is to use the arctangent function again with s being the function to correct the slope at u i = 0. In Fig. 1, these two functions are displayed. As we prefer a function with distinct slopes in a wide range, the arctangent function is suggested as a constraining function, although the direct relation between ψ i ≈ u i is only valid in a very small region.

Parameter identification
The purpose of optimal input design is to generate the excitation for a subsequent parameter estimation. Application of the computed optimal input onto the real system leads to an optimal desired trajectory. Therefore, in the following, an approach utilizing the system sensitivities derived in Sect. 2.1 is presented.
In direct comparison to the optimization of inputs in the previous section optimizing, the parameters is less expensive. First of all, a suitable function, measuring the error of the simulation with respect to the experiment, has to be specified. Choosing the root mean square (RMS) error . . .
simplifies further derivations for the optimization procedure. Here, N s is the number of sampling points in the measurement, and Δy is the deviation from simulation data y to measured dataỹ at a time point t i . When arranging the output sensitivities at this time point Performing another differentiation on ∇E RMS and neglecting higher-order terms, the Hessian reads in which Δy T (t i ) is considered as constant in the case of the second term. With the exact solution for ∇E RMS and an approximation for ∇ 2 E RMS , the Newton method may be applied. Performing a sufficient number of iterations the optimal set of parameters can be determined.

Two-mass oscillator
As an introductory example, the relatively simple model of the two-mass oscillator in Fig. 2(a) is analyzed. The mass m 1 is excited by the force F (t), whereas this force is constrained to a maximum amplitude of F max by means of the constraining function presented in Sect. 2.4. Assuming that the position x 2 is measured during an experiment, a time history F (t i ) maximizing the information content with respect to the stiffness c 2 should be computed at discrete time points t i = {t 0 , t 1 , . . . , t f }, with t f = 1 s. For starting the iterative optimization process, the initial input is set to a constant value F 0 (t i ) = 0.5 · F max . The parameter setting used can be found in Fig. 2(b). In order to avoid a motion of the bodies at t > t f , a scrap function mentioned in Sect. 2.3 is specified such that the end velocities are set to v 1 (t f ) = v 2 (t f ) = 0. In Fig. 3(a), the convergence history for the input optimization, and in Fig. 3(b), the resulting constrained input signal is depicted. Due to the linear convergence rate of the gradient method, the convergence history shows quite poor but stable behavior. As this is only an example without a physical realization, the measurements necessary for the parameter identification are generated by simulation using the optimized excitation force F opt and the assumed stiffness coefficientc 2 = 1100 N/m. Due to various mechanisms, real sensor recordings provide biased signals. Therefore, the generated "measurements" are superimposed with zero-mean Gaussian noise. In Fig. 5, these measurements are presented for different standard deviations. For comparison, also the measurement using the initial excitation signal F 0 is displayed, which features smaller amplitudes and obviously leads to uncertainties of the parameter identification process. The velocity plot in Fig. 4 shows that the desired end conditions for v 1 and v 2 are fulfilled in the case of using F opt . Now, the main aim of optimal input design is to improve the quality of the parameter identification result, and as a side benefit, some constraints on the system states and inputs can be regarded. In order to show the advantage of the input generation for the actual example in Fig. 5, the RMS error evaluated for parameter values c 2 nearc 2 is shown. According to the left plot of RMS errors in Fig. 5(a), utilizing F opt does not improve the shape of the cost functional in comparison to F 0 when using the unbiased measurements. The more noisy the sensor recording used for computing E RMS , the more advantageous the optimal input F opt influences the shape of the RMS error used for parameter identification. Since this is only a one-dimensional optimization problem, no real benefit with regard to a speed up of the optimization process can be gained. Advantages may be the increased curvature of E RMS and the possibility to include end conditions on the system outputs.

Cart pendulum system
A system consisting of a translational moving cart and a pendulum mounted at its center of mass is studied next. Figure 6(a) shows the geometric description of the cart pendulum system. The cart is only allowed to move along the x-axis leading to a two-dimensional motion of the pendulum; see again Fig. 6(a). The coordinates chosen to describe the system motion are the cart position x c and the absolute pendulum angle ϕ resulting in the vector of generalized coordinates q = [x c , ϕ] T . Linear friction torque/force is considered for the revolute joint between cart and pendulum and also between ground and the cart, defined by the friction coefficients d p and d c . The distance from the revolute joint to the pendulum's center of gravity is abbreviated by s p . The numeric values used for simulation are defined in the table in Fig. 6(b).
The goal of the optimization is to find the excitation force F (t) that is best suited to generate measurements ϕ(t) allowing the identification of s p and d p . Again, the force F (t) is constrained to the interval [−F max , F max ]. Incorporating a scrap function using ϕ(t f ) = v c (t f ) =φ(t f ) = 0 prevents from movements at t > t f .
In Fig. 7(a), the convergence history for the input optimization is depicted. Figure 7(b) shows the resulting optimized input signal F opt , a signal F comp used for comparison purposes, and the initial signal F 0 . In Fig. 7(b), the signal F comp is chosen as a sine wave with the period T = t f , and the amplitude equals the force constraint |F comp | = F max . The plot of costs over iterations in Fig. 7(a) shows stable behavior and convergence at n = 200 iterations. At about n = 180 iterations, we can observe a jump, which may result from the nonlinearity of the model structure.
Unlike the previous example with only one parameter to identify, two parameters are now searched for. Comparing the error functions thus leads to three-dimensional plots or  contour plots, where we can study differences in shapes for different excitation signals. Although the signals of the considered system output ϕ (see Fig. 8) are comparable in amplitudes for F opt and F comp , the RMS errors differ significantly. In Fig. 9, the contour plots for F opt (a) and F comp (b) are depicted, where both parameters d p and s p are varied in the range of ±10% of the nominal value. The contours represent the values of E RMS at levels that are chosen equally for both plots in Fig. 9(a) and Fig. 9(b). Analyzing the plots leads to three main differences. First, a small rotation of the functional can be detected, where the elliptical contours are more aligned with the coordinate axes in the case of using F opt . Further, a slight compression of the functional can be noticed, which leads to a worse condition of the optimization problem and therefore to poorer convergence for F opt . Finally, the main difference of both functionals is the curvature and hence the decreasing distance of level curves. The RMS error depicted in Fig. 9(b) therefore is flatter and, as explained in Fig. 5 of the previous example, more sensitive to biased signals of the measurements.

Conclusion
The proposed method is mainly based on the assumption of optimality regarding the minimum standard deviation of identified parameters. It further allows us to set end conditions, which prescribe states at the end of an experiment. Looking at the example of the cart pendulum, this results in more robust measurement signals. Even when dealing with biased signals, more accurate parameter estimates are generated. Moreover, we assume that the proposed method can also handle different norms of the Fisher matrix M in order to not only optimize the information content but also the condition of the optimization problem.