Damper model identification using mixed physical and machine-learning-based approach

M. Zilletti; E. Fosco

doi:10.1017/aer.2023.8

Damper model identification using mixed physical and machine-learning-based approach

Part of: 47th RAeS European Rotorcraft Forum

Published online by Cambridge University Press: 03 March 2023

M. Zilletti

and

E. Fosco

Show author details

M. Zilletti*: Affiliation:
Leonardo S.p.a., Via Giovanni Agusta, 520, Cascina Costa, VA, Italy
E. Fosco: Affiliation:
Leonardo S.p.a., Via Giovanni Agusta, 520, Cascina Costa, VA, Italy
*: *Corresponding author. Email: michele.zilletti@leonardocompany.com

Article contents

Abstract
Symbols
Introduction
Lead-lag damper tests
Grey box model
Results
Conclusions
References

Rights & Permissions

Abstract

In this paper, the identification of a time domain model of a helicopter main rotor lead-lag damper is discussed. Previous studies have shown that lead-lag dampers have a significant contribution to the overall aircraft dynamics, therefore an accurate damper model is essential to predict complex phenomena such as instabilities, limit cycles, etc. Due to the inherently nonlinear dynamics and the complex internal architecture of these components, the model identification can be a challenging task. In this paper, a hybrid physical/machine-learning-based approach has been used to identify a damper model based on experimental test data. The model, called grey box, consists of a combination of a white box, i.e. a physical model described by differential equations, and a black box, i.e. regression numerical model. The white box approximates the core physical behaviour of the damper while the black box improves the overall accuracy by capturing the complex dynamic not included in the white box. The paper shows that, at room temperature, the grey box is able to predict the damper force when either a multi-frequency harmonic or a random input displacement is imposed. The model is validated up to 20Hz and for the entire damper dynamic stroke.

Keywords

Lead-lag damper model identification Grey box model identification Damper Nonlinear dynamic model

Type: Research Article
Information: The Aeronautical Journal , Volume 127 , Issue 1314 , August 2023 , pp. 1269 - 1286

DOI: https://doi.org/10.1017/aer.2023.8 [Opens in a new window]
Copyright: © The Author(s), 2023. Published by Cambridge University Press on behalf of Royal Aeronautical Society

Symbols

1.0 Introduction

Lead-lag dampers are essential to guarantee the stability of helicopters as they increase the damping of the blade lead-lag motion [Reference Johnson1]. Due to their significant contribution to the overall aircraft behaviour, an accurate model of their dynamics is essential to predict complex phenomena such as instabilities, limit cycles, etc. [Reference Quaranta, Muscarello and Masarati2, Reference Titurus and Lieven3].

Lead-lag dampers have been widely studied and different architectures, such as hydraulic and elastomeric, to cite the most common ones, have been used in articulated rotors. The presence of the fluid and\or elastomeric materials have shown that the dynamics of these devices are inherently nonlinear and generally dependent on displacement, deformation rate and temperature [Reference Wallace, Wagg, Neild, Bunniss, Lieven and Crewe4–Reference Smith, Lesieutre, Szefi and Marr9]. The classical way to model the damper dynamics is by linearisation using the complex modulus. The damper is forced harmonically over a range of amplitudes, temperatures and frequencies (typically at the lead-lag resonance and\or at the blade passing frequency and higher order harmonics), and the complex modulus for each test condition is calculated from the imposed displacement and measured force signals. The model could be as simple as a look-up table incorporating all the available test conditions, or more complex, involving a set of functions that could be curve-fit to all or part of the data set [Reference Felker, Lau, Mclaughlin and Johnson10]. The approach used in this type of modelling is to calculate the damper’s linearised stiffness and damping for a given steady-state flight condition to perform stability analyses. This approach does not allow capturing the inherent nonlinear behaviour of the damper and is inadequate to describe its transient dynamic response. Additionally, a linearised damper model, used in a helicopter dynamic simulation tool, may lead to inaccurate stability assessments of the aircraft, and it would be unsuitable to predict nonlinear phenomena, such as limit cycles.

An alternative approach is to use a time domain model of the damper (often called the physical model or white box), where differential equations (DE), describing the physical phenomena involved in the damper dynamics, are integrated over time. In this case, the model aims to predict the time history of the damper force for a given input displacement. References [Reference Panda, Mychalowycz and Tarzanin6–Reference Jones, Russell and McGuire8] show some examples of physical models, which use simplified linear and nonlinear lumped parameter elements opportunely connected. The parameters are optimised by minimising the error between the model output and the experimental data. A more advanced model for a hydraulic damper is described by Eyres et al. in Ref. [Reference Eyres, Champneys and Lieven11] where the piston-fluid interaction and the dynamics of relief valves are considered. In general, the accuracy of physical models depends on the complexity of the mathematical model, which, especially for some damper architectures involving both elastomeric materials and hydraulic components, may become prohibitively complicated and impractical to use. Also, these models are specific to the particular damper architecture [Reference Worden, Hickey, Haroon and Adams12].

The approach, considered in this paper, uses a white box model with a limited number of degrees of freedom and physical parameters to obtain a reasonable approximation of the core behaviour of the damper. The accuracy is then augmented with a numerical model (called black box), whose parameters are learnt from the experimental data using a machine learning general approximation approach. As the opposite of the white box, the fitted parameters will not have a physical meaning in this case. The combination of the white and black box is referred to as grey box model [Reference Worden, Barthorpe, Cross, Dervilis, Holmes, Manson and Rogers13].

In this paper, firstly, the general dynamic behaviour of the damper based on the experimental data is described in Section 2. In Section 3 the derivation of the grey box model is described. The section is organised into two sub-sections dedicated to the derivation of the differential equations of the white box model and their parameters optimisation; the description and fitting of the black box model using two machine learning regression approaches, i.e. Gaussian process (GP) and neural network (NN). The model prediction is finally compared against the experimental data and results are presented in Section 4. Conclusions are given in Section 5.

2.0 Lead-lag damper tests

The lead-lag damper under consideration uses the particular characteristics associated with the action of a high-viscosity fluid and the deformation of an elastomeric material. The functionality of the damper can be represented by the simplified scheme in Fig. 1. The movement of the piston forces the high-viscosity fluid through orifices due to the pressure difference generated between the upper and lower chambers.

Figure 1. Simplified scheme of the damper.

A relief circuit, connecting the two chambers, allows the fluid passage if the pressure exceeds a threshold value. This is achieved by the use of relief valves calibrated by setting the preload on the valve springs. The reaction due to the deformation of the elastomeric material is represented by the spring element $K$ .

To gain a better understanding of the damper dynamics and collect the data needed for the model identification, characterisation tests have been carried out. The damper has been tested by grounding one of its terminals and imposing a displacement of the free terminal. Both the displacement and the force time histories have been recorded during the test. The simplest dynamic test consists in imposing a pure sinusoidal displacement signal $x$ given by:

(1)

\begin{align}x = A\sin\!\left( {\omega t} \right)\end{align}

where $\omega $ is the excitation frequency in rad/s and $A$ is the displacement amplitude in mm.

As a preliminary step, the experimental data (i.e. the measured damper force $f$ and the displacement $x$ signals) have been pre-processed. Both signals have been filtered to opportunely remove the high frequency noise with a zero-phase digital filter in order to prevent phase distortions. The filter order has been set to the minimum value that guarantees a good noise reduction minimising the alteration of the high frequency damper dynamics. The velocity $\dot x$ has been then obtained by differentiating the displacement signal using a centred differences filter [Reference Worden and Tomlinson14].

Figure 2 shows the displacement $x$ , the velocity $\dot x$ , and damper force $f$ measured during a laboratory test when a pure sinusoidal displacement, given in Equation (1), is imposed. As can be noticed, the velocity signal is affected by noise. This is due to the differentiation step that amplifies the high frequency noise content. With the reference to A-B-C-D markers annotated in Fig. 2, the dynamic behaviour of the damper can be described as follows:

A: The displacement and the force are at their maximum value and the velocity is almost zero.
A-B: As the displacement and the velocity decrease, the damper force generated by the piston-fluid interaction (the fluid is pushed through the orifices) becomes negative.
B: The relief valves are activated because the fluid pressure exceeds a threshold value.
B-C: Even though the relief valves are activated, (the fluid is free to flow in the relief circuit) the damper force continues to decrease, highlighting an elastic behaviour, until the displacement reaches a minimum.
C: The displacement and the force are at their minimum value and the velocity is zero, thus the fluid pressure becomes smaller than the threshold value so the relief valves are deactivated.
C-D: As the velocity increases the force move from a negative to a positive value.
D: The relief valves open again as the fluid pressure exceeds the threshold value.

The same behaviour can be identified in the force vs displacement and force vs velocity diagrams shown in Fig. 3. The velocity vs force diagram shows the presence of a nonlinear damping component and the finite area enclosed by the curve shows that the dynamic of the damper is characterised by a combination of stiffness and damping components. The presence of a stiffness component can be also noticed in the displacement vs force diagram.

Figure 2. Measured displacement (top), velocity (mid) and force (bottom) acquired in an experimental test at a single frequency.

Figure 3. Force vs displacement (right) and force vs velocity (left) diagrams of the signal in Fig. 2.

The full test campaign of the damper includes input displacement signals at different frequencies and amplitudes. A total of 96 experiments have been carried out. Figure 4 summarises the amplitude and frequency content of the input displacement for each test. The bars in the top plot show the peak-to-peak amplitude while the blue markers represent the mean value; the bottom plot shows the frequency content. As can be noticed from the graph, some of the experiment labels on the x-axis are marked in bold to indicate that they are part of the training set of the black box model, as discussed in the following section. The full test campaign includes input displacement signals characterised by harmonics at the lead-lag frequency, ${\omega _{lag}}$ , the main rotor frequency 1/rev, and two higher order harmonics, 2/rev and 3/rev at different amplitudes within the typical damper operating stroke limits.

Figure 4. Peak-to-peak and mean amplitude (top plot) and frequency (bottom plot) of the input displacement for each experiment; the colorbar shows the amplitude of the harmonic components. The experiment numbers in bold are part of the training data set.

3.0 Grey box model

Figure 5 shows the structure of the grey box model used for the estimation of the damper force starting from the input displacement $x\!\left( t \right)$ and velocity $\dot x\!\left( t \right)$ . The displacement and velocity sample at instant ${t_i}$ are input to the white box to produce an estimate ${\hat f_w}\!\left( {{t_i}} \right)$ of the damper force integrating the differential equation describing the damper dynamics. The past $n$ samples of the displacement ${\left[ {x\!\left( {{t_{i - n}}} \right) \ldots x\!\left( {{t_i}} \right)} \right]^T}$ , velocity signal ${\left[ {\dot x\!\left( {{t_{i - n}}} \right) \ldots \dot x\!\left( {{t_i}} \right)} \right]^T}$ and the white box output ${\left[ {{{\hat f}_w}\!\left( {{t_{i - n}}} \right) \ldots {{\hat f}_w}\!\left( {{t_i}} \right)} \right]^T}$ are passed to the black box to produce the final estimate of the damper force $\hat f\!\left( {{t_i}} \right)$ accounting for the damper dynamic not capture by the white box. In the next subsections, the derivation of the white and black boxes is described.

Figure 5. Grey box model scheme.

3.1 White box model

In this section, the differential equations (DEs) are derived and their parameters are optimised using the quantum particle swarm optimisation (QPSO) algorithm [Reference Sun, Feng and Xu15].

With reference to the scheme in Fig. 1, the pressure difference between the upper and lower chambers $\delta $ , can be described by the following differential equation [Reference Eyres, Champneys and Lieven11]:

(2)

\begin{align}\dot \delta = \frac{{1 + \zeta }}{{\beta \bar V\zeta }}\left[ {{A_p}\dot x - {\textrm{sign}}\!\left( \delta \right)\!\left( {{Q_0} + {Q_V}} \right)} \right],\end{align}

where $\beta $ is the fluid compressibility, $\bar V$ is the chamber’s mean volume and $\zeta $ is the average ratio between chambers. A similar approach has been used in the modelling of an automotive engine mount as described in Ref. [Reference Kyprianou, Giacomin, Worden, Heidrich and Bocking16]. The total fluid volume flow induced by the piston movement ${A_p}\dot x$ , where $\dot x$ is the piston velocity and ${A_p}$ is the piston area, is equal to the fluid flow through the orifices, ${Q_0}$ plus the fluid flow through the relief circuit ${Q_V}$ . The volumetric flow through the orifices ${Q_0}$ is a function of the pressure difference $\delta $ , and assuming a combination of laminar and turbulent flow characteristics, can be expressed as [Reference Eyres, Champneys and Lieven11]:

(3)

\begin{align}{Q_0} = \frac{{ - {D_1} + \sqrt {D_1^2 + 4{D_2}\left| \delta \right|} }}{{2{D_2}}},\end{align}

where ${D_1}$ and ${D_2}$ are the coefficients of laminar and exit loss respectively given by:

(4)

\begin{align}{D_1} = \frac{{128\eta {l_0}}}{{\pi {d^4}\;}},\;\;\;\;\;\;\;\;\;\;\;\;{D_2} = \frac{{{c_{exit}}\rho }}{{2{\pi ^2}{{\left( {\frac{d}{2}} \right)}^4}}},\end{align}

where $\eta $ is the dynamic viscosity, ${l_0}$ and $d$ are the orifice length and diameter respectively, ${c_{exit}}$ is the discharge coefficient and $\rho $ is the fluid density.

The volumetric flow rate through the relief valves ${Q_V}$ is a function of $\delta $ and the valve effective poppet displacement $\;{x_V}$ :

(5)

\begin{align}{Q_V} = R\frac{\gamma }{{1 + \gamma {x_V}}}\;{x_V}\sqrt {\left| \delta \right|},\end{align}

where $R$ and $\gamma $ are the relief valve coefficient loss and the secondary discharge coefficient, respectively. The valves poppet displacements ${x_1}$ and ${x_2}$ , of the two relief valves, one for each flow direction, are modelled as a second-order dynamic system as:

(6)

\begin{align}{m_V}{\ddot x_1} + {c_1}{\dot x_1} + {k_1}\!\left( {{x_1} - {x_{c1}}} \right) = {A_V}\delta,\end{align}

(7)

\begin{align}{m_V}{\ddot x_2} + {c_2}{\dot x_2} + {k_2}\!\left( {{x_2} + {x_{c2}}} \right) = {A_V}\delta,\end{align}

where ${m_v}$ is the valve moving mass and ${A_V}$ is the valve area. Also, ${x_{c1}}$ and ${x_{c2}}$ are the valve pre-compression displacements given by ${x_{c1}} = {F_{c1}}/{k_1}$ and ${x_{c1}} = {F_{c2}}/{k_2}\;$ where ${k_1}$ and ${k_2}$ are the valves spring stiffnesses and ${F_{c1}}$ and ${F_{c2}}$ are the preload forces. Finally, ${c_1}$ and ${c_2}$ are the valve damping coefficients. The effective poppet displacement ${x_v}$ in Equation (5) is given by:

(8)

$$\matrix{ {{x_v} = \left\{ {\matrix{ {\left| {{x_1}} \right|\;\;\;\;{\rm{if}}\;\delta {\rm{ < }} - {{{F_{c1}}} \over {{A_v}}}\;\left( {{\rm{valve}}{\mkern 1mu} \,{\rm{1}}{\mkern 1mu} \,{\rm{open}}} \right)} \cr {\left| {{x_2}} \right|{\mkern 1mu} {\mkern 1mu} {\rm{if}}{\mkern 1mu} \delta {\rm{ > }}{{{F_{c2}}} \over {{A_v}}}{\mkern 1mu} \left( {{\rm{valve}}{\mkern 1mu} \,{\rm{2}}{\mkern 1mu} \,{\rm{open}}} \right)} \cr {{\rm{0}}{\mkern 1mu} {\mkern 1mu} \,\,\,\,\,\,\,\,\,\,{\mkern 1mu} {\mkern 1mu} {\rm{Otherwise}}} \cr } } \right.} \hfill \cr } $$

The mechanism governing the elastomeric material deformation can be quite complex and dependent on environmental conditions such as temperature. In the current model, the elastic component has been approximated by a linear spring element in parallel with the piston. Even though more complicated differential equations may be required, the approach used here is to keep the white box as simple as possible and model the more complicated dynamics using a black box, as will be discussed in the next section. Under this assumption, the overall white box force estimation ${\hat f_w}$ is given by the force generated by the fluid pressure acting on the piston area plus the stiffness component leading to:

(9)

\begin{align}{\hat f_w} = {A_p}\delta \!\left( {\dot x} \right) + Kx\end{align}

where $K$ is the linear stiffness coefficient of the spring element in parallel with the piston. The differential Equation (9) describes the white box model able to provide a prediction of the damper force for an imposed damper displacement and velocity.

3.1.1 White box parameter optimisation

The majority of the physical parameters used to derive the white box can be obtained by design (e.g. mass, dimensions, etc.). However, for some of the parameters (e.g. stiffness, preload of the valves, effective piston area) it is difficult to have an accurate estimate because a direct measurement may be difficult to carry out or the value may be affected by environmental factors such as temperature. To overcome this issue, some of the parameters have been tuned using an optimisation algorithm that minimises the discrepancy between the output of the white box and the measured damper force during the experiments.

The cost function $J\!\left( \xi \right)$ to be minimised is defined as the sum of the $T$ normalised mean squared errors (NMSE) calculated for each $\tau $ experiment:

(10)

\begin{align}J\!\left( \boldsymbol\xi \right) = \mathop \sum \nolimits_{{\tau} = 1}^{\textrm{T}} NMS{E_{\tau} }\!\left( \boldsymbol\xi \right),\end{align}

where $NMS{E_{\tau} }$ is the normalised mean squared error (NMSE) of the ${\tau} $ -th experiment given by:

(11)

\begin{align}NMS{E_{\tau} }\!\left( \boldsymbol\xi \right) = \frac{{100}}{{N\sigma _f^2}}\mathop \sum \nolimits_{i = 0}^N {\!\left( {{f_{e,{\tau} }}\!\left( {{t_i}} \right) - \hat f_{\tau} ^w\!\left( {{t_i},\boldsymbol\xi } \right)} \right)^2},\end{align}

Where ${f_{e,{\tau} }}$ is the recorded experimental force, $\sigma _f^2$ is its variance and $\hat f_{\tau} ^w$ is the output of the white box model. Also, ${t_i}$ and $N$ are the $i$ -th time sample and the total number of samples, respectively. Finally $\boldsymbol\xi = {\left[ {{F_{c1}},{F_{c2}},K,{A_p}} \right]^T}$ is the vector of the selected tuning parameters. Since the time integration is computationally very demanding, it was unpractical to include all the experiments in the calculation of $J$ in Equation (10), therefore only some of the single frequency experiments with zero mean amplitude for a total number of $T = 8$ have been considered (specifically experiments 1, 4, 69, 68, 71, 74, 77 and 80). Finally, the optimal parameter vector ${\boldsymbol\xi _{opt}}$ can be expressed as:

(12)

\begin{align}{\boldsymbol\xi _{opt}} = \arg \mathop {\min }\limits_{\hat{\boldsymbol{\xi}}} J\!\left( \boldsymbol\xi \right)\end{align}

The algorithm used for the optimisation is a quantum-behaved particle swarm optimisation (QPSO) algorithm [Reference Sun, Feng and Xu15]. The QPSO method is based on the same principles as the regular particle swarm [Reference Eberhart and Kennedy17] but the dynamic of each particle is changed from the classic formulation to one where every particle is treated in a quantum manner [Reference Sun, Feng and Xu15]. The PSO algorithm is motivated by the flocking behaviour of flights of birds in search of food. Each particle (bird) $k$ in the population (flock) is represented by a vector of its position and velocity ( ${w_k};\;{\dot w_k}$ ); the update rules for a generation are:

(13)

\begin{align}{\dot w_k} &= {\dot w_k} + {s_1}{H_1}\!\left( {w_k^* - {w_k}} \right) + {s_2}{H_2}\!\left( {{w^*} - {w_k}} \right)\nonumber \\[5pt] {w_k} &= {w_k} + {\dot w_k}\end{align}

where $w_k^*$ is the best position experienced so far by the particle $k$ , measured in terms of the cost function (10), and ${w^*}$ is the best position of any particle. ${s_1}$ and ${s_2}$ are hyper-parameters and ${H_1}$ and ${H_2}$ are randomly generated. In the quantum version of the PSO algorithm, the trajectory of the particles is dependent on, (i) an attractor that is a random combination of the global best and previous best position for each particle, and (ii) a term relating to the particles’ potential fields.

Figure 6 shows the mean cost function value (red dashed line) across 10 particles and the best value (black solid line) during the convergence of the QPSO algorithm. The graph shows that all particles converge to an optimal cost function value after about 17 iterations.

Figure 6. QPSO convergence. Mean value (red dashed line) across 10 particles and best value (black solid line) as a function of the number of iterations.

After the parameters have been optimised, the NMSE has been calculated for all 96 experiments and plotted in Fig. 11 with the blue bars. The plot shows that NMSE is below 9%. Figure 12 shows the measured time histories (blue line) and the output of the white box for experiments 1, 61 and 84 as an example. A comparison of the time histories highlights that the white box is able to capture the overall damper behaviour; however, the prediction can be further improved by using a black box.

3.2 Black box model

The black box is implemented in the nonlinear autoregressive exogenous (NARX) framework, which can be described in the general regression form as:

(14)

\begin{align}{\hat f_i} = {{\Psi }}\!\left( {{{\hat f}_{i - 1}}, \ldots ,{{\hat f}_{i - {n_y}}},\hat f_{i - 1}^w, \ldots ,\hat f_{i - {n_x}}^w,{x_{i - 1}}, \ldots ,{x_{i - {n_x}}},{{\dot x}_{i - 1}}, \ldots ,{{\dot x}_{i - {n_x}}}} \right)\end{align}

i.e. regress the current system output on a range of past system inputs (displacement, $x$ , velocity $\dot x$ , the output of the white box ${\hat f^w}$ ). The function ${{\Psi }}$ represents a numerical model, which in this paper has been derived using two different methods: Gaussian process (GP) and neural network (NN). When fitting a NN regression model is good practice to divide the data into training, validation and test sets. In the first step, the training dataset is used to fit the model parameters. In the second stage, the hyper-parameters are tuned on the validation data set, which provides an unbiased evaluation of the model fit. In this case, the hyper-parameters are the number of past input and output samples ( ${n_x}$ in Equation (14)) plus some other hyper-parameters specific to the particular method used as will be discussed in the next two subsections. It has been chosen not to feedback the previous outputs $y$ ( ${n_y} = 0$ in Equation (14)) to the black box as this may cause the accumulation of the prediction error at each step leading to a divergent output. Finally, the accuracy of the model is evaluated on the test dataset. The train data set, in this case, is composed of about 40% of randomly selected experiments from the 96 available (marked in bold on the x-axis in Fig. 11). Of the selected experiments, only 50% of the samples have been included in the training set. The number of training samples is $1.47 \times {10^4}$ . The validation set is made up of 20% of randomly selected samples from the remaining experiments. The test data set includes all other samples not part of the previous two groups. In the case of GP the Bayesian framework is robust against overfitting therefore only the hyper-parameters have to be learned from the validation set.

3.2.1 Gaussian process

The basic concept of a Gaussian process (GP) is to perform inference over functions directly, instead of inference over parameters of a single function as it is done in a neural network regression, for example. A GP is a distribution over functions, which is conditioned on training data so that the most probable functions are the best fit for the data [Reference Bishop18].

Let $\boldsymbol{X}{{\;}} = {{\;}}{\left[ {{\boldsymbol{x}_{\textbf{{1}}}},{{\;}}{\boldsymbol{x}_{\textbf{2}}}} \ldots {{\;}}{\boldsymbol{x}_{\boldsymbol{N}}} \right]^{\boldsymbol{T}}}$ denotes a matrix of multivariate training inputs where the $i$ -th row is ${\boldsymbol{x}_i} = \left[ {\hat f_{i - 1}^w, \ldots ,\hat f_{i - {n_x}}^w,{x_{i - 1}}, \ldots ,{x_{i - {n_x}}},{{\dot x}_{i - 1}}, \ldots ,{{\dot x}_{i - {n_x}}}} \right]$ in this case and $\boldsymbol{f}$ denote the corresponding output vector of training measured damper forces. The input vector for a testing point will be denoted by the column vector ${\boldsymbol{x}^{{\ast}}}$ and the corresponding unknown output, which we aim to predict, is denoted by ${f^{{\ast}}}$ . It is possible to place a Gaussian process prior over the nonlinear function in other words; the parametric function that fits the data is drawn from the Gaussian process defined as follows:

(15)

\begin{align}f\!\left( \boldsymbol{x} \right) = \mathcal{G}\mathcal{P}\!\left( {\mu \!\left( \boldsymbol{x} \right),\boldsymbol\kappa \!\left( {\boldsymbol{x},\boldsymbol{x}'} \right)} \right)\end{align}

Where $\mu \!\left( \boldsymbol{x} \right)$ is the mean function and $\boldsymbol\kappa \!\left( {\boldsymbol{x},\boldsymbol{x}'} \right)$ is the positive defined covariance matrix.

This GP will now generate many smooth/wiggly functions, and if the function to be predicted falls into this family of functions that GP generates, this is now a sensible way to perform nonlinear regression. The shape of the generated function can be set by choosing the covariance function used to generate the covariance matrix $\boldsymbol\kappa $ , which in this case is chosen to be:

(16)

\begin{align}\boldsymbol\kappa \!\left( {{\boldsymbol{x}_{\boldsymbol{m}}},{\boldsymbol{x}_{\boldsymbol{n}}}} \right) = {\theta ^2}{\textrm{exp}}\!\left( {\frac{1}{{2l_1^2}}{{\!\left( {{\boldsymbol{x}_{\boldsymbol{n}}} - {\boldsymbol{x}_{\boldsymbol{m}}}} \right)}^2}} \right)\end{align}

where $\theta $ and ${l_1}$ are two hyper-parameters to be learnt from the training data.

One of the defining properties of the GP is that the joint probability of two finite sets of data such as the training set and the test set, is Gaussian multivariate. This can be used to condition the test set (unobserved) to the training set (observed) effectively fitting the GP model. Following the Bayesian approach and assuming the prior mean to be zero and a Gaussian noise model on the observation of variance ${\sigma ^2}$ , the joint probability of training and test data can be written as:

(17)

\begin{align}\boldsymbol\kappa \!\left( {f,{f^*}} \right) = \mathcal{N}\!\left( {0,\left[ {\begin{array}{l@{\quad}l}{{\textbf{K}}\!\left( \boldsymbol{X},\boldsymbol{X} \right) + {\sigma ^2}\boldsymbol{I}} & {{\textbf{K}}\!\left( {\boldsymbol{X},{\boldsymbol{x}^*}} \right)}\\[5pt] {\boldsymbol{K}\!\left( {{\boldsymbol{x}^*},\boldsymbol{X}} \right)} & {k\!\left( {{\boldsymbol{x}^*},{\boldsymbol{x}^*}} \right) + {\sigma ^2}}\end{array}} \right]} \right)\end{align}

Where ${\textbf{K}}\!\left(\boldsymbol{X},\boldsymbol{X} \right)$ is the covariance matrix whose $\!\left( {i,j} \right)$ element is given by $\kappa \!\left( \boldsymbol{x}_{\boldsymbol{i}},\boldsymbol{x}_{\boldsymbol{j}} \right)$ , $\boldsymbol{I}$ is the identity matrix, ${\textbf{K}}\!\left( \boldsymbol{X},{\boldsymbol{x}^*} \right)$ is a column vector whose $i$ element is given by ${\textbf{K}}\!\left( \boldsymbol{x}_{\boldsymbol{i}},{\boldsymbol{x}^*} \right)$ and ${\textbf{K}}\!\left( \boldsymbol{x}^*,\boldsymbol{X} \right)$ is its transpose.

To use the above results, it is necessary to derive the probability of the test data conditioned on the train data which can be written as:

(18)

\begin{align}p\!\left( {{f^*}{{|}}\boldsymbol{f}} \right) = \frac{{p\!\left( {\boldsymbol{f},{f^*}} \right)}}{{p\!\left( \boldsymbol{f} \right)}} = \mathcal{N}\!\left( {\mu \!\left( {{\boldsymbol{x}^*}} \right)\!,\kappa \!\left( {{\boldsymbol{x}^*},{\boldsymbol{x}^*}} \right)} \right)\end{align}

Where $m\!\left( {{\boldsymbol{x}^*}} \right)$ and $\kappa \!\left( {{\boldsymbol{x}^*},{\boldsymbol{x}^*}} \right)$ are the posterior mean and posterior variance given by:

(19)

\begin{align}\mu \!\left( {{\boldsymbol{x}^*}} \right){{\;}} = {\textbf{K}}\!\left( {{\boldsymbol{x}^*},\boldsymbol{X}} \right){\!\left( {{\textbf{K}}\!\left( \boldsymbol{X},\boldsymbol{X} \right) + {\sigma ^2}\boldsymbol{I}} \right)^{ - 1}}\boldsymbol{f}\end{align}

And

(20)

\begin{align}\kappa \!\left( {{\boldsymbol{x}^*},{\boldsymbol{x}^*}} \right) = \kappa \!\left( {{\boldsymbol{x}^*},{\boldsymbol{x}^*}} \right) + {\sigma ^2} - {\textbf{K}}\!\left( {{\boldsymbol{x}^*},\boldsymbol{X}} \right){\!\left( {{\textbf{K}}\!\left( \boldsymbol{X},\boldsymbol{X} \right) + {\sigma ^2}\boldsymbol{I}} \right)^{ - 1}}{\textbf{K}}\!\left( \boldsymbol{X},\boldsymbol{x}^{\ast} \right)\end{align}

Equation (19) can be used to calculate the most probable prediction of the unknown damper force ${f^*}$ given the training set while the confidence of the prediction is expressed by the variance in Equation (20). One of the main limitations of Gaussian processes is their scalability to large datasets due to the inversion of the covariance matrix in Equation (19). The stochastic variational Gaussian process (SVGP) has been designed to overcome this limit [Reference Hensman, Fusi and Lawrence19] and scale the GP method to larger data sets. The main idea behind SVGP is to approximate the true GP posterior with a GP conditioned on a smaller inducing test data set. This smaller set of $M$ samples is representative of the full dataset. The benefit of using SVGP is that the complexity of the model reduces from $\;\mathcal{O}\!\left( {{N^3}} \right)$ where $N$ is number of training samples for the classical GP to $\mathcal{O}\!\left( {{M^3}} \right)$ . The accuracy of the method improves as $M$ gets larger at the expense of the computational effort.

The SVGP fitting was carried out using GPflow python package by maximising the evidence lower bound (ELBO) using Adam optimiser [Reference Matthews, Van Der Wilk, Nickson, Fujii, Boukouvalas, León-Villagrá, Ghahramani and Hensman20]. The number of inducing values was set to $M = 200$ to compromise between computational effort and NMSE achieved on the validation data set.

The size of the mini-batch has been set to 1,000 samples by compromising the discrepancy between the ELBO calculated on the full data set and the average ELBO calculated across all the mini-batches and the computational time\memory. The top plot in Fig. 7 shows the computational time taken to calculate the ELBO estimate for 20 mini-batches versus the mini-batch size. The graph shows that as the mini-batch size increases the computation time increases. The bottom plot of Fig. 7 shows the ELBO estimate versus the mini-batch size calculated 20 times. The plot shows that as the dimension of the mini-batch increases the ELBO estimate converges to the ELBO value calculated across the entire data set (ground truth).

Figure 7. Computational time in seconds taken to estimate the ELBO of all mini-batch (top) and the ELBO estimate for each mini-batch vs the mini-batch size (bottom).

The size of the mini-batch has been set to 1,000 as for mini-batch size greater than 1,000 the spread of the ELBO estimate sharply decreases as shown in Fig. 7 and is a good compromise for the computational time.

Figure 8 shows the histogram of the ELBO estimate calculated across 400 mini-batches of size 1,000. Also shown is the ground truth (black line) and the mean ELBO estimate indicating that a good ELBO estimate can be achieved with a mini-batch dimension of 1,000. Finally, Fig. 9 shows the estimated ELBO during the convergence of the SVGP optimisation with an inducing value of 200 and a mini-batch size 1,000.

Figure 8. Histogram of ELBO estimate calculated across 400 mini-batches of size 1,000 samples.

Figure 9. Convergence of SVGP.

3.2.2 Neural network

Neural networks (NN) are commonly used regression models that use parametric forms of base nonlinear functions whose parameter values are optimised during training. There are different types of neural networks, ranging from feed-forward NN to more complex types such as recurrent, convolutional and others. The choice of the NN type depends on the application under consideration. In this study, because the white box provides a well-correlated signal to the measured damper force, it has been observed that a relatively simple architecture, such as feedforward NN, is able to give satisfactory accuracy. The NN architecture (number of neurons and hidden layers) was optimised using the validation data set. A network with a small number of neurons and hidden layers is, in general, not able to model complex nonlinear behaviours and it would result in a high NMSE on the training data set (high bias). On the opposite, a very high number of neurons and hidden layers would over fit the training data but it will not generalise on the test data set (high variance) [Reference Bishop18]. The NN used for this application has one input layer of 128 neurons, three fully connected hidden layers of 64, 32 and 16 neurons, respectively, and a single output neuron. The activation function used is the ReLU function. The results relative to the architecture optimisation are not reported here for brevity. The NN training has been performed using Python Keras [Reference Chollet21].

3.2.3 Black box fitting

To find the optimal number of input lags, the model has been trained on the training data set and the NMSE has been calculated on the validation set. Figure 10 shows the NMSE for the training and validation set as a function of the input lag number for the GP (left plot) and NN (right plot) regressions. In both cases, an increase in the number of lags reduces the NMSE.

Figure 10. NMSE calculated on the train and validation set as a function of the number of input lags for GP (left plot) and NN (right plot).

Figure 11. NMSE (%) for each experiment obtained with the white box (blue), grey box using GP (orange) and NN (green). Training set experiments are marked in bold.

Figure 12. Comparisons damper force time histories of experiment 1 (top plot), 61 (centre plot) and 84 (bottom plot). White box (blue line), grey box using GP (orange line), grey box NN (green line), and measured (red line).

Figure 13. NMSE (%) for each experiment with random input obtained with the white box (blue), grey using NN (green). The model has been trained on multi-harmonic displacement of Section 2.

Figure 14. Comparisons damper force time histories of random input experiment 0 (top plot), 4 (centre plot) and 8 (bottom plot). White box (blue line), grey box using NN (green line) and measured (red line).

The small discrepancy of the NMSE calculated on the training and validation data sets indicates that the regression model is able to generalise the prediction on unseen data and therefore is free of overfitting. The number of lags has been fixed to 15 since, as shown in Fig. 10, after an initial steep reduction the NMSE curve flattens out. Keeping the number of lags as low as possible reduces the training computational effort.

4. Results

After the hyper-parameters have been optimised and the model has been trained, the NMSE has been calculated for each of the 96 experiments.

Figure 11 shows that the NMSE improves the white box prediction bringing the NMSE below 3%. In particular, the NMSE values obtained for the GP (orange bars) and NN (green bars) are comparable. The NN seems to slightly outperform the GP regression and has the advantage to be computationally less demanding.

To better appreciate the improvement obtained with the black box, the damper force prediction obtained with the GP (orange line) and NN (green line) regressions have been plotted in Fig. 12 for three selected experiments. As discussed in Section 3, also the white box prediction (blue line) and the measured force (red line) are plotted for comparison. The plot shows that the use of the black box improves accuracy by predicting the dynamics not captured by the white box model.

4.1 Random input displacement test

To further validate the model, the damper has been tested by imposing a random displacement input of 30 seconds with a cut-off frequency of 20Hz.

Nine tests have been carried out with increasing root mean squared (rms) values of the input signal to cover the full stroke of the damper. Figure 13 shows the NMSE obtained either with white box (blue bars) or grey box (green bars) models. The plot shows that the NMSE of the grey box is below 5%. This result shows that the model gives a good estimate of the damper force also for random input displacements not used in the model fitting (white box parameter optimisation process and black box training), which was done based on harmonic displacements only.

The accuracy of the model improves as the input displacement rms gets larger. This is due to the fact the majority of the tests used for the model fitting were carried out at high displacement amplitude which caused the opening of the relief valves. It should be also mentioned that the test rig showed some limitations when imposing low amplitude displacements due to the limits imposed by the hydraulic actuator of the test rig and the presence of friction phenomena.

Figure 14 shows 3 seconds of the time history measured force (red line), the white box (blue line) and the grey box prediction (green line) for comparison.

5. Conclusions

This paper shows the identification of a time domain dynamic model of a helicopter main rotor lead-lag damper. The lead-lag damper under consideration uses the peculiar characteristics associated to the action of a high-viscosity fluid and the deformation of an elastomeric material, which makes the damper dynamics inherently nonlinear.

The modelling approach used is the so-called grey box model, i.e. a combination of a physical model (white box) and a numerical regression model (black box). The white box, consisting of a set of differential equations, describes the core dynamic of the damper. A QPSO algorithm is used to optimise the parameters of the white box based on experimental test data obtaining an NMSE below 9%. A black box, consisting of a general machine learning approximation approach, is then used to improve the overall model accuracy capturing the complex dynamics not described in the white box. Two regression methods for the fitting of the black box are compared: Gaussian process and neural network. The two methods give similar NMSE however, the NN is the preferred option for its lower computational effort.

The grey box predicts the damper force with a NMSE below 5% for both multi-harmonic and random input displacements.

References

Johnson, W. Rotorcraft Aeromechanics, Cambridge University Press, 2013, Cambridge, UK.CrossRef Google Scholar

Quaranta, G., Muscarello, V. and Masarati, P. Lead-lag damper robustness analysis for helicopter ground resonance, J. Guid. Control Dyn., 2013, 36, (4), pp 1150–1161.CrossRef Google Scholar

Titurus, B. and Lieven, N. Integration of hydraulic lag-damper models with helicopter rotor simulations, J. Guid. Control Dyn., 2010, 33, (1), pp 200–211.CrossRef Google Scholar

Wallace, M.I., Wagg, D.J., Neild, S.A., Bunniss, P., Lieven, N.A.J. and Crewe, A.J. Testing coupled rotor blade–lag damper vibration using real-time dynamic substructuring, J. Sound Vib., 2007, 307, (3), pp 737–754.CrossRef Google Scholar

McGuire, D.P. Fluidlastic dampers and isolators for vibration control in helicopters, Presented at the AHS International Forum 50, May 1994.Google Scholar

Panda, B., Mychalowycz, E. and Tarzanin, F.J. Application of passive dampers to modern helicopters, Smart Mater. Struct., 1996, 5, (5), pp 509–516.CrossRef Google Scholar

Kunz, D.L. Elastomer modelling for use in predicting helicopter lag damper behavior, J. Sound Vib., 1999, 226, (3), pp 585–594.CrossRef Google Scholar

Jones, P.J., Russell, D.D. and McGuire, D.P. Latest developments in fluidlastic^® lead-lag dampers for vibration control in helicopters, Presented at the AHS 59th Annual Forum Proceedings, Phoenix, Arizona, May 2003.Google Scholar

Smith, E.C., Lesieutre, G.A., Szefi, J.T. and Marr, C. Time domain fluidlastic^® lag damper modeling, Presented at the AHS International Forum 62, May 2006.Google Scholar

Felker, F.F., Lau, B.H., Mclaughlin, S. and Johnson, W. Nonlinear behavior of an elastomeric lag damper undergoing dual-frequency motion and its effect on rotor dynamics, J. Am. Helicopter Soc., 1987, 32, (4), pp 45–53.Google Scholar

Eyres, R.D., Champneys, A.R. and Lieven, N.A.J. Modelling and dynamic response of a damper with relief valve, Nonlinear Dyn., 2005, 40, (2), pp 119–147.CrossRef Google Scholar

Worden, K., Hickey, D., Haroon, M. and Adams, D.E. Nonlinear system identification of automotive dampers: A time and frequency-domain analysis, Mech. Syst. Signal Process., 2009, 23, (1), pp 104–126.CrossRef Google Scholar

Worden, K., Barthorpe, R.J., Cross, E.J., Dervilis, N., Holmes, G.R., Manson, G. and Rogers, T.J. On evolutionary system identification with applications to nonlinear benchmarks, Mech. Syst. Signal Process., 2018, 112, pp 194–232.10.1016/j.ymssp.2018.04.001CrossRef Google Scholar

Worden, K. and Tomlinson, G.R. Nonlinearity in Structural Dynamics: Detection, Identification and Modelling, CRC Press, 2019, Florida, US.Google Scholar

Sun, J., Feng, B. and Xu, W. Particle swarm optimization with particles having quantum behavior, Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No.04TH8753), Jun. 2004, pp. 325–331.Google Scholar

Kyprianou, A., Giacomin, J., Worden, K., Heidrich, M. and Bocking, J, Differential evolution based identification of automotive hydraulic engine mount model parameters, Proc. Inst. Mech. Eng. Part J. Automob. Eng., 2000, 214, (3), pp 249–264.CrossRef Google Scholar

Eberhart, R. and Kennedy, J. A new optimizer using particle swarm theory, MHS’95. Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Oct. 1995, pp 39–43.Google Scholar

Bishop, C. Pattern Recognition and Machine Learning, Springer-Verlag, 2006, New York, US.Google Scholar

Hensman, J., Fusi, N. and Lawrence, N.D. Gaussian processes for big data, Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013), arXiv:1309.6835, Jul. 2013.Google Scholar

Matthews, A.G.D.G., Van Der Wilk, M., Nickson, T., Fujii, K., Boukouvalas, A., León-Villagrá, P., Ghahramani, Z. and Hensman, J. GPflow: A Gaussian process library using TensorFlow, J. Mach. Learn. Res., 2017, 18, (40), pp 1–6.Google Scholar

Chollet, F., et al., Keras, https://keras.io , 2015.Google Scholar

Figure 1. Simplified scheme of the damper.

Figure 2. Measured displacement (top), velocity (mid) and force (bottom) acquired in an experimental test at a single frequency.

Figure 3. Force vs displacement (right) and force vs velocity (left) diagrams of the signal in Fig. 2.

Figure 5. Grey box model scheme.

Figure 6. QPSO convergence. Mean value (red dashed line) across 10 particles and best value (black solid line) as a function of the number of iterations.

Figure 7. Computational time in seconds taken to estimate the ELBO of all mini-batch (top) and the ELBO estimate for each mini-batch vs the mini-batch size (bottom).

Figure 8. Histogram of ELBO estimate calculated across 400 mini-batches of size 1,000 samples.

Figure 9. Convergence of SVGP.

Figure 10. NMSE calculated on the train and validation set as a function of the number of input lags for GP (left plot) and NN (right plot).

Figure 11. NMSE (%) for each experiment obtained with the white box (blue), grey box using GP (orange) and NN (green). Training set experiments are marked in bold.

Figure 13. NMSE (%) for each experiment with random input obtained with the white box (blue), grey using NN (green). The model has been trained on multi-harmonic displacement of Section 2.

Article contents

Damper model identification using mixed physical and machine-learning-based approach

Abstract

Keywords

Symbols

1.0 Introduction

2.0 Lead-lag damper tests

3.0 Grey box model

3.1 White box model

3.1.1 White box parameter optimisation

3.2 Black box model

3.2.1 Gaussian process

3.2.2 Neural network

3.2.3 Black box fitting

4. Results

4.1 Random input displacement test

5. Conclusions

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests