Design and stability analysis of semi-implicit cascaded proportional-derivative controller for underactuated cart-pole inverted pendulum system

Changyi Lei; Ruobing Li; Quanmin Zhu

doi:10.1017/S0263574723001352

Published online by Cambridge University Press: 16 October 2023

Changyi Lei: Affiliation:
Department of Informatics, King’s College London, London, UK
Ruobing Li: Affiliation:
Department of Engineering, Design and Mathematics, University of the West of England, Bristol, UK
Quanmin Zhu*: Affiliation:
Department of Engineering, Design and Mathematics, University of the West of England, Bristol, UK
*: Corresponding author: Quanmin Zhu; Email: quan.zhu@uwe.ac.uk

Abstract

This article proposes a control method for underactuated cartpole systems using a semi-implicit cascaded proportional-derivative (PD) controller. The proposed controller is composed of two conventional PD controllers, which stabilize the second-order dynamics of the pole and the cart, respectively. The first PD controller is realized by transforming the pole dynamics into a virtual PD controller, with the coupling term exploited as the internal tracking target for the cart dynamics. The second PD controller then manipulates the cart dynamics to track that internal target. The internal tracking target is obtained by solving a set of equations in a semi-implicit process that exploits the internal dynamics of the system. In addition, the design of the second PD controller relies on the parameters of the first PD controller in a cascaded manner. A stability analysis approach based on the Jacobian matrix is proposed and implemented for this fourth-order system. The proposed method is simple to design and intuitive to comprehend. Simulation results illustrate the superiority of the proposed method over a conventional double-loop PD controller in terms of convergence, and the theoretical analysis concludes at least local asymptotic stability.

Keywords

semi-implicit controller; underactuated system; cascaded PD controller; nonlinear control; cartpole system

Type: Research Article
Information: Robotica, Volume 42, Issue 1, January 2024, pp. 87-117

DOI: https://doi.org/10.1017/S0263574723001352
Copyright: © The Author(s), 2023. Published by Cambridge University Press

1. Introduction

Underactuated systems are systems with more degrees of freedom (DoF) than control inputs. Such systems are common in practice, for example, wheeled robots [Reference Chen, Liu, He, Qiao and Ji1], underwater vehicles [Reference Heshmati-alamdari, Nikou and Dimarogonas2], flexible robot systems [Reference Liu, Zhan, Xing, Wu, Xu and Wu3], etc. One critical advantage of underactuated systems over fully actuated and over-actuated systems is that they cost less and are less complex because they have fewer control inputs. For exactly the same reason, however, the control of underactuated systems is difficult and has been an active research direction. Among underactuated systems, the cartpole system is a classic benchmark that exhibits uncertainty, coupling, nonlinearity, non-minimum phase behaviour, multiple variables and instability, features shared by a majority of other underactuated systems. Research into the cartpole system is therefore of fundamental significance for gaining insight into other system dynamics [Reference Messikh, Guechi and Blai4].

Over the decades, various methods have been proposed for the stabilization of the cartpole system. Some early work linearized the nonlinear cartpole model near the equilibrium and then implemented linear controllers. This simplification usually ensures stability near the equilibrium. Well-known examples include the Proportional-Integral-Derivative (PID) controller and the Linear-Quadratic-Regulator [Reference Banerjee and Pal5–Reference Eizadiyan and Naseriyan7]. Nonetheless, the linearization procedure impairs the accuracy of the dynamics and therefore cannot achieve large-scale stability [Reference Slotine and Li8]. Backstepping has been one of the most widely researched methods for the cartpole control problem. Shao et al. adopted a state-feedback-based backstepping controller for the tracking and switching control of cartpole systems [Reference Shao and Li9]. Jiang et al. proposed an underactuated backstepping method for a class of underactuated systems [Reference Jiangand and Astolfi10]. Compared with conventional backstepping, this method offers a systematic solution for a class of systems. However, the tuning and selection of its control matrices remain an open question. Adaptive and robust control methods have also received attention. An adaptive optimal fuzzy controller based on feedback linearization and sliding mode control was proposed in ref. [Reference Lakmesari, Mahmoodabadi and Ibrahim11] for cartpole systems. A fuzzy logic system and gradient descent were combined to tune the parameters, and a multi-objective optimization algorithm was used to adjust the sliding mode control gain. In ref. [Reference Dao and Liu12], adaptive output-feedback optimal control was combined with integral sliding mode control for a wheeled inverted pendulum under disturbance. The integral sliding mode controller was responsible for finite-time convergence, and adaptive dynamic programming dealt with coupled uncertainties. In ref. [Reference Ordaz and Poznyak13], an adaptive control scheme based on the Adaptive Ellipsoid Method (AEM) was proposed to tune the gain matrices of the observer and controller. Experiments on an underactuated vertical double pendulum with uncertainties illustrated its superiority over the conventional AEM controller. Many other control methods have been applied case by case to cartpole systems, including energy-based control [Reference Kennedy, King and Tran14], state-feedback control [Reference Ranasinghe, Manoharan, Pallegedara and Kodithuwakku15], neural networks [Reference Ratolikar and Kumar16], and so on. In addition, methods targeting other underactuated systems could readily be extended to the cartpole system, for example, event-triggered dynamic surface control [Reference Peng, Jiang and Wang17], fast terminal sliding mode control [Reference Rojsiraphisal, Mobayen, Asad, Vu, Chang and Puangmalai18], etc. Nonetheless, most of the above-mentioned methods are tedious to design, which hinders their application in industry.

One large class of methods that has emerged recently is learning-based control. The common feature of these approaches is that they use a large amount of data to train a controller that optimizes a designed objective function. A milestone in this direction was the work by Google DeepMind, which used deep Q-learning to complete the cartpole control task [Reference Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa, Silver and Wierstra19]. A feedforward neural network was implemented to approximate the Q-values of state-action pairs. Shi et al. combined a type-1 Fuzzy Logic System (FLS) with Reinforcement Learning (RL) to achieve robust cartpole control [Reference Shi, Lam, Xuan and Chen20]. The FLS was implemented as an encoder to cope with the uncertainty of the system, and RL was used to find the optimal policy that minimizes the tracking error. Hiremath et al. applied a deep neural network-based gated-recurrent-unit (GRU) method to the stabilization and tracking problem of a constrained stochastic cartpole system [Reference Hiremath and Bajçinca21]. Nevertheless, this class of methods usually requires a large amount of data to train the model. Besides, the trained black-box models make the inner dynamics intractable.

Therefore, borrowing ideas from the conventional PID controller, this paper proposes a control method for the underactuated cartpole system. The conventional PID controller is simple to design and intuitive to understand. Borrowing ideas from the backstepping method, the work here extends the conventional PID controller to a cascaded version, which helps establish internal control targets. This manipulation increases the order of the controller. In this way, the merits of the PID controller are maintained, while the controller can be implemented directly without linearization. In contrast to the widely accepted backstepping approach, the proposed method does not suffer from exploding terms or complex coordinate transformations. Moreover, while most previous PID research analyses stability through linearization and transfer functions, this paper implements a Jacobian matrix-based stability analysis, which is applicable to any differentiable nonlinear system dynamics. The contributions of this article are summarized as follows:

• Propose a unique cascaded PD controller, which transforms the pole dynamics into a virtual PD controller and uses the coupling term as the design variable for the second PD controller. This model-based method retains the simplicity and intuitiveness of a conventional PID controller while exploiting the system dynamics.
• Introduce a stability analysis method for the fourth-order cascaded PD controller using the Jacobian matrix of the residual system, although it concludes only local asymptotic stability. This presents a novel way to approach stability analysis in this context, with the potential to be used for parameter design.
• Automate complex derivations and design processes through symbolic calculations on a PC, allowing for more efficient design and validation.
• Provide a comprehensive analysis of both the linearized and the original nonlinear models, giving a thorough examination of the proposed method’s applicability.
• Demonstrate in simulation the proposed method’s advantages in stabilizing both the cart and the pole simultaneously, with superior performance over widely used double-loop PD controllers. The robustness against Coulomb friction and random noise is also demonstrated.

The rest of the paper is organized as follows. Section 2 introduces the background. First, the dynamic model of the cartpole system is given in both nonlinear and linear form. Second, the conventional PD controller is presented, which serves as the basis of the proposed method. Section 3 details the design process of the proposed method. The overall framework and workflow are described first. Then, the design process for the linear and nonlinear dynamics is presented, followed by the stability analysis procedures. Section 4 presents the simulation results. The system responses are depicted, and further analysis is carried out using the Jacobian matrix. Section 5 concludes the article and points out promising directions for further research.

2. Preliminaries

2.1. Dynamic model description

This section introduces the structure and dynamics of the cartpole system investigated in this research. Both the nonlinear and the linear versions of the dynamics are presented, and the controller design is carried out on both. The linear model is included to present the method more clearly, because the nonlinear cartpole model is so complicated that its mathematics may distract readers from the workflow of the proposed control method.

Figure 1 illustrates the conceptual structure of the cartpole system. As the name suggests, the system is composed of a cart with a pole connected on top of it. The control goal is to keep the pole upright for as long as possible while keeping the movement of the cart small. The (nonlinear) dynamics of the system can be expressed as follows [Reference Lam and Leung22]. In Eq. (1) and Fig. 1, $u$ is the control force (N), $x_1,x_2$ are the angular position and angular velocity of the pole $(\text{rad}, \text{rad/s})$ , $x_3,x_4$ are the position and linear velocity of the cart $(\text{m, m/s})$ . $l$ is the length of the pole (m), $M_1$ is the mass of the cart (kg), $J$ is the moment of inertia $(\text{kg m}^2)$ , and $M_2$ is the mass of the pole (kg). Besides, $F_0,F_1$ are the friction factors of the cart and the pole, respectively $(\text{N/m/s})$ , and $g$ is the gravitational acceleration. The denominator is $p_2 = (M_1+M_2)(J+M_2l^2)-M_2^2l^2 \cos ^2(x_1)$ , as also stated in Section 3.5.

Figure 1. Conceptual structure of cartpole system.

(1)

\begin{equation} \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2= \dfrac{\left ( \begin{array}{r} -F_1(M_1+M_2)x_2-M_2^2l^2x_2^2 \sin (x_1) \cos (x_1)+F_0M_2lx_4 \cos (x_1)\\ +(M_1+M_2)M_2gl \sin (x_1)-M_2l \cos (x_1)u \end{array}\right )}{p_2}\\ \dot{x}_3 = x_4 \\ \dot{x}_4 = \dfrac{\left ( \begin{array}{r} F_1M_2lx_2 \cos (x_1)+(J+M_2l^2)M_2lx_2^2 \sin x_1 -F_0(J+M_2l^2)x_4\\ -M_2^2gl^2 \sin (x_1) \cos (x_1) +(J+M_2l^2)u \end{array} \right )}{p_2}\\ \end{cases} \end{equation}
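For readers who want to experiment numerically, Eq. (1) can be transcribed directly into a function. The sketch below is only a transcription under stated assumptions: the function name `cartpole_dynamics`, the dictionary layout and the numeric values in `PLACEHOLDER_PARAMS` are illustrative and are not the values of Table II, and $p_2$ is taken from its definition in Section 3.5.

```python
import math

# Placeholder values for illustration only (assumed, NOT the paper's Table II).
PLACEHOLDER_PARAMS = {"M1": 1.0, "M2": 0.1, "l": 0.5, "J": 0.01,
                      "F0": 0.1, "F1": 0.01, "g": 9.81}


def cartpole_dynamics(x, u, p):
    """Right-hand side of Eq. (1) for the state x = (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = x
    M1, M2, l, J = p["M1"], p["M2"], p["l"], p["J"]
    F0, F1, g = p["F0"], p["F1"], p["g"]
    s, c = math.sin(x1), math.cos(x1)
    # p2 as defined in Section 3.5
    p2 = (M1 + M2) * (J + M2 * l**2) - (M2 * l * c) ** 2
    dx2 = (-F1 * (M1 + M2) * x2 - (M2 * l * x2) ** 2 * s * c
           + F0 * M2 * l * x4 * c + (M1 + M2) * M2 * g * l * s
           - M2 * l * c * u) / p2
    dx4 = (F1 * M2 * l * x2 * c + (J + M2 * l**2) * M2 * l * x2**2 * s
           - F0 * (J + M2 * l**2) * x4 - M2**2 * g * l**2 * s * c
           + (J + M2 * l**2) * u) / p2
    return (x2, dx2, x4, dx4)


# A small tilt with no input: gravity accelerates the pole away from upright.
print(cartpole_dynamics((0.1, 0.0, 0.0, 0.0), 0.0, PLACEHOLDER_PARAMS))
```

With zero input, the upright rest state returns zero derivatives, which is the unstable equilibrium the controller is designed around.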

To simplify the controller design, the model is frequently linearized near the equilibrium location, namely when $x_1 \approx 0, x_3 \approx 0$ . Therefore, the following approximations hold:

(2)

\begin{align} \sin (x_1)& \approx x_1 \end{align}

(3)

\begin{align} \cos (x_1) &\approx 1 \end{align}

(4)

\begin{align} x_2^2 \approx x_4^2 & \approx 0 \end{align}

Substituting these approximations into Eq. (1), a linearized model of the cartpole is derived:

(5)

\begin{equation} \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = Ax_1 + Bx_2 + Cx_4 + Du \\ \dot{x}_3 = x_4 \\ \dot{x}_4 = Ex_1 + Fx_2 + Gx_4 + Hu \\ \end{cases} \end{equation}

where

(6)

\begin{align} A & = \frac{(M_1+M_2)M_2gl}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(7)

\begin{align} B & = \frac{-F_1(M_1+M_2)}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(8)

\begin{align} C & = \frac{F_0M_2l}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(9)

\begin{align} D & = -\frac{M_2l}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(10)

\begin{align} E & = \frac{-M_2^2gl^2}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(11)

\begin{align} F & = \frac{F_1M_2l}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(12)

\begin{align} G & = \frac{-F_0(J+M_2l^2)}{J(M_1+M_2)+M_1M_2l^2}\end{align}

(13)

\begin{align} H & = \frac{J+M_2l^2}{J(M_1+M_2)+M_1M_2l^2}\end{align}
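The coefficients in Eqs. (6)-(13) share one denominator, so they are convenient to compute together. The following is a minimal sketch; the function name `linear_coefficients` and the parameter values used in the call are assumptions for illustration, not the paper's Table II values.

```python
def linear_coefficients(M1, M2, l, J, F0, F1, g):
    """Return (A, B, C, D, E, F, G, H) of Eqs. (6)-(13)."""
    den = J * (M1 + M2) + M1 * M2 * l**2   # common denominator
    A = (M1 + M2) * M2 * g * l / den
    B = -F1 * (M1 + M2) / den
    C = F0 * M2 * l / den
    D = -M2 * l / den
    E = -(M2 * l) ** 2 * g / den
    F = F1 * M2 * l / den
    G = -F0 * (J + M2 * l**2) / den
    H = (J + M2 * l**2) / den
    return A, B, C, D, E, F, G, H


# Illustrative placeholder parameters (assumed values).
coeffs = linear_coefficients(M1=1.0, M2=0.1, l=0.5, J=0.01,
                             F0=0.1, F1=0.01, g=9.81)
print(coeffs)
```

The common denominator equals $p_2$ evaluated at $x_1=0$, which is a quick consistency check between the linear and nonlinear models.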

The cartpole system is evidently highly nonlinear. Trigonometric functions of high order appear in both the numerators and the denominators. Besides, the coupling effect is significant: $x_1,x_2,x_4$ and $u$ influence the dynamics of both the cart and the pole. Therefore, the control problem of the cartpole system is challenging and of fundamental importance in the control field.

2.2. Conventional PD controller

Figure 2 shows the conceptual structure of a conventional PD controller. $R$ is the reference signal, $e(t)$ is the error at time $t$ , $U$ is the control signal and $Y$ is the output. The core modules of a PD controller are the proportional and derivative modules, and its mathematical expression is [Reference Chen23]

(14)

\begin{equation} U=k_pe(t)+k_d\frac{de(t)}{dt}, \end{equation}

where $k_p,k_d$ are the proportional and derivative gains. The PD controller is a simplified version of the PID controller, which is widely adopted in industry [Reference Tomei24].

Figure 2. Conceptual structure of conventional PD controller.
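In a sampled implementation, the derivative in Eq. (14) has to be approximated. The sketch below uses a backward difference, which is one common choice and an assumption here rather than something the paper specifies; the function name `pd_control` and its arguments are hypothetical.

```python
def pd_control(reference, measurement, prev_error, kp, kd, dt):
    """Discrete form of Eq. (14): U = kp*e + kd*de/dt (backward difference).

    Returns the control signal and the current error, so that the caller
    can pass the error back in as prev_error at the next sample.
    """
    error = reference - measurement
    derivative = (error - prev_error) / dt
    return kp * error + kd * derivative, error


# One step with a constant error: only the proportional part contributes.
u, e = pd_control(reference=1.0, measurement=0.0, prev_error=1.0,
                  kp=2.0, kd=0.5, dt=0.01)
print(u, e)
```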

3. Controller design and analysis

3.1. Controllability analysis around equilibrium point

In this section, the controllability of the system near the unstable equilibrium point is analysed, which serves as the foundation for controller design. Consider the system dynamics (5), whose system matrices are:

(15)

\begin{align} Y & = \left [ \begin{array}{cccccccccc} 0 & & & 1 & && 0 & && 0 \\[3pt] A & && B & && 0 & && C \\[3pt] 0 & && 0 & && 0 & && 1 \\[3pt] E & && F & && 0 & && G \end{array} \right ] \end{align}

(16)

\begin{align} Z & = [0,D,0,H]^T \end{align}

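According to the controllability theorem, if the matrix $[Z,YZ,Y^2Z,Y^3Z]$ has full rank, then the system is controllable at the equilibrium point. Substituting the system parameters in Table II, the controllability matrix is verified to have full rank. It is printed below in transposed form, with rows $Z^T$ , $(YZ)^T$ , $(Y^2Z)^T$ and $(Y^3Z)^T$ , which does not affect the rank.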

(17)

\begin{equation} \left [ \begin{array}{cccccccccc} 0&&&-4.55&&&0&&&1.82 \\[3pt] -4.55&&&24.93&&&1.82&&&-2.40 \\[3pt] 24.93&&&-275.04&&&-2.40&&&23.92 \\[3pt] -275.04&&&2246.79&&&23.92&&&-196.00 \end{array} \right ] \end{equation}
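The full-rank claim can be checked numerically from the printed entries of Eq. (17). The sketch below uses a small Gaussian elimination routine written for this purpose (the helper name `matrix_rank` and the tolerance are assumptions); the entries are copied as printed, so they carry two-decimal rounding.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a small dense matrix by Gaussian elimination with partial pivoting."""
    m = [list(map(float, r)) for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) <= tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
        if rank == n_rows:
            break
    return rank


# Entries copied from Eq. (17), rounded to two decimals as printed.
CONTROLLABILITY = [
    [0.0, -4.55, 0.0, 1.82],
    [-4.55, 24.93, 1.82, -2.40],
    [24.93, -275.04, -2.40, 23.92],
    [-275.04, 2246.79, 23.92, -196.00],
]
print(matrix_rank(CONTROLLABILITY))
```

For these entries the routine reports rank 4, in agreement with the statement that the matrix has full rank.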

3.2. Framework overview

Figure 3 shows the conceptual framework of the proposed semi-implicit cascaded PD controller design. The original fourth-order cartpole system is treated as two coupled second-order systems. This paper uses “subplant1” to denote the pole dynamics, namely $x_1,x_2$ , and “subplant2” to denote the cart dynamics, namely $x_3,x_4$ . A direct reading of the proposed method is to use one PD controller for each subplant. It is expected that if both subplants can be stabilized separately, the overall system can be stable. Nonetheless, because of the coupling inside the model, a direct realization of this idea yields unsatisfactory performance. Besides, while the reference signal for the first PD controller, “PD1,” is given, the tracking target for the second PD controller, “PD2,” is not available and must be determined in some way. The proposed method solves these problems and makes a cascaded PD controller feasible in this situation. This is achieved by (1) transforming the subplant1 dynamics into an equivalent virtual PD controller, with the coupling term from subplant2 treated as a design variable, and (2) deriving the desired tracking target for subplant2 in a semi-implicit manner from the coupling term together with the feedback linearization design of PD2. Finally, a cascaded PD controller can be implemented with the coupling effect exploited and resolved.

Figure 3. Framework of semi-implicit cascaded PD controller design.

The workflow of the proposed controller is as follows. First, the reference signal $X_{1d}$ is input into PD1, where the desired virtual torque for subplant1 is computed. The coupling term from subplant2 is transformed into a design variable. In this way, the coupling term represents the desired target for subplant2, $X_{4d}$ , which makes the dynamics of subplant1 behave like a PD-controlled system and thus stabilizes subplant1. At this point, the control $u$ has not been calculated, so $X_{4d}$ cannot be calculated explicitly and requires further information from subplant2. Focusing on subplant2 only, a PD controller with feedback linearization can be easily designed. Combining the controller expression for subplant2 with that from subplant1 gives a set of equations, which is solved to obtain the expressions for $u$ and $X_{4d}$ . $X_{4d}$ is then fed into PD2 to complete the control of subplant2. Notice that before the equation set is solved by Gaussian elimination [Reference Higham25], $X_{4d}$ appears on both sides of the equations, which makes the process semi-implicit, resembling the semi-implicit Euler integration method [Reference Deng and Liu26]. Besides, PD1 is designed and utilized on top of PD2 during the control process, therefore forming a cascaded relationship.

3.3. Semi-implicit cascaded PD controller design for linear approximated model

This section illustrates the design process of the proposed method using the linearized version of dynamic model (5). The purpose of using a linearized simple model is to make the derivation process tractable, thus enabling a clearer presentation of the idea and rationale. The design process based on original nonlinear model (1) will be given in Section 3.5.

Based on Eq. (5), the dynamics for two subplants can be written explicitly as:

(18)

\begin{equation} \text{subplant1}: \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = Ax_1 + Bx_2 + Cx_4 + Du \\ \end{cases} \end{equation}

(19)

\begin{equation} \text{subplant2}: \begin{cases} \dot{x}_3 = x_4 \\ \dot{x}_4 = Ex_1 + Fx_2 + Gx_4 + Hu \\ \end{cases} \end{equation}

In Eq. (18), the coupling term from subplant2 is $Cx_4$ , which will be used as design variable. Borrowing ideas from a serial integrator under the control of a PD controller, if the following equation always holds,

(20)

\begin{equation} Ax_1 + Bx_2 + Cx_4 + Du = k_{p1}e_1 + k_{d1}e_2, \end{equation}

then there must exist parameters $k_{p1}, k_{d1}$ that make subplant1 stable. Here, $k_{p1}, k_{d1}$ are the proportional and derivative gains of the PD1 controller, and $e_1=-x_1,e_2=-x_2$ are the angle and angular velocity errors of subplant1. With this assumption satisfied, Eq. (18) becomes an ordinary second-order system controlled by a PD controller as follows:

(21)

\begin{equation} \text{subplant1}: \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = k_{p1}e_1 + k_{d1}e_2 \\ \end{cases}. \end{equation}

Noticing that the coupling term $x_4$ can be controlled through the first-order dynamics in Eq. (19), Eq. (20) is rewritten as

(22)

\begin{equation} Ax_1 + Bx_2 + Cx_{4d} + Du = k_{p1}e_1 + k_{d1}e_2, \end{equation}

and that

(23)

\begin{equation} x_{4d}=\frac{k_{p1}e_1 + k_{d1}e_2 - Ax_1 - Bx_2 - Du}{C}, \end{equation}

where $x_{4d}$ defines the desired tracking target for subplant2. With the tracking target defined, attention shifts to subplant2, where a PD2 controller with feedback linearization is designed as

(24)

\begin{equation} u=\frac{1}{H} ({-}Ex_1 - Fx_2 - Gx_4 + k_{p2}e_3 + k_{d2}e_4), \end{equation}

where $k_{p2}, k_{d2}$ are the proportional and derivative gains of the PD2 controller, and $e_3,e_4$ are the position and velocity errors of subplant2, chosen as

(25)

\begin{equation} e_3 = x_{3d} - x_3 = (x_3+\Delta t \cdot x_{4d})-x_3 + x_r-x_3= \Delta t \cdot x_{4d} + x_r-x_3 \end{equation}

(26)

\begin{equation} e_4 = - x_4 \end{equation}

in which $x_r$ is a user-defined constant target for the cart position. Notice that $x_{4d}$ is a velocity signal. To implement a PD controller, it must be converted to a position signal, hence the manipulation in Eq. (25), where $\Delta t$ represents the sampling time. With that done, the information of $x_{4d}$ is fully reflected in $e_3$ , which allows the assignment in Eq. (26). In other words, $x_{4d}$ is used to construct discrete position targets, with the reference velocity of the cart being $0$ . The subplant2 dynamics now become a second-order system controlled by the PD2 controller with a reference signal related to $x_1,x_2$ :

(27)

\begin{equation} \text{subplant2}: \begin{cases} \dot{x}_3 = x_4 \\ \dot{x}_4 = k_{p2}(\Delta t \cdot x_{4d}+x_r-x_3)-k_{d2}x_4 \\ \end{cases}. \end{equation}

Up to now, neither $u$ nor $x_{4d}$ is explicitly expressed. Combining Eqs. (23) and (24) to eliminate $u$ gives

(28)

\begin{equation} x_{4d}=\frac{-k_{p1}x_1 - k_{d1}x_2 - Ax_1 - Bx_2 - \frac{D}{H} ({-}Ex_1 - Fx_2 - Gx_4 + k_{p2}\Delta t \cdot x_{4d} + k_{p2}(x_r-x_3)- k_{d2}x_4)}{C}. \end{equation}

Consequently, the expression for $x_{4d}$ is derived:

(29)

\begin{equation} x_{4d}=\frac{-(k_{p1}-\frac{DE}{H}+A)x_1 - (k_{d1}-\frac{DF}{H}+B)x_2 - \frac{D}{H}k_{p2}(x_r-x_3) + \frac{D( G+ k_{d2})}{H} x_4}{C+\frac{D}{H}k_{p2}\Delta t}. \end{equation}

Similarly, the full expression for $u$ is available by substituting Eq. (29) into Eq. (24).
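The two-step computation (first $x_{4d}$ from Eq. (29), then $u$ from Eq. (24)) can be sketched as a single function. This is an illustrative sketch, not the authors' implementation: the function name `semi_implicit_pd_linear`, the argument layout, and all numeric values in the usage lines are assumptions.

```python
def semi_implicit_pd_linear(x, coeffs, gains, dt, x_r=0.0):
    """Return (x4d, u) for the linearized model, state x = (x1, x2, x3, x4).

    coeffs = (A, B, C, D, E, F, G, H) from Eqs. (6)-(13);
    gains  = (kp1, kd1, kp2, kd2); dt is the sampling time.
    The denominator C + (D/H)*kp2*dt must be nonzero.
    """
    x1, x2, x3, x4 = x
    A, B, C, D, E, F, G, H = coeffs
    kp1, kd1, kp2, kd2 = gains
    r = D / H
    num = (-(kp1 - r * E + A) * x1 - (kd1 - r * F + B) * x2
           - r * kp2 * (x_r - x3) + r * (G + kd2) * x4)
    x4d = num / (C + r * kp2 * dt)                              # Eq. (29)
    e3 = dt * x4d + x_r - x3                                    # Eq. (25)
    e4 = -x4                                                    # Eq. (26)
    u = (-E * x1 - F * x2 - G * x4 + kp2 * e3 + kd2 * e4) / H   # Eq. (24)
    return x4d, u


# Placeholder coefficients and gains (assumed values, for illustration only).
coeffs = (15.0, -0.3, 0.14, -1.4, -0.7, 0.014, -0.1, 0.97)
gains = (20.0, 5.0, 5.0, 3.0)
print(semi_implicit_pd_linear((0.1, -0.2, 0.05, 0.3), coeffs, gains, dt=0.01, x_r=0.2))
```

By construction, the returned pair satisfies Eq. (22) and makes $\dot{x}_4$ equal to $k_{p2}e_3+k_{d2}e_4$, which is a useful sanity check when transcribing the formulas.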

Remark 1 (Joint stability): If subplant1 and subplant2 can be stabilized separately, the whole system would be expected to be stable. Indeed, it can be proved that system (21) and (27) can be stabilized separately [Reference Zhao and Guo27] under natural assumptions. However, a joint analysis is still required to ensure stability of the fourth-order system, which will be detailed in Sections 3.4 and 3.6.

Remark 2 (Cascaded PD controller): The cascaded PD controller in this paper is different from the conventional one. A conventional cascaded PD controller works on adjacent orders of the system, for example, one PD controller assigns the desired velocity and the other PD controller controls the acceleration [Reference Andrade, Guedes, Carvalho, Zachi, Haddad, Almeida, de Melo and Pinto28]. In contrast, the PD1 controller in this paper serves as the acceleration controller for subplant1 as well as the calculator of the reference signal for subplant2, and the PD2 controller is the acceleration controller for subplant2.

Remark 3 (Semi-implicitness): The proposed cascaded PD controller is semi-implicit on two levels. On the shallower level, $x_{4d}$ appears on both sides of Eq. (28), which means an unknown term is used to calculate the same unknown term. This is similar to the semi-implicit Euler method for numerical integration. On the deeper level, the semi-implicitness arises because two sources of information are used to derive $x_{4d}$ . The first is the transformation of subplant1 into a virtual PD controller, and the second is the PD controller design for subplant2. Both separate design processes finally point to the same target, $x_{4d}$ .

Remark 4 (Transforming PI to PD): In Eq. (25), the velocity target for $x_4$ is transformed into a position target for $x_3$ using the discrete approximation $x_{4d}\Delta t$ , so that a PD controller can be implemented for subplant2. A more frequently used controller for a first-order system is the Proportional-Integral (PI) controller. In this paper, however, this handling is chosen for the consistency and simplicity of the stability analysis. The challenge in implementing a PI controller instead is how to formulate a unified stability analysis together with the PD controller.

3.4. Stability analysis using Jacobian matrix for linear approximated model

Following the controller design procedure of Section 3.3, this section presents the stability analysis method for the whole system. Although the model here is linearized, a Jacobian matrix-based method is used instead of the common transfer function approach, for wider applicability. The error variables of the system are constructed, along with their dynamics. The Jacobian matrix is then calculated. By analysing the eigenvalues and eigenvectors of the Jacobian matrix, the stability of the system can be determined.

The error variables of the linear approximated model (5) are

(30)

\begin{equation} \begin{cases} e_1 = -x_1 \\ e_2 = -x_2 \\ e_3 = \Delta t \cdot x_{4d} + x_r - x_3, \quad x_{4d} = \dfrac{-(k_{p1}-\frac{DE}{H}+A)x_1 - (k_{d1}-\frac{DF}{H}+B)x_2 - \frac{D}{H}k_{p2}(x_r-x_3) + \frac{D( G+ k_{d2})}{H} x_4}{C+\frac{D}{H}k_{p2}\Delta t} \\ e_4 = -x_4 \end{cases} \end{equation}

Differentiating (30) and replacing all state variables using $e_i,i=1,2,3,4$ :

(31)

\begin{equation} \begin{cases} \dot{e}_1 & = e_2 \\ \dot{e}_2 & = -\dot{x}_2 \\ & =-Ax_1-Bx_2-Cx_4-Du \\ & = \big(A-\frac{DE}{H}\big)e_1+\big(B-\frac{DF}{H}\big)e_2-\frac{Dk_{p2}}{H}e_3+\big(C-\frac{DG}{H}-\frac{Dk_{d2}}{H}\big)e_4\\ \dot{e}_3 &= \Delta t \cdot \dot{x}_{4d} + e_4\\ \dot{e}_4 & = -\dot{x}_4 \\ & = -Ex_1-Fx_2-Gx_4-Hu \\ & = -k_{p2}e_3-k_{d2}e_4\\ \end{cases} \end{equation}

In the following, the term $\dot{e}_3$ is handled separately because of its complexity. Rewriting Eq. (28), the unsolved form of Eq. (29), using $e_i,i=1,2,3,4$ :

(32)

\begin{equation} x_{4d}=\frac{\big(k_{p1}-\frac{DE}{H}+A\big)e_1 + \big(k_{d1}-\frac{DF}{H}+B\big)e_2 -\frac{D}{H}k_{p2}e_3 - \frac{D( G+ k_{d2})}{H} e_4}{C} \end{equation}

Differentiating $x_{4d}$ and substituting Eq. (31):

(33)

\begin{equation} \begin{cases} \dot{x}_{4d} &= \big [ \big(k_{p1}-\frac{DE}{H}+A\big)e_2 + \big(k_{d1}-\frac{DF}{H}+B\big)\dot{e}_2 - \frac{D}{H}k_{p2}\dot{e}_3 - \frac{D( G+ k_{d2})}{H} \dot{e}_4 \big ]/ C \\[4pt] &=\big [ a_1e_1 + a_2e_2 +a_3e_3 + a_4e_4 \big ]/ \big (C+\frac{D}{H}k_{p2}\Delta t \big ) \\[4pt] a_1 &= \big(B-\frac{DF}{H}+k_{d1}\big)\big(A-\frac{DE}{H}\big) \\[4pt] a_2 &= \big(B-\frac{DF}{H}\big)\big(B-\frac{DF}{H}+k_{d1}\big)+A-\frac{DE}{H}+k_{p1} \\[4pt] a_3 &= \frac{D(G+k_{d2})}{H}k_{p2}-\big(k_{d1}-\frac{DF}{H}+B\big)\frac{Dk_{p2}}{H} \\[4pt] a_4 &= \frac{Dk_{d2}(G+k_{d2})}{H}+\big(k_{d1}-\frac{DF}{H}+B\big)\big(C-\frac{DG}{H}-\frac{Dk_{d2}}{H}\big)-\frac{D}{H}k_{p2}\\ \end{cases} \end{equation}
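The change of denominator from $C$ to $C+\frac{D}{H}k_{p2}\Delta t$ follows from the semi-implicit structure: $e_3$ in Eq. (32) itself contains $x_{4d}$ . As a short worked step, using only Eqs. (31) and (32), differentiating Eq. (32) and inserting $\dot{e}_1=e_2$ and $\dot{e}_3=\Delta t \cdot \dot{x}_{4d}+e_4$ gives

\begin{align*} C\dot{x}_{4d} &= \big(k_{p1}-\tfrac{DE}{H}+A\big)e_2 + \big(k_{d1}-\tfrac{DF}{H}+B\big)\dot{e}_2 - \tfrac{D}{H}k_{p2}\big(\Delta t \cdot \dot{x}_{4d}+e_4\big) - \tfrac{D(G+k_{d2})}{H}\dot{e}_4 \\ \Rightarrow \big(C+\tfrac{D}{H}k_{p2}\Delta t\big)\dot{x}_{4d} &= \big(k_{p1}-\tfrac{DE}{H}+A\big)e_2 + \big(k_{d1}-\tfrac{DF}{H}+B\big)\dot{e}_2 - \tfrac{D}{H}k_{p2}e_4 - \tfrac{D(G+k_{d2})}{H}\dot{e}_4 \end{align*}

Substituting $\dot{e}_2$ and $\dot{e}_4$ from Eq. (31) then leaves a linear combination of $e_1,\ldots ,e_4$ on the right-hand side.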

Substituting back into Eq. (31) yields the final expression of $\dot{e}_3$ :

(34)

\begin{equation} \dot{e}_3=\Delta t\Big [ a_1e_1 + a_2e_2 +a_3e_3 + a_4e_4 \Big ]/ \Big (C+\frac{D}{H}k_{p2}\Delta t \Big ) + e_4 \end{equation}

Denote the vector field of Eq. (31) as $F_{\text{linear}}(e_1,e_2,e_3,e_4)$ , that is,

(35)

\begin{equation} F_{\text{linear}}(e_1,e_2,e_3,e_4)= \left [ \begin{array}{c} e_2 \\[4pt] \big(A-\frac{DE}{H}\big)e_1+\big(B-\frac{DF}{H}\big)e_2-\frac{Dk_{p2}}{H}e_3+\big(C-\frac{DG}{H}-\frac{Dk_{d2}}{H}\big)e_4 \\[4pt] \Delta t\big [ a_1e_1 + a_2e_2 +a_3e_3 + a_4e_4 \big ]/ \big (C+\frac{D}{H}k_{p2}\Delta t \big ) + e_4 \\[4pt] -k_{p2}e_3-k_{d2}e_4 \\ \end{array} \right ] \end{equation}

Then, the Jacobian matrix can be derived easily, denoting $p=C+\frac{D}{H}k_{p2}\Delta t$ :

(36)

\begin{equation} DF_{\text{linear}}(e_1,e_2,e_3,e_4)= \left [ \begin{array}{cccccccccc} 0 &&& 1 &&& 0 &&& 0 \\[4pt] A-\frac{DE}{H} &&& B-\frac{DF}{H} &&& -\frac{Dk_{p2}}{H} &&& C-\frac{DG}{H}-\frac{Dk_{d2}}{H} \\[4pt] \Delta t a_1/ p &&& \Delta t a_2/ p &&& \Delta t a_3/ p &&& \Delta t a_4/ p + 1 \\[4pt] 0 &&& 0 &&& -k_{p2} &&& -k_{d2} \\ \end{array} \right ] \end{equation}

It is noticeable that $[e_1,e_2,e_3,e_4]=[0,0,0,0]$ is a fixed point of the system (35). The system is locally asymptotically stable around the fixed point if all the eigenvalues of Eq. (36) have negative real parts, since each such eigenvalue corresponds to an exponentially decaying term in the time domain. However, the explicit solution for the eigenvalues of Eq. (36) is too complex, and the Jacobian therefore serves better as an examiner of stability through numerical calculation.
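One way to carry out such a numerical examination without computing the eigenvalues themselves is the Routh-Hurwitz criterion applied to the characteristic polynomial. This is a substitute technique chosen for the sketch, not the procedure used in the paper; the helper names `char_poly` and `is_hurwitz_quartic` are hypothetical, and the matrices in the usage lines are toy examples rather than Eq. (36) with real parameters.

```python
def char_poly(a):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(sI - a) via Faddeev-LeVerrier."""
    n = len(a)
    coeffs = [1.0]
    m = [[0.0] * n for _ in range(n)]
    for k in range(1, n + 1):
        # M_k = a @ M_{k-1} + (previous coefficient) * I
        am = [[sum(a[i][t] * m[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        m = [[am[i][j] + (coeffs[-1] if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        # next coefficient = -trace(a @ M_k) / k
        trace = sum(sum(a[i][t] * m[t][i] for t in range(n)) for i in range(n))
        coeffs.append(-trace / k)
    return coeffs


def is_hurwitz_quartic(a):
    """True if all eigenvalues of the 4x4 matrix a have negative real parts."""
    _, c3, c2, c1, c0 = char_poly(a)
    return (c3 > 0 and c0 > 0 and c3 * c2 - c1 > 0
            and c1 * (c3 * c2 - c1) - c3 ** 2 * c0 > 0)


# Toy matrices with known eigenvalues, used only to exercise the test.
stable = [[-1.0, 0, 0, 0], [0, -2.0, 0, 0], [0, 0, -3.0, 0], [0, 0, 0, -4.0]]
unstable = [[-1.0, 0, 0, 0], [0, -2.0, 0, 0], [0, 0, -3.0, 0], [0, 0, 0, 1.0]]
print(is_hurwitz_quartic(stable), is_hurwitz_quartic(unstable))
```

In practice, the numeric entries of Eq. (36) for a given parameter and gain set would be passed in place of the toy matrices. The strict inequalities mean that marginal cases (eigenvalues on the imaginary axis) are reported as not stable.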

3.5. Semi-implicit cascaded PD controller design for nonlinear model

Following the same design procedure as in Section 3.3, the proposed method is applied to the nonlinear model (1) in this section. The derivation for the nonlinear cartpole model is far more complicated than for its linear counterpart. Therefore, only the necessary derivation is completed by hand, for example, the derivatives of the error vectors, and SymPy [Reference Meurer, Smith, Paprocki, Čertík, Kirpichev, Rocklin, Kumar, Ivanov, Moore, Singh, Rathnayake, Vig, Granger, Muller, Bonazzi, Gupta, Vats, Johansson, Pedregosa, Curry, Terrel, Roučka, Saboo, Fernando, Kulal, Cimrman and Scopatz29] is then used to complete the final symbolic and numerical calculations.

Splitting Eq. (1) into two subplants as follows, denoting $p_2 = (M_1+M_2)(J+M_2l^2)-M_2^2l^2 \cos ^2(x_1)$ :

(37)

\begin{equation} \text{subplant1}: \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2= \dfrac{\left ( \begin{array}{r} -F_1(M_1+M_2)x_2-M_2^2l^2x_2^2 \sin (x_1) \cos (x_1)+F_0M_2lx_4 \cos (x_1)\\ +(M_1+M_2)M_2gl \sin (x_1)-M_2l \cos (x_1)u \end{array} \right )}{p_2}\\ \end{cases} \end{equation}

(38)

\begin{equation} \text{subplant2}: \begin{cases} \dot{x}_3 = x_4 \\ \dot{x}_4= \dfrac{\left ( \begin{array}{r} F_1M_2lx_2 \cos (x_1)+(J+M_2l^2)M_2lx_2^2 \sin x_1 -F_0(J+M_2l^2)x_4\\ -M_2^2gl^2 \sin (x_1) \cos (x_1) +(J+M_2l^2)u \end{array} \right )}{p_2}\\ \end{cases} \end{equation}

By transforming the dynamics of subplant1 into a virtual PD controller, and assigning the coupling term $x_4$ as the reference signal for subplant2, the following can be derived

(39)

\begin{equation} k_{p1}e_1+k_{d1}e_2= \dfrac{\left ( \begin{array}{r} -F_1(M_1+M_2)x_2-M_2^2l^2x_2^2 \sin (x_1) \cos (x_1)+F_0M_2lx_{4d} \cos (x_1)\\ +(M_1+M_2)M_2gl \sin (x_1)-M_2l \cos (x_1)u \end{array}\right )}{p_2} \end{equation}

where $k_{p1},k_{d1}$ are the proportional and derivative gains of PD1 controller, and $e_1=-x_1,e_2=-x_2$ are the angular and angular velocity errors of the pole dynamics. If this equation holds, subplant1 is equivalent to Eq. (21). Thus, the expression of $x_{4d}$ can be initially given:

(40)

\begin{equation} x_{4d}= \dfrac{\left ( \begin{array}{r} p_2k_{p1}e_1+p_2k_{d1}e_2 + F_1(M_1+M_2)x_2+M_2^2l^2x_2^2 \sin (x_1) \cos (x_1)\\ - (M_1+M_2)M_2gl \sin (x_1) + M_2l \cos (x_1)u \end{array} \right )}{F_0M_2l\cos (x_1)} \end{equation}

Turning to subplant2, a PD2 controller is designed with feedback linearization

(41)

\begin{equation} u= \dfrac{\left ( \begin{array}{r} -F_1M_2lx_2 \cos (x_1)-(J+M_2l^2)M_2lx_2^2 \sin (x_1) +F_0(J+M_2l^2)x_4\\ +M_2^2gl^2 \sin (x_1) \cos (x_1)+p_2(k_{p2}e_3+k_{d2}e_4)\end{array} \right )}{J+M_2l^2}, \end{equation}

where $e_3,e_4$ are the position and velocity errors of the cart, defined as

(42)

\begin{equation} e_3 = x_{3d} - x_3 = (x_3+\Delta t \cdot x_{4d})-x_3 +(x_r-x_3)= \Delta t \cdot x_{4d} +(x_r-x_3) \end{equation}

(43)

\begin{equation} e_4 = - x_4 \end{equation}

so that the original subplant2 becomes a second-order system manipulated by a PD controller as in Eq. (27). Substituting Eq. (41) into Eq. (40) and solving the semi-implicit equation of $x_{4d}$ , the closed-form expression becomes:

(44)

\begin{equation} x_{4d}= \dfrac{\left ( \begin{array}{r} p_2k_{p1}e_1+p_2k_{d1}e_2 + F_1(M_1+M_2)x_2- (M_1+M_2)M_2gl \sin (x_1) + \frac{M_2l \cos (x_1)}{J+M_2l^2} \big [ {-}F_1M_2lx_2 \cos (x_1)\\ +F_0(J+M_2l^2)x_4+ M_2^2gl^2 \sin (x_1) \cos (x_1)+p_2k_{d2}e_4 + p_2k_{p2}(x_r-x_3) \big ] \end{array} \right )}{F_0M_2l\cos (x_1)-\frac{p_2M_2l \cos (x_1)}{J+M_2l^2}k_{p2}\Delta t} \end{equation}

The explicit expression for $u$ is then obtained by substituting Eq. (44), together with Eqs. (42) and (43), back into Eq. (41).
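A minimal numerical sketch of Eqs. (41), (42) and (44) is given below. The plant parameters and gains are placeholders (the values of Tables II–IV are not reproduced here), and the function names are illustrative only. The sketch evaluates $x_{4d}$ and $u$ for one state and checks that the result satisfies Eq. (39) and turns subplant2 into $\dot{x}_4=k_{p2}e_3+k_{d2}e_4$ .

```python
import math

# Placeholder plant parameters (assumed values, not those of Table II).
M1, M2, l, J = 1.0, 0.1, 0.5, 0.01
F0, F1, g = 0.1, 0.01, 9.81
JT = J + M2 * l**2  # shorthand for J + M2*l^2


def p2(x1):
    """Common denominator p_2 of Eqs. (37)-(38)."""
    return (M1 + M2) * JT - (M2 * l * math.cos(x1)) ** 2


def pole_accel(x, x4_used, u):
    """Right-hand side of the x2 equation in Eq. (37), with x4 set to x4_used."""
    x1, x2, _, _ = x
    s, c = math.sin(x1), math.cos(x1)
    num = (-F1 * (M1 + M2) * x2 - (M2 * l * x2) ** 2 * s * c
           + F0 * M2 * l * x4_used * c
           + (M1 + M2) * M2 * g * l * s - M2 * l * c * u)
    return num / p2(x1)


def cart_accel(x, u):
    """Right-hand side of the x4 equation in Eq. (38)."""
    x1, x2, _, x4 = x
    s, c = math.sin(x1), math.cos(x1)
    num = (F1 * M2 * l * x2 * c + JT * M2 * l * x2**2 * s - F0 * JT * x4
           - (M2 * l) ** 2 * g * s * c + JT * u)
    return num / p2(x1)


def cascaded_pd(x, x_r, kp1, kd1, kp2, kd2, dt):
    """Return (u, x4d, e3) from Eqs. (41), (42) and (44)."""
    x1, x2, x3, x4 = x
    s, c = math.sin(x1), math.cos(x1)
    p = p2(x1)
    e1, e2, e4 = -x1, -x2, -x4
    # Eq. (44): closed-form solution of the semi-implicit equation for x4d.
    bracket = (-F1 * M2 * l * x2 * c + F0 * JT * x4
               + (M2 * l) ** 2 * g * s * c
               + p * kd2 * e4 + p * kp2 * (x_r - x3))
    num = (p * kp1 * e1 + p * kd1 * e2 + F1 * (M1 + M2) * x2
           - (M1 + M2) * M2 * g * l * s + M2 * l * c / JT * bracket)
    den = F0 * M2 * l * c - p * M2 * l * c / JT * kp2 * dt
    x4d = num / den
    e3 = dt * x4d + (x_r - x3)  # Eq. (42)
    # Eq. (41): feedback-linearizing PD2 law.
    u = (-F1 * M2 * l * x2 * c - JT * M2 * l * x2**2 * s + F0 * JT * x4
         + (M2 * l) ** 2 * g * s * c + p * (kp2 * e3 + kd2 * e4)) / JT
    return u, x4d, e3


# Example state and placeholder gains (assumed, not those of Table IV).
x = (0.5, 0.3, 0.1, -0.2)
kp1, kd1, kp2, kd2, dt = 25.0, 8.0, 80.0, 85.18, 0.02
u, x4d, e3 = cascaded_pd(x, 0.5, kp1, kd1, kp2, kd2, dt)

# Residuals of Eq. (39) and of the PD2 closed loop; both should be close to 0.
res_pole = pole_accel(x, x4d, u) - (kp1 * (-x[0]) + kd1 * (-x[1]))
res_cart = cart_accel(x, u) - (kp2 * e3 + kd2 * (-x[3]))
print(res_pole, res_cart)
```

Note that the closed form divides by the denominator of Eq. (44), so a practical implementation would presumably need to guard against states and gains for which that denominator approaches zero.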

3.6. Stability analysis using Jacobian matrix for nonlinear model

This section provides the stability analysis for the system and controller designed in Section 3.5. Compared with the stability analysis of the linear model in Section 3.3, the derivation here is too laborious to complete by hand within a reasonable time. For example, computing $\dot{x}_{4d}$ requires differentiating composite functions with trigonometric terms in both the numerator and the denominator. Deriving the partial derivatives of $\dot{x}_{4d}$ w.r.t. the error variables is more challenging still. Thus, we first derive by hand all the components needed for the final result and then complete the final symbolic and numerical evaluations by computer.

The error vector for the nonlinear model (1) is

(45)

\begin{equation} \begin{cases} e_1 = -x_1 \\ e_2 = -x_2 \\ e_3 = \Delta t \cdot x_{4d} + (x_r-x_3) \\ e_4 = -x_4 \end{cases} \end{equation}

Differentiating Eq. (45) and replacing all state variables using $e_i,i=1,2,3,4$ :

(46)

\begin{equation} \begin{cases} \dot{e}_1 & = e_2 \\ \dot{e}_2 & = -\dot{x}_2 \\ & = \dfrac{\left ( \begin{array}{r} -F_1(M_1+M_2)e_2-M_2^2l^2e_2^2 \sin (e_1) \cos (e_1)+F_0M_2le_4 \cos (e_1)\\ +(M_1+M_2)M_2gl \sin (e_1)+M_2l \cos (e_1)u \end{array} \right )}{p_2}\\ \dot{e}_3 &= \Delta t \cdot \dot{x}_{4d} + e_4 \\ \dot{e}_4 & = -\dot{x}_4 \\ & = -k_{p2}e_3-k_{d2}e_4\\ \end{cases} \end{equation}

In the following, the term $\dot{e}_3$ is treated separately because of its computational complexity. Rewriting Eq. (44) in terms of $e_i,i=1,2,3,4$ , with the $k_{p2}e_3$ term kept on the right-hand side, gives

(47)

\begin{equation} x_{4d}=\frac{A_2}{B_2}, \end{equation}

where $A_2$ and $B_2$ are the numerator and denominator, respectively, expressed as below:

(48)

\begin{align} A_2 = & p_2k_{p1}e_1+p_2k_{d1}e_2 - F_1(M_1+M_2)e_2+ (M_1+M_2)M_2gl \sin (e_1) + \frac{M_2l \cos (e_1)}{J+M_2l^2} \big [ F_1M_2le_2 \cos (e_1) \nonumber\\ &\qquad\qquad\qquad-F_0(J+M_2l^2)e_4- M_2^2gl^2 \sin (e_1) \cos (e_1)+p_2(k_{p2}e_3+k_{d2}e_4) \big ] \end{align}

(49)

\begin{equation} B_2 = F_0M_2l\cos (e_1) \end{equation}

Firstly, the term $p_2 = (M_1+M_2)(J+M_2l^2)-M_2^2l^2 \cos ^2(x_1)$ must be processed. It is rewritten using the error variables of Eq. (45) and then differentiated, using $\dot{e}_1=e_2$ .

(50)

\begin{equation} p_2 = (M_1+M_2)(J+M_2l^2)-M_2^2l^2 \cos ^2(e_1) \end{equation}

(51)

\begin{equation} \dot{p}_2 = 2M_2^2l^2 \cos (e_1) \sin (e_1) e_2 \end{equation}

So the derivative of $A_2$ is

(52)

\begin{equation} \begin{split} \dot{A}_2&=\dot{p}_2k_{p1}e_1+p_2k_{p1}e_2 +\dot{p}_2k_{d1}e_2+p_2k_{d1}\dot{e}_2-F_1(M_1+M_2)\dot{e}_2+(M_1+M_2)M_2gl\cos (e_1)e_2 \\ &\quad+\frac{M_2l}{J+M_2l^2}\Big [ F_1M_2l \big (\dot{e}_2\cos ^2(e_1)-2e_2 \cos (e_1) \sin (e_1)e_2 \big ) -F_0(J+M_2l^2) \big (\dot{e}_4\cos (e_1) -e_4 \sin (e_1)e_2 \big ) \\ &\quad-M_2^2gl^2 \big ({-}\sin (e_1)e_2\frac{\sin (2e_1)}{2}+\cos (e_1) \cos (2e_1)e_2 \big ) + k_{d2} \big ( (\dot{p}_2e_4+p_2 \dot{e}_4) \cos (e_1)-p_2e_4 \sin (e_1)e_2 \big ) \\ &\quad+ k_{p2} \big ( (\dot{p}_2 e_3+p_2(\Delta t \dot{x}_{4d} + e_4)) \cos (e_1)-p_2e_3 \sin (e_1)e_2 \big ) \Big ]. \end{split} \end{equation}

Noticeably, $\dot{x}_{4d}$ appears again in $\dot{A}_2$ , which means one more implicit equation to solve in order to calculate $\dot{x}_{4d}$ . Similarly, the derivative of $B_2$ is

(53)

\begin{equation} \dot{B}_2=-F_0M_2l \sin (e_1) e_2 \end{equation}

Finally, the derivative of $x_{4d}$ follows from the quotient rule

(54)

\begin{equation} \dot{x}_{4d} = \frac{\dot{A}_2B_2-\dot{B}_2A_2}{B^2_2}. \end{equation}

Since $\dot{A}_2$ contains the term $p_3\dot{x}_{4d}$ , with $p_3$ given in Eq. (56), Eq. (54) is implicit in $\dot{x}_{4d}$ . Solving it gives the final equation:

(55)

\begin{equation} \dot{x}_{4d} = \frac{ (\dot{A}_2-p_3 \dot{x}_{4d})B_2-\dot{B}_2A_2}{B^2_2-p_3B_2} \end{equation}

(56)

\begin{equation} p_3 = \frac{M_2l}{J+M_2l^2}k_{p2}p_2\Delta t \cos (e_1) \end{equation}
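The step from Eq. (54) to Eq. (55) can be checked symbolically. The sketch below uses SymPy with abstract symbols rather than the full expressions; `R` is a name introduced here for the part of $\dot{A}_2$ that does not contain $\dot{x}_{4d}$ , so that $\dot{A}_2 = R + p_3\dot{x}_{4d}$ , and `y` stands for $\dot{x}_{4d}$ .

```python
import sympy as sp

# Abstract symbols: A2, B2 and the derivative of B2 are treated as opaque
# quantities; y is the unknown \dot{x}_{4d}.
A2, B2, dB2, R, p3, y = sp.symbols("A2 B2 dB2 R p3 y")

# Eq. (54) with \dot{A}_2 = R + p3*y written out explicitly.
implicit = sp.Eq(y, ((R + p3 * y) * B2 - dB2 * A2) / B2**2)

# The equation is linear in y, so solve() returns a single solution.
(solution,) = sp.solve(implicit, y)

# Eq. (55), where (\dot{A}_2 - p3*\dot{x}_{4d}) is exactly R.
closed_form = (R * B2 - dB2 * A2) / (B2**2 - p3 * B2)

# The difference should simplify to zero if the two forms agree.
difference = sp.simplify(solution - closed_form)
print(difference)
```

Because the unknown enters linearly, no iteration is needed; the same elimination is what makes the expression for $x_{4d}$ in Eq. (44) available in closed form.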

Denote the vector field of Eq. (46) as $F_{\text{nonlinear}}(e_1,e_2,e_3,e_4)$

(57)

\begin{equation} F_{\text{nonlinear}}(e_1,e_2,e_3,e_4)= \left [ \begin{array}{c} e_2 \\ \dfrac{\left ( \begin{array}{r} -F_1(M_1+M_2)e_2-M_2^2l^2e_2^2 \sin (e_1) \cos (e_1)+F_0M_2le_4 \cos (e_1)\\ +(M_1+M_2)M_2gl \sin (e_1)+M_2l \cos (e_1)u \end{array} \right )}{p_2} \\ \Delta t \dot{x}_{4d} + e_4\\ -k_{p2}e_3-k_{d2}e_4 \\ \end{array} \right ] \end{equation}

Then, the Jacobian matrix takes the following form:

(58)

\begin{equation} DF_{\text{nonlinear}}(e_1,e_2,e_3,e_4)= \left [ \begin{array}{cccccccccc} 0 && & 1 &&& 0 &&& 0 \\[3pt] \displaystyle{\frac{\partial \dot{e}_2}{\partial e_1} }&&& \displaystyle{\frac{\partial \dot{e}_2}{\partial e_2} } &&& \displaystyle{\frac{\partial \dot{e}_2}{\partial e_3} } &&& \displaystyle{\frac{\partial \dot{e}_2}{\partial e_4} } \\[8pt] \displaystyle{\frac{\partial \dot{e}_3}{\partial e_1} } & &&\displaystyle{\frac{\partial \dot{e}_3}{\partial e_2} } & &&\displaystyle{\frac{\partial \dot{e}_3}{\partial e_3} } &&& \displaystyle{\frac{\partial \dot{e}_3}{\partial e_4} } \\[8pt] 0 &&& 0 &&& -k_{p2} &&& -k_{d2} \\ \end{array} \right ] \end{equation}

In Eq. (58), $\frac{\partial \dot{e}_i}{\partial e_j},i=2,3;j=1,2,3,4$ are the partial derivatives of the corresponding terms w.r.t. the error variables. They are too complicated to derive by hand and are instead computed symbolically by computer. Note that $[e_1,e_2,e_3,e_4]=[0,0,0,0]$ is a fixed point of the system (46). If all eigenvalues of Eq. (58), evaluated at this fixed point, have negative real parts, then the system is locally asymptotically stable around the fixed point.
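The symbolic workflow described above can be illustrated on a much smaller problem. The sketch below is not the system of Eq. (46): it uses a hypothetical two-state error field with a pendulum-like term and a PD law, and the symbols `a`, `k_p`, `k_d` are placeholders. It shows how SymPy's `Matrix.jacobian` produces the Jacobian, how it is evaluated at the fixed point, and how the eigenvalue test is applied.

```python
import sympy as sp

e1, e2 = sp.symbols("e1 e2", real=True)
a, kp, kd = sp.symbols("a k_p k_d", positive=True)

# Toy error dynamics (hypothetical, NOT Eq. (46)): a pendulum-like term
# a*sin(e1) counteracted by a PD law. The origin is a fixed point.
F = sp.Matrix([e2, a * sp.sin(e1) - kp * e1 - kd * e2])

# Symbolic Jacobian, the two-state analogue of Eq. (58).
DF = F.jacobian([e1, e2])

# Evaluate at the fixed point [e1, e2] = [0, 0].
DF0 = DF.subs({e1: 0, e2: 0})

# Eigenvalues for placeholder numbers a = 1, k_p = 5, k_d = 2.
eigs = DF0.subs({a: 1, kp: 5, kd: 2}).eigenvals()

# Local asymptotic stability requires every real part to be negative.
stable = all(sp.re(ev) < 0 for ev in eigs)
print(DF0)
print(eigs)
print(stable)
```

For the full fourth-order system the same three steps apply (build the vector field, call `jacobian`, substitute the fixed point), only with far larger expressions.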

Remark 5 (Parameters selection): This paper does not provide an analytical range of parameters that stabilizes the system, which remains a potential research direction. However, a key advantage of the proposed controller is that its tuning process is as intuitive as that of a conventional PID controller. Moreover, based on the derived Jacobian matrix, any optimization method can be used to find an approximate range of stabilizing parameters by examining the eigenvalues of the resulting Jacobian matrix. In addition, to further understand the influence of each parameter on the system performance, an ablation study is carried out (around the chosen parameters) on both the baseline and the proposed controller. The observations are summarized in Table I. In the table, “+” means that overshoot, oscillation or convergence rate increases as the corresponding parameter increases, and “−” means that it decreases. The results (see Appendix) also show that the chosen parameters are Pareto optimal; that is, perturbing the current parameters yields no further simultaneous improvement in the performance of $x_1$ and $x_3$ . This is also important to ensure that the baseline method has achieved its optimal performance.

Table I. Influence of parameter selection.

Remark 6 (Stability analysis using eigenvalues): In control theory, the Jacobian matrix of the closed-loop system can be used to establish stability by checking that all of its eigenvalues have strictly negative real parts. Under this condition, the Jacobian matrix is called a stable matrix (or sometimes a Hurwitz matrix), and the system is asymptotically stable around the equilibrium point [Reference Khalil30].

4. Simulation

This section applies the proposed method to the cartpole system and presents numerical simulation results as well as the stability analysis. Firstly, the necessary parameters of the dynamic model, the simulation environment and the cascaded PD controller are specified. Secondly, the simulation results are presented, followed by the stability analysis. The results for the linear approximated model are presented first, followed by those for the nonlinear model.

Table II. Parameter of dynamic model.

Table III. Parameter of simulation environment.

4.1. Parameters specification

Table II shows the parameters of the dynamic models (1) and (5). Table III lists the parameters of the simulation environment setup. The simulation is carried out in the OpenAI Gym [Reference Brockman, Cheung, Pettersson, Schneider, Schulman, Tang and Zaremba31] environment. Gym is a popular simulation platform written in Python, with both continuous and discrete environment setups. Table IV lists the parameters of the proposed method and the baseline. The initial states are set as $[x_1,x_2,x_3,x_4]=[0.5,0,0,0]$ . The reference signal for the cartpole system is $[x_{1d},x_{2d},x_{d},x_{4r}]=[0,0,0.5,0]$ ; that is, the desired angle and angular velocity are both 0. Naturally, it is hoped that the final position of the cart is not too far away from its initial location, which also means that the linear velocity of the cart should converge to 0 over time. Under these considerations, the cost of an episode is defined as

(59)

\begin{equation} L = -\sum _{\text{timestep} \,= \,0}^{800}\left ( e_1^2+0.1e_2^2+e_3^2 \right ), \end{equation}

after which many optimization algorithms can be used to find the optimal parameters of the cascaded PD controller. In this paper, Bayesian optimization is used as a starting point, followed by manual fine-tuning to determine the final parameters. Bayesian optimization is chosen because of its data efficiency in the optimization process [Reference Neumann-Brosig, Marco, Schwarzmann and Trimpe32]. By constructing a surrogate probabilistic model, it selects the next location with a high probability of yielding a better result. However, one disadvantage is that it may only find a sub-optimal solution [Reference Solis and Thomas33]. Thus, Bayesian optimization is used to get close to the optimal parameters quickly, and the parameters are then fine-tuned manually for best performance.
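As an illustration of Eq. (59), the episode cost can be computed from logged errors as sketched below. The sketch assumes that the sum runs over all three terms at every time step; the function name and the sample data are made up for this example, and the optimizer itself is not shown.

```python
def episode_cost(errors):
    """Eq. (59): negative sum of e1^2 + 0.1*e2^2 + e3^2 over the time steps.

    `errors` is an iterable of (e1, e2, e3) tuples, one per time step.
    """
    return -sum(e1**2 + 0.1 * e2**2 + e3**2 for e1, e2, e3 in errors)


# Two made-up time steps, for illustration only (an episode in the paper
# runs over time steps 0 to 800).
trajectory = [(1.0, 0.0, 0.0), (0.0, 1.0, 2.0)]
cost = episode_cost(trajectory)
print(cost)
```

Since $L$ is non-positive and equals zero only for perfect tracking, an optimizer using this quantity would maximize it.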

Table IV. Parameter of controllers.

4.2. Double-loop PD controller as baseline

The double-loop PD controller is a well-established method for the control of cartpole systems and is also one of the most fundamental controllers, having inspired many other methods [Reference Wang, Sun and Zai34]. The idea of the double-loop PD controller is to use one PD controller each for the stabilization of the pole and of the cart. The final control input is the direct summation of those two PD controllers. The formula is

(60)

\begin{equation} u_{\text{double}}=\bar{k}_{p1}e_1+\bar{k}_{d1}e_2+\bar{k}_{p2}e_3+\bar{k}_{d2}e_4, \end{equation}

where $e_1,e_2,e_3,e_4$ are the errors for $x_1,x_2,x_3,x_4$ , respectively, and $\bar{k}_{p1}, \bar{k}_{d1}, \bar{k}_{p2}, \bar{k}_{d2}$ are the parameters. Although this controller has proved effective in practice, its structure is so simple that it does not exploit any information about the system dynamics.
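For comparison with the cascaded design, Eq. (60) amounts to a single weighted sum. The sketch below assumes that each baseline error is the reference minus the corresponding state; this convention, the function names and the gain values are assumptions made for the example, not taken from Table IV.

```python
def baseline_errors(x, x_ref):
    """Assumed convention: error = reference - state, component by component."""
    return tuple(r - s for s, r in zip(x, x_ref))


def double_loop_pd(errors, gains):
    """Eq. (60): direct summation of the pole PD law and the cart PD law."""
    e1, e2, e3, e4 = errors
    kp1, kd1, kp2, kd2 = gains
    return kp1 * e1 + kd1 * e2 + kp2 * e3 + kd2 * e4


# Initial state and reference as stated in Section 4.1; gains are placeholders.
errors = baseline_errors((0.5, 0.0, 0.0, 0.0), (0.0, 0.0, 0.5, 0.0))
u_double = double_loop_pd(errors, (1.0, 1.0, 1.0, 1.0))
print(errors, u_double)
```

With these inputs the pole error and the cart error enter with opposite signs, which hints at how the two loops can pull the single input in opposite directions.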

4.3. Results and evaluation

4.3.1. Results of linear approximated model

The results of the linear approximated model are depicted in Figs. 4–8. Figure 4 shows the angle of the pole. Compared with the baseline, the proposed method produces oscillation of slightly higher magnitude and frequency, with a shorter settling time. Figure 5 is consistent with Fig. 4, showing the corresponding oscillation at the velocity level. Figure 6 depicts the position of the cart, where the baseline method presents a very slow asymptotic convergence. Figure 7 shows the outputs of $x_4$ . Lastly, Fig. 8 presents how the torque input changes over the episode. It also oscillates at first and gradually converges to 0. A severe overshoot is observed for the baseline controller at the very beginning, but the oscillation is alleviated afterwards.

Figure 4. $x_1$ output of linear model.

Figure 5. $x_2$ output of linear model.

Figure 6. $x_3$ output of linear model.

Figure 7. $x_4$ output of linear model.

Figure 8. $u$ output of linear model.

Next, the Jacobian matrix of the linear system is investigated to explain some of the phenomena observed in the simulation results. Substituting the controller parameters into Eq. (36) gives the Jacobian matrix:

(61)

\begin{equation} DF_{\text{linear}}(e_1,e_2,e_3,e_4)= \left [ \begin{array}{cccccccccc} 0 &&& 1 &&& 0 &&& 0 \\[4pt] 24.50 &&& -4.17 &&& 200.00 & &&212.95 \\[4pt] -4.75 &&& -0.62 &&& 71.26 &&& 75.58 \\[4pt] 0 &&& 0 &&& -80.00 &&& -85.18 \\ \end{array} \right ] \end{equation}

and the eigenvalues with corresponding eigenvectors are

(62)

\begin{equation} \begin{cases} \lambda _1 = -3.60 \\ \lambda _2 = -4.05+11.24 \mathrm{j} \\ \lambda _3 = -4.05-11.24 \mathrm{j} \\ \lambda _4 = -1.24 \\ \end{cases} \end{equation}

(63)

\begin{equation} \begin{cases} \vec{v}_1 = [-0.10 - 0.14 \mathrm{j},0.36 + 0.49 \mathrm{j},-0.37 - 0.50 \mathrm{j},0.30 + 0.40 \mathrm{j}]^T \\ \vec{v}_2 = [0.063 - 0.036 \mathrm{j}, 0.15 + 0.85 \mathrm{j},0.20 + 0.44 \mathrm{j}, -0.21 - 0.32 \mathrm{j}]^T \\ \vec{v}_3 = [-0.047 - 0.049 \mathrm{j}, -0.36 + 0.73 \mathrm{j},-0.30 + 0.34 \mathrm{j}, 0.27 - 0.23 \mathrm{j}]^T \\ \vec{v}_4 = [0.080 + 0.029 \mathrm{j},-0.099 - 0.036 \mathrm{j},0.94 + 0.34 \mathrm{j},-0.73 - 0.26 \mathrm{j}]^T \\ \end{cases} \end{equation}

In Eqs. (62) and (63), $\mathrm{j}$ is the imaginary unit and $T$ represents the transpose of a vector. All the eigenvalues have negative real parts, which ensures local asymptotic stability near the equilibrium point.
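The eigenvalue test itself takes only a few lines with NumPy. The sketch below applies it to the entries of Eq. (61) exactly as printed; because those entries are rounded to two decimals, the eigenvalues obtained this way may differ from Eq. (62), and the sketch is only meant to show the check. The helper name `is_hurwitz` is introduced here for illustration.

```python
import numpy as np

# Entries of Eq. (61), rounded to two decimals as printed.
DF_linear = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [24.50, -4.17, 200.00, 212.95],
    [-4.75, -0.62, 71.26, 75.58],
    [0.0, 0.0, -80.00, -85.18],
])


def is_hurwitz(matrix):
    """True if every eigenvalue has a strictly negative real part."""
    return bool(np.all(np.linalg.eigvals(matrix).real < 0.0))


eigenvalues = np.linalg.eigvals(DF_linear)
print(eigenvalues)
print(is_hurwitz(DF_linear))
```

The same helper could be called inside a parameter search to discard gain combinations whose Jacobian is not Hurwitz.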

Remark 7 (Global stability): The system is at least locally asymptotically stable but is not guaranteed to be globally stable. The attempt to elevate the stability conclusion to a global one is intuitive and has its background in Markus-Yamabe’s theorem [Reference Feßler35]. However, Markus-Yamabe’s theorem only holds for second-order systems, and many counterexamples have been discovered for higher-order systems [Reference Kuznetsov, Kuznetsova, Koznov, Mokaev and Andrievsky36]. Nevertheless, in the simulation, the system can be stabilized from any initial state, as long as the pole is placed within the upper half plane.

4.3.2. Results of nonlinear model

The results of the original nonlinear model are depicted in Figs. 9–20, which are similar to the linear case. Figure 9 shows the angle of the pole. It converges to 0 soon after some oscillation of decaying magnitude. Figure 10 shows the profile of the angular velocity of the pole, which follows a pattern similar to that of Fig. 9. In comparison, the oscillation magnitude of the baseline controller is similar to that of the proposed method, but with a lower frequency and therefore a slower convergence rate. Figure 11 depicts the position of the cart. The proposed method converges much faster than the baseline controller without compromising the convergence performance of $x_1$ , while the baseline controller shows a very slow asymptotic convergence of $x_3$ . Figure 12 shows the outputs of $x_4$ . Lastly, Fig. 13 presents how the torque input changes over the episode. It also oscillates at first and gradually converges to 0. A large overshoot is produced by the baseline controller.

Figure 9. $x_1$ output of nonlinear model.

Figure 10. $x_2$ output of nonlinear model.

Figure 11. $x_3$ output of nonlinear model.

Figure 12. $x_4$ output of nonlinear model.

Figure 13. $u$ output of nonlinear model.

Figure 14. Friction force analysis.

Figure 15. $x_1$ output of nonlinear model.

Figure 16. $x_2$ output of nonlinear model.

Figure 17. $x_3$ output of nonlinear model.

Figure 18. $x_4$ output of nonlinear model.

Figure 19. $u$ output of nonlinear model.

Next, the Jacobian matrix of the nonlinear system is investigated for local stability analysis. Substituting the controller parameters into Eq. (58), the Jacobian matrix at the fixed point $[e_1,e_2,e_3,e_4]=[0,0,0,0]$ can be calculated. The results are exactly the same as Eqs. (61)–(63). This also verifies the derivation process, since at the equilibrium point the linear model should be equivalent to the nonlinear model.

4.3.3. Performance indices overview

To conclude the results section, a performance overview of both the linear and nonlinear models under the two controllers is shown in Table V. $t_1,t_2$ are the convergence times for the pole and the cart, respectively. $\text{MAE}_1, \text{MAE}_2$ are the mean absolute errors of the pole and the cart, respectively. “energy” means the mean square sum of the control input, which represents the energy consumption of the controller. We can conclude that the proposed method outperforms the double-loop PD controller in terms of convergence rates and tracking errors of $x_1,x_3$ . However, this comes at the cost of increased control effort and slightly more severe oscillation. The advantages of the proposed controller originate from the exploitation of the internal dynamics of the model through a semi-implicit process, by which a system-level consistent intermediate target is derived. For the double-loop PD controller, in contrast, the control efforts required by the cart and the pole compete, resulting in a compromise between the performance of the two and limiting the overall performance.
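The indices of Table V can be computed from logged trajectories along the following lines. This is a sketch under stated assumptions: the exact definition of convergence time is not given in this section, so the tolerance-band definition used in `convergence_time` is hypothetical, as are the function names and sample data.

```python
def mean_absolute_error(signal, reference):
    """Mean of |reference - signal| over all logged samples."""
    return sum(abs(reference - s) for s in signal) / len(signal)


def mean_square_input(u_history):
    """Mean of u^2 over the episode, used here as the 'energy' index."""
    return sum(u * u for u in u_history) / len(u_history)


def convergence_time(signal, reference, tol, dt):
    """Hypothetical definition: time after which the signal stays within
    +/- tol of the reference until the end of the log."""
    last_outside = -1
    for k, s in enumerate(signal):
        if abs(s - reference) > tol:
            last_outside = k
    return (last_outside + 1) * dt


# Made-up samples, for illustration only.
angle_log = [1.0, 0.5, 0.01, 0.0]
mae = mean_absolute_error([1.0, -1.0, 0.0, 0.0], 0.0)
energy = mean_square_input([2.0, 0.0])
t_conv = convergence_time(angle_log, 0.0, 0.05, 0.02)
print(mae, energy, t_conv)
```

A different tolerance or a different convergence criterion would change the reported times, so values computed this way are only comparable when the same definition is used for both controllers.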

Table V. Performance indices overview.

Figure 20. Coulomb friction of nonlinear model.

4.3.4. Robust performance

To illustrate the robustness of the proposed controller, this subsection presents the results of simulation under both Coulomb friction and random noise. Coulomb friction is an approximation of dry friction in practice, including both static friction and kinetic friction, with different coefficients. According to Coulomb’s law of friction, the magnitude of the friction between two dry sliding surfaces is independent of the magnitude of the relative velocity. However, the direction of the friction is opposed to the relative velocity. Therefore, Coulomb friction is a highly nonlinear type of disturbance [Reference Lötstedt37]. Accordingly, the cartpole system dynamics with disturbance is

(64)

\begin{equation} \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2= \dfrac{\left ( \begin{array}{r} -F_1(M_1+M_2)x_2-M_2^2l^2x_2^2 \sin (x_1) \cos (x_1)+F_0M_2lx_4 \cos (x_1)\\ +(M_1+M_2)M_2gl \sin (x_1)-M_2l \cos (x_1)(u-f_{\text{cart}})\end{array} \right )}{p_2} + \frac{f_{\text{pole}}R_{\text{joint}}}{J} + d_1 \\ \dot{x}_3 = x_4 \\ \dot{x}_4 = \dfrac{\left ( \begin{array}{r} F_1M_2lx_2 \cos (x_1)+(J+M_2l^2)M_2lx_2^2 \sin x_1 -F_0(J+M_2l^2)x_4\\ -M_2^2gl^2 \sin (x_1) \cos (x_1) +(J+M_2l^2)(u-f_{\text{cart}})\end{array} \right )}{p_2} + d_2\\ \end{cases} \end{equation}

where $f_{\text{cart}}$ is the friction acting on the cart because of rolling. This force counteracts the control input $u$ directly, and therefore $f_{\text{cart}}$ is subtracted directly from $u$ . $f_{\text{pole}}$ is the friction acting on the revolute joint that connects the cart and the pole. To convert it into angular acceleration, it is multiplied by the radius of the joint $R_{\text{joint}}=0.01\,\text{m}$ and then divided by the inertia $J$ . $-1 \leq d_1,d_2 \leq 1$ are bounded random disturbances added to the accelerations. The force analysis is shown in Fig. 14.

According to the Coulomb friction theory, the friction is proportional to the normal force and cannot reverse the relative motion between two surfaces. Firstly, $f_{\text{cart}}$ is considered. The contacting surfaces are the wheels and the ground. Since this is a rolling motion, the friction coefficients are chosen slightly smaller than those of sliding friction: the static friction coefficient is $c_{\text{cart\_static}}=0.2$ , and the kinetic friction coefficient is $c_{\text{cart\_kine}}=0.05$ . The normal force is affected by both gravity and the lifting force generated by the centrifugal force of the pole, but cannot be negative. Accordingly, the normal force of the cart is:

(65)

\begin{equation} f_{\text{cart\_nor}} = \max \Big [ 0, (M_1+M_2)g - \int _{r=0}^{l}\frac{M_2}{l}\dot{x}_2^2rdr \Big ] \end{equation}
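
For reference, the integral in Eq. (65) can be evaluated in closed form, since $\dot{x}_2$ does not depend on the integration variable $r$ :

\begin{equation*} \int _{r=0}^{l}\frac{M_2}{l}\dot{x}_2^2rdr = \frac{M_2}{l}\dot{x}_2^2 \cdot \frac{l^2}{2} = \frac{1}{2}M_2l\dot{x}_2^2 , \end{equation*}

so the normal force of the cart can equivalently be written as $\max \big [ 0, (M_1+M_2)g - \frac{1}{2}M_2l\dot{x}_2^2 \big ]$ . The same closed form applies to the integrals that appear in Eq. (67).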

On the other hand, $f_{\text{cart}}$ cannot reverse the effect of $u$ , which means that if $u-f_{\text{cart}}$ has a different sign from $u$ , then $f_{\text{cart}}=u$ . Therefore, the full expression of $f_{\text{cart}}$ is:

(66)

\begin{equation} f_{\text{cart}}= \begin{cases} c_{\text{cart\_static}} \times f_{\text{cart\_nor}}, \ \text{if static and $ [u\times (u-f_{\text{cart}})] \geq 0$} \\ c_{\text{cart\_kine}} \times f_{\text{cart\_nor}}, \ \ \ \text{if kinetic and $ [u\times (u-f_{\text{cart}})] \geq 0$} \\ 0, \qquad \qquad \qquad \qquad \ \text{if $ [u\times (u-f_{\text{cart}})] \lt 0$ }\\ \end{cases} \end{equation}

The friction $f_{\text{pole}}$ is modelled as follows. The normal force of the pole is the vector sum of the centrifugal force and the force generated by the cart acceleration. Therefore, the normal force is expressed as:

(67)

\begin{equation} f_{\text{pole\_nor}} = \left\| \left[ \int _{r=0}^{l}\frac{M_2}{l}\dot{x}_2^2rdr \cos (x_1), M_2 \dot{x}_4 - \int _{r=0}^{l}\frac{M_2}{l}\dot{x}_2^2rdr \sin (x_1) \right] \right\|_2 \end{equation}

Similarly, the full expression of $f_{\text{pole}}$ is:

(68)

\begin{align} f_{\text{pole}}= \begin{cases} c_{\text{pole\_static}} \times f_{\text{pole\_nor}}, & \text{if static } \\ c_{\text{pole\_kine}} \times f_{\text{pole\_nor}}, & \text{if kinetic} \end{cases} \end{align}

where the static friction coefficient is $c_{\text{pole\_static}}=0.5$ , and the kinetic one is $c_{\text{pole\_kine}}=0.3$ .

Figures 15–20 are the comparative results of the proposed controller and double-loop PD controller under added disturbance. In comparison with previous sections without disturbance, the results here are similar, only with some oscillation and chattering near the equilibrium point. This is due to the existence of friction and random noise, which slightly impairs the control performance. Nonetheless, the system is still stable under both controllers. Besides, Fig. 20 illustrates the Coulomb friction profile, which features abrupt change, nonlinearity and clipping as the theory suggests.

5. Conclusion

A control method for underactuated cartpole systems based on a cascaded PD controller is proposed in this article. The gist is to transform the pole dynamics into a virtual PD controller, with the coupling term exploited as the design variable. The desired value of the coupling term $x_{4d}$ is then fed into the cart dynamics for the realization of a second PD controller. The expressions of the control input as well as $x_{4d}$ are derived by solving a semi-implicit equation. This method retains the advantages of the conventional PID controller (i.e., it is very simple in design and relatively intuitive to understand) and can be carried out on the original state-space equations without a coordinate transformation and the assumptions that such a transformation entails. Besides, in contrast to much other research on PID controllers, a stability analysis method for the fourth-order cascaded PD controller is proposed using the Jacobian matrix of the residual system, although it only concludes local asymptotic stability for this system and has some drawbacks. The simulation results illustrate the advantages of the proposed method in stabilizing the cart and the pole simultaneously, compared with the widely used double-loop PD controller. In addition, the robustness against Coulomb friction and random noise is verified through simulation. The superiority is derived from the exploitation of the internal dynamical structure of the system through solving a semi-implicit equation.

Since this is preliminary research on a control method for underactuated cartpole systems using a cascaded PD controller, several problems remain to be solved. Firstly, a stability analysis approach is required that can reach the conclusion of global stability. For example, a Lyapunov-based stability theorem may be a good alternative to the Jacobian matrix-based method of this article. Nevertheless, in the numerical simulation, the cartpole system can be stabilized from a wide range of initial states. Notably, for some systems, the Jacobian matrix-based analysis can actually conclude global stability using the relevant theorem proposed by Markus and Yamabe [Reference Markus and Yamabe38] for high-dimensional systems. Moving one step forward, how to ensure that all eigenvalues of a high-dimensional (>2) Jacobian matrix have negative real parts everywhere is an open question. A closed-form calculation is obviously infeasible for a complicated matrix such as that in Eq. (58). Secondly, although this paper targets the cartpole system only, the authors envision that the proposed method can be applied to other kinds of underactuated systems and extended to a class of underactuated systems. Last but not least, a systematic and theoretical way of parameter selection should be investigated. The tuning method in this article is still a combination of Bayesian optimization and trial and error. To achieve this, a more capable method for stability proof should be employed, for example, the Lyapunov stability theorem.

Appendix A: Ablation study of semi-implicit cascaded PD controller

This appendix presents the ablation study of the proposed method, in which the parameters of the controller are perturbed one by one and the outputs of $x_1,x_3$ are plotted in order to see the influence of each parameter. The presented results not only reflect the process of manual tuning but also indicate that the parameters chosen in the paper are nearly Pareto optimal w.r.t. the convergence of $x_1,x_3$ : perturbing a parameter improves the performance of one of $x_1,x_3$ only at the expense of the other. In all the figures, the green line is the closest to the actual performance and lies in the middle of the perturbation bounds.

Figures 21 and 22 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{p1}$ . When $k_{p1}$ increases, the convergence of $x_1$ is accelerated, and its oscillation is suppressed. However, the convergence rate of $x_3$ is decreased. Figures 23 and 24 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{p2}$ , to which the same discussion as for $k_{p1}$ applies.

Figure 21. $x_1$ outputs perturbing $k_{p1}$ of proposed method.

Figure 22. $x_3$ outputs perturbing $k_{p1}$ of proposed method.

Figure 23. $x_1$ outputs perturbing $k_{p2}$ of proposed method.

Figure 24. $x_3$ outputs perturbing $k_{p2}$ of proposed method.

Figure 25. $x_1$ outputs perturbing $k_{d1}$ of proposed method.

Figure 26. $x_3$ outputs perturbing $k_{d1}$ of proposed method.

Figure 27. $x_1$ outputs perturbing $k_{d2}$ of proposed method.

Figure 28. $x_3$ outputs perturbing $k_{d2}$ of proposed method.

Figures 25 and 26 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{d1}$ . When $k_{d1}$ increases, the convergence of $x_1$ is decelerated, with smaller oscillation. In the meantime, the convergence of $x_3$ is also deteriorated. Note that too high-frequency oscillation is unfavourable, and the chosen parameter actually strikes a balance by leaning to the convergence performance. Figures 27 and 28 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{d2}$ , whose discussion is similar to $k_{d1}$ .

Appendix B: Ablation study of double-loop PD controller

This appendix presents the ablation study of the baseline method, in which the parameters of the controller are perturbed one by one and the outputs of $x_1,x_3$ are plotted in order to see the influence of each parameter. The presented results not only reflect the process of manual tuning but also indicate that the parameters chosen in the paper are nearly Pareto optimal w.r.t. the convergence of $x_1,x_3$ : perturbing a parameter improves the performance of one of $x_1,x_3$ only at the expense of the other.

Figures 29 and 30 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{p1}$ . When $k_{p1}$ increases, the convergence of $x_1$ is accelerated, and its oscillation is suppressed. However, the convergence rate of $x_3$ is slightly decreased. Figures 31 and 32 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{p2}$ , which has the opposite effect to $k_{p1}$ . When $k_{p2}$ increases, the convergence of $x_3$ improves at the cost of worse performance of $x_1$ .

Figure 29. $x_1$ outputs perturbing $k_{p1}$ of baseline method.

Figure 30. $x_3$ outputs perturbing $k_{p1}$ of baseline method.

Figures 33 and 34 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{d1}$ . The decrease of $k_{d1}$ leads to more precise tracking of $x_1$ . However, it takes longer for $x_3$ to reach the reference location. Figures 35 and 36 show the $x_1,x_3$ outputs respectively under the perturbation of $k_{d2}$ , which illustrates a trade-off between overshoot and settling time for $x_1$ and $x_3$ . The selected parameters achieve a middle performance.

Figure 31. $x_1$ outputs perturbing $k_{p2}$ of baseline method.

Figure 32. $x_3$ outputs perturbing $k_{p2}$ of baseline method.

Figure 33. $x_1$ outputs perturbing $k_{d1}$ of baseline method.

Figure 34. $x_3$ outputs perturbing $k_{d1}$ of baseline method.

Figure 35. $x_1$ outputs perturbing $k_{d2}$ of baseline method.

Figure 36. $x_3$ outputs perturbing $k_{d2}$ of baseline method.

References

Chen, Z., Liu, Y., He, W., Qiao, H.-Y. and Ji, H., “Adaptive-neural-network-based trajectory tracking control for a nonholonomic wheeled mobile robot with velocity constraints,” IEEE Trans. Ind. Electron. 68(10), 5057–5067 (2021).

Heshmati-alamdari, S., Nikou, A. and Dimarogonas, D. V., “Robust trajectory tracking control for underactuated autonomous underwater vehicles in uncertain environments,” IEEE Trans. Autom. Sci. Eng. 18(3), 1288–1301 (2021).

Liu, Y., Zhan, W., Xing, M., Wu, Y., Xu, R. and Wu, X., “Boundary control of a rotating and length-varying flexible robotic manipulator system,” IEEE Trans. Syst. Man Cybern. Syst. 52(1), 377–386 (2022).

Messikh, L., Guechi, E.-H. and Blai, S., “Stabilization of the cart-inverted-pendulum system using state-feedback pole-independent MPC controllers,” Sensors (Basel, Switzerland) 22(1), 243 (2021).

Banerjee, R. and Pal, A., “Stabilization of Inverted Pendulum on Cart Based on Pole Placement and LQR,” In: 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET) (2018) pp. 1–5.

Balaga, H. and Deepthi, M., “Stabilization of Cart Inverted Pendulum System Using LQR, Two-Loop Pid, and Regional Pole Placement Techniques,” In: 2021 Asian Conference on Innovation in Technology (ASIANCON) (2021) pp. 1–6.

Eizadiyan, M. A. and Naseriyan, M., “Control of inverted pendulum cart system by use of PID controller,” Sci. Int. 27(2), 1063–1068 (2015).

Slotine, J.-J. E. and Li, W., Applied nonlinear control (Prentice Hall, Englewood Cliffs, NJ, 1991).

Shao, Y. and Li, J., “Modeling and switching tracking control for a class of cart-pendulum systems driven by DC motor,” IEEE Access 8, 44858–44866 (2020).

Jiangand, J. and Astolfi, A., “Stabilization of a class of underactuated nonlinear systems via underactuated back-stepping,” IEEE Trans. Autom. Control 66(11), 5429–5435 (2020).CrossRef Google Scholar

Lakmesari, S. H., Mahmoodabadi, M. J. and Ibrahim, M. Y., “Fuzzy logic and gradient descent-based optimal adaptive robust controller with inverted pendulum verification,” Chaos Solitons Fractals 151, 111257 (2021).CrossRef Google Scholar

Dao, P. N. and Liu, Y.-C., “Adaptive reinforcement learning strategy with sliding mode control for unknown and disturbed wheeled inverted pendulum,” Int. J. Control Autom. Syst. 19(2), 1139–1150 (2020).CrossRef Google Scholar

Ordaz, P. and Poznyak, A., “‘KL’-gain adaptation for attractive ellipsoid method,” IMA J. Math. Control Inf. 32(3), 447–469 (2015).CrossRef Google Scholar

Kennedy, E. A., King, E. and Tran, H., “Real-time implementation and analysis of a modified energy based controller for the swing-up of an inverted pendulum on a cart,” Eur. J. Control 50, 176–187 (2019).CrossRef Google Scholar

Ranasinghe, R., Manoharan, P., Pallegedara, A. and Kodithuwakku, D., “Develop a Cascaded Observer Based Controller to an Inverted Pendulum Platform for Experimental Research and Teaching,” In: 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET) (2022) pp. 215–220.CrossRef Google Scholar

Ratolikar, M. D. and Kumar, R. P., “Neural Network Control of an Inverted Pendulum on a Two DOF Cart Moving in the Vertical Plane,” In: 2021 6th International Conference on Robotics and Automation Engineering (ICRAE) (2021) pp. 84–88.CrossRef Google Scholar

Peng, Z., Jiang, Y. and Wang, J., “Event-triggered dynamic surface control of an underactuated autonomous surface vehicle for target enclosing,” IEEE Tran. Ind. Electron. 68(4), 3402–3412 (2021).CrossRef Google Scholar

Rojsiraphisal, T., Mobayen, S., Asad, J. H., Vu, M. T., Chang, A. and Puangmalai, J., Fast terminal sliding control of underactuated robotic systems based on disturbance observer with experimental validation,” Mathematics 9(16), 1935 (2021).CrossRef Google Scholar

Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N. M. O., Erez, T., Tassa, Y., Silver, D. and Wierstra, D., “Continuous control with deep reinforcement learning,” CoRR, abs/1509.02971 (2015).Google Scholar

Shi, Q., Lam, H.-K., Xuan, C. and Chen, M., “Adaptive neuro-fuzzy PID controller based on twin delayed deep deterministic policy gradient algorithm,” Neurocomputing 402, 183–194 (2020).CrossRef Google Scholar

Hiremath, S. A. and Bajçinca, N., “DNN Based Learning Algorithm for State Constrained Stochastic Control of a 2D Cartpole System,” In: 2022 European Control Conference (ECC) (2022) pp. 1132–1139.CrossRef Google Scholar

Lam, H. K. and Leung, F. H. F., “Fuzzy controller with stability and performance rules for nonlinear systems,” Fuzzy Set. Syst. 158(2), 147–163 (2007).CrossRef Google Scholar

Chen, C. T., Linear System Theory and Design, The Oxford Series in Electrical and Computer Engineering (Oxford University Press, Oxford, 2014).Google Scholar

Tomei, P., “A simple PD controller for robots with elastic joints,” IEEE Trans. Autom. Control 36(10), 1208–1213 (1991).CrossRef Google Scholar

Higham, N. J., “Gaussian elimination,” Wiley Interdiscip. Rev. Comput. Stat. 3(3), 230–238 (2011).CrossRef Google Scholar

Deng, C. and Liu, W., “Semi-implicit Euler-Maruyama method for non-linear time-changed stochastic differential equations,” BIT Numer. Math. 60, 1133–1151 (2019).CrossRef Google Scholar

Zhao, C. and Guo, L., “On the capability of PID control for nonlinear uncertain systems,” IFAC-PapersOnLine 50(1), 1521–1526 (2017).CrossRef Google Scholar

Andrade, F. A. A., Guedes, I. P., Carvalho, G. F., Zachi, A. R. L., Haddad, D. B., Almeida, L. F., de Melo, A. G. and Pinto, M. F., “Unmanned aerial vehicles motion control with fuzzy tuning of cascaded-PID gains,” Machines 10(1), 12 (2021).CrossRef Google Scholar

Meurer, A., Smith, C. P., Paprocki, M., Čertík, Ořej, Kirpichev, S. B., Rocklin, M., Kumar, A. M.iT., Ivanov, S., Moore, J. K., Singh, S., Rathnayake, T., Vig, S., Granger, B. E., Muller, R. P., Bonazzi, F., Gupta, H., Vats, S., Johansson, F., Pedregosa, F., Curry, M. J., Terrel, A. R., Roučka, Štěpán, Saboo, A., Fernando, I., Kulal, S., Cimrman, R. and Scopatz, A., “Sympy: Symbolic computing in python,” PeerJ Comput. Sci. 3, e103 (2017).CrossRef Google Scholar

Khalil, H. K., Nonlinear Systems, 3rd edition (Prentice Hall, Upper Saddle River, NJ, 2008).Google Scholar

Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J. and Zaremba, W., “Openai gym,” arXiv preprint, arXiv:1606.01540 (2016).Google Scholar

Neumann-Brosig, M., Marco, A., Schwarzmann, D. and Trimpe, S., “Data-efficient autotuning with bayesian optimization: An industrial control study,” IEEE Trans. Control Syst. Technol. 28(3), 730–740 (2020).CrossRef Google Scholar

Solis, M. A. and Thomas, S. S., “Generalized state-feedback controller synthesis for underactuated systems through bayesian optimization,” ArXiv, abs/2103.17158 (2021).Google Scholar

Wang, X., Sun, Z. and Zai, S., “Application of Double-Loop PID Controller in the Inversed Pendulum Real-Time Control System,” In: Green Communications and Networks: Proceedings of the International Conference on Green Communications and Networks (GCN 2011) (Springer, Dordrecht, 2012) pp. 619–626.CrossRef Google Scholar

Feßler, R., “A proof of the two-dimensional markus-yamabe stability conjecture and a generalization,” Ann. Pol. Math. 62(1), 45–74 (1995).CrossRef Google Scholar

Kuznetsov, N. V., Kuznetsova, O. A., Koznov, D., Mokaev, R. N. and Andrievsky, B., “Counterexamples to the Kalman conjectures,” IFAC-PapersOnLine 51(33), 138–143 (2018). 5th IFAC Conference on Analysis and Control of Chaotic Systems CHAOS 2018.CrossRef Google Scholar

Lötstedt, P., “Coulomb friction in two-dimensional rigid body systems,” ZAMM 61, 605–615 (1981).CrossRef Google Scholar

Markus, L. and Yamabe, H., “Global stability criteria for differential systems,” Osaka Math. J. 12(2), 305–317 (1960).Google Scholar
