
Design of an optimized gait planning generator for a quadruped robot using the decision tree and random forest workspace model

Published online by Cambridge University Press:  18 October 2023

Yifan Wu
Affiliation:
School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
Sheng Guo*
Affiliation:
School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
Zheqi Yu
Affiliation:
School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
Peiyi Wang
Affiliation:
School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
Lianzheng Niu
Affiliation:
School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, 100044, China
Majun Song
Affiliation:
Hangzhou Innovation Institute, Beihang University, Hangzhou, Zhejiang, 310051, China
*
Corresponding author: Sheng Guo; Email: shguo@bjtu.edu.cn

Abstract

Real-time gait trajectory planning is challenging for legged robots walking on unknown terrain. In this paper, to realize more efficient and faster motion control of a quadrupedal robot, we propose an optimized gait planning generator (GPG) based on decision tree (DT) and random forest (RF) models of the robot leg workspace. First, the framework of this embedded GPG and the modules associated with it are illustrated. For the leg workspace model described by DT and RF used in the GPG, this paper introduces in detail how to collect the original data needed for training the model and puts forward an Interpolation Labeling with Dilation and Erosion (ILDE) data processing algorithm. After the DT and RF models are trained, we preliminarily evaluate their performance. We then present how these models can be used to predict the location relation between a spatial point and the leg workspace based on its distributional features. The DT model takes only 0.00011 s to process a sample, while the RF model can give the prediction probability. As a complement, the PID inverse kinematics model used in the GPG is also described. Finally, the optimized GPG is tested in a real-time single-leg trajectory planning experiment and an unknown terrain recognition simulation of a virtual quadrupedal robot. According to the test results, the GPG shows remarkable rapidity in processing large-scale data in gait trajectory planning tasks, which demonstrates its application value for quadruped robot control.

Type
Research Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press

1. Introduction

Legged robots with bionic structures are expected to be used as a new mobile platform in more complex working environments. As one of the representative legged robots, quadruped robots have developed rapidly in recent years, and their motion modes and application forms are diverse. The MIT Cheetah [Reference Nguyen, Powell, Katz, Carlo and Kim1–Reference Katz, Carlo and Kim2] can run at high speeds. With an excellent bouncing ability, SpaceBok [Reference Arm, Zenkl, Barton, Beglinger, Dietsche, Ferrazzini, Hampp, Hinder, Huber, Schaufelberger, Schmitt, Sun, Stolz, Kolvenbach and Hutter3] has been developed for Mars missions. And the outstanding terrain adaptation ability of ANYmal [Reference Hutter, Gehring, Lauber, Gunther, Bellicoso, Tsounis, Fankhauser, Diethelm, Bachmann, Bloesch, Kolvenbach, Bjelonic, Isler and Meyer4–Reference Buchanan, Wellhausen, Bjelonic, Bandyopadhyay, Kottege and Hutter5] has allowed it to be applied to the exploration of various artificial and natural environments.

There are some commonly used algorithms for controlling legged robots, such as the Central Pattern Generator (CPG) [Reference Fukuoka, Kimura, Hada and Takase6–Reference Kimura and Fukuoka7], Virtual Model Control [Reference Raibert and Tello8], and the Zero Moment Point [Reference Yang9]. On the basis of these, with the rising requirements for more intelligent gait or whole-body control, much research has addressed the detailed technological parts of legged robots’ locomotion. Omori et al. [Reference Omori, Kojio, Ishikawa, Kojima, Sugai, Kakiuchi, Okada and Inaba10] used the reaction force on the foothold to plan the footsteps of a bipedal robot. Bezier and other spline curves have been used to achieve high-speed, efficient, and stable running gaits for a cheetah-like robot [Reference Hyun, Seok, Lee and Kim11–Reference Zeng, Zhang, Zhang, Li, Zhou and Fu12]. Fankhauser et al. [Reference Fankhauser, Bjelonic, Bellicoso, Miki and Hutter13] used RGBD sensors for a quadruped robot to perceive rough terrain and the nearest foothold. Tieck et al. [Reference Tieck, Rutschke, Kaiser, Schulze, Buettner, Reichard, Roennau and Dillmann14] presented a behavior-based architecture to model locomotion for six-legged robots. Also for hexapod robots, Chen et al. [Reference Chen and Gao15] provided a time-optimal trajectory planning method for the body and leg-ends. With the development of control techniques, gait optimization problems have begun to adopt more intelligent approaches. A hierarchical controller for quadruped robots has been proposed [Reference Kim, Son and Lee16], which can generate multiple gaits while tracking speed commands and is more efficient than a controller composed of a single strategy. Shao et al. [Reference Shao, Jin, Liu, He, Wang and Yang17] proposed a framework based on reinforcement learning to train a simple control strategy, which enables robots to execute speed commands smoothly in natural environments and to use the learned motor skills to assist walking on rugged terrain. Li et al. [Reference Li, Gao, Li, Wang and Guo18] proposed another hierarchy-based control algorithm, which generates static gaits by searching the trajectory of the robot’s center of gravity, thus shrinking the gait search space and reducing the time complexity of the algorithm. Takei et al. [Reference Takei, Morishita, Tazawa, Katsuya and Saito19] proposed an active gait generation method for quadruped robots, which adopts a pulse-type neuron model to effectively reduce the robot’s leg movement speed according to foot pressure. As the complexity of these methods increases, it becomes crucial for the robotic system to process a variety of sensor signals while generating smooth actions. This demands higher speed and efficiency in data processing and correspondingly higher hardware requirements. However, it is worth noting that in the gait or trajectory-planning process of legged robots, optimization work is often focused on the body and leg state, while the workspace [Reference Gupta20], which can be a basic and essential feature for improving gait design, is frequently overlooked.

For a quadruped robot, the workspace of a leg is the set of all spatial positions that the leg-end can reach under the performance and structural limitations. The legs of most quadruped robots have a simple serial structure, and for this type of mechanism there are several methods to determine the workspace. Kumar and Waldron [Reference Kumar and Waldron21] proposed an algorithm for tracing the bounding surfaces of workspaces. Ricard and Gosselin [Reference Ricard and Gosselin22] used joint limits in a more classic way to obtain a set of envelope curve equations of the workspace. The robot workspace can also be determined from the intersection of the ray beam emitted from a radiation point [Reference Snyman, du Plessis and Duffy23]. A method based on random probability has also been used to generate the boundary curve of a robot’s working plane [Reference Cao, Qi, Lu, Zang and Yang24]. For robots performing adaptive gait tasks, sensors are often used to transform the ground surface into point clouds, whose distribution properties are often analyzed in conjunction with the workspace. The methods mentioned above can accurately describe the workspace of robot legs in general, but two problems remain. First, these workspaces can often only be represented by mathematical equations in the controller. Determining whether a spatial point is inside the leg’s workspace requires calling different operation instructions. As the number of spatial points to be processed increases, this can consume considerable computing resources in real time, which reduces the robot’s locomotion speed. Second, for tasks that require real-time generation of different leg-end trajectories, it is important to determine the position of each foothold and the locations of the points of the trajectories. In a practical engineering setting, the kinematic and dynamic performance of the robot at the boundary of the leg’s workspace is often suboptimal. Therefore, the approximate position of a point within the workspace is also important. However, with current methods, it is challenging to intuitively determine whether a spatial point is located close to the boundary of the workspace.

Based on the previous description, consider the following scenario. When a robot is performing mapping or surveying tasks in the wild, it needs to maintain body balance in real time. In this case, the robot’s gait planning generator (GPG) needs to control the gait frequency and generate corresponding foot trajectories based on the ground information. To adapt to uneven terrain, the robot needs to quickly find a suitable foothold point in the landing zone. Since the point cloud ground information generated by structured-light cameras can contain a large amount of data, it is desirable for AI to quickly identify candidate foothold features with respect to the robot leg workspace. This enables fast selection of more suitable foothold points, reducing foot slip on the ground while lowering the computational load on the robot and therefore increasing the stability of its gait. Therefore, this paper proposes an optimized GPG embedded with AI-described workspaces of the robot leg. Since classical machine learning has shown the ability to build adaptive control systems for legged robots [Reference Benbrahim and Franklin25–Reference Song, Zuo, Wu, Liu and Li29], this paper uses the decision tree (DT) and random forest (RF) methods to train the AI-described workspace model. Based on a classical AI model combined with the workspace characteristics, the gait and leg-end trajectory execution ability of a quadruped robot can be optimized in an innovative way. To generate the training data set, an Interpolation Labeling with Dilation and Erosion (ILDE) data processing algorithm is introduced. An experiment and a simulation are designed to demonstrate the accuracy and speed of the GPG. The performance of the GPG shows that it has great potential to help a quadruped robot make high-speed and effective locomotion plans, such as quickly finding a foothold or rapidly planning a leg-end trajectory. The advantages of the GPG proposed in this paper include:

  1. The workspace described by the DT and RF models occupies little storage space and is easy to call, which makes it friendly to embedded systems.

  2. Based on its predictions, the DT- and RF-described workspace enables the GPG to quickly determine the reachability of a large number of arbitrary spatial positions.

  3. The optimized GPG can determine the proximity of any spatial position to the workspace boundary.

The rest of the paper is organized as follows: Section 2: The partial robot system framework with the GPG is presented. Section 3: Detailed training set raw data collection and labeling methods are provided. Section 4: The workspace models trained using the DT and RF methods are evaluated. Section 5: The principle of the PID inverse kinematics model is described. Section 6: An experiment and a simulation are designed to verify the performance of the optimized GPG. Section 7: A discussion based on a SWOT (Strengths, Weaknesses, Opportunities, Threats) analysis is developed. Section 8: The conclusions are drawn.

2. Overview of the GPG

In this paper, the main function of the GPG is to plan and generate leg motion trajectories according to the instructions of the robot motion controller and the surrounding environment (as shown in Fig. 1).

Figure 1. The partial robot system framework with the GPG.

A control case demonstrates in detail how the GPG works. Before the quadruped robot performs its gait, the robot motion controller sends instructions to the trajectory generation module in the GPG. The robot perception system transmits the surrounding environment features, converted into point clouds or other data forms, to the trajectory generation module. The trajectory generation module then generates an initial foot trajectory and a foothold in Cartesian space and packages them together with the environment points for the decision module. The decision module is the core module of the GPG; its main function is to predict the reachability of environment and foothold points and judge the rationality of trajectories. It also has the ability to directly modify the initial trajectory. The decision module embeds the legs’ workspace models described by DT and RF. These machine learning models are trained by external computers (the training method is detailed in the next section) to speed up the data processing of the GPG and hence optimize the GPG itself. For the environment points, the decision module can rapidly predict their position relative to the workspace and judge their reachability. According to the distribution characteristics of the environment points, it can modify the initial trajectory within a certain limit. If the environment does not allow the robot to walk normally, the decision module sends the marked environment points back to the trajectory generation module so that the robot can adopt another motion form or stop moving. If the robot can walk normally, the decision module sends the modified trajectory points to the PID inverse kinematics model of the leg.

The PID inverse kinematics model transforms the Cartesian-space foot trajectory into joint space. Meanwhile, for important spatial locations such as footholds, the PID inverse kinematics model can verify the prediction results from the decision module. This guarantees the safety of the robot’s locomotion, and the wrongly predicted data are fed back to the external computer to correct the leg’s workspace model, forming a closed loop. After PID inverse kinematics verification, the determined foothold or trajectory points are returned to the trajectory generation module. The GPG packages the gait trajectory and other data for the robot motion controller to execute, and the robot can then complete a gait.
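To make the data flow between the modules concrete, the following minimal Python sketch outlines the decision-module pipeline described above. All names (decision_module, pid_ik.solve, the model objects) are illustrative placeholders, not the authors' implementation; cluster labels A/B/C/F/N follow Section 3.

```python
# Hypothetical sketch of the decision-module flow (names are illustrative).
def decision_module(trajectory_points, environment_points, dt_model, rf_model, pid_ik):
    # 1. Fast reachability screening of the environment point cloud with the DT model.
    env_labels = dt_model.predict(environment_points)        # clusters A/B/C/F/N
    reachable = [lbl in ("A", "B") for lbl in env_labels]

    if not any(reachable):
        # Terrain cannot support a normal gait: return the marked points so the
        # trajectory generation module can choose another motion form or stop.
        return {"status": "blocked", "marked_points": environment_points}

    # 2. Probabilistic check of critical points (e.g. footholds) with the RF model.
    proba = rf_model.predict_proba(trajectory_points)         # per-cluster probabilities

    # 3. Secondary verification and conversion to joint space by the PID IK model.
    joint_trajectory = [pid_ik.solve(p) for p in trajectory_points]
    return {"status": "ok", "joint_trajectory": joint_trajectory, "proba": proba}
```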

The logic of the above optimized GPG is simple, which is beneficial to the gait and leg-end trajectory planning of quadruped robots. For the workspace described by the DT and RF models used in the decision module, this paper provides the detailed machine learning data labeling and training process.

3. Data collection and labeling for machine learning

In this section, we mainly introduce how to collect the original data of the robot workspace, propose an ILDE algorithm to interpolate and label the original data, and ultimately obtain the appropriate training set data. This training set data directly determines the performance of the RF and DT models.

3.1. Original points in the workspace

The quadruped robot used in this study has a concise appearance design, as shown in Fig. 2. It has four leg mechanisms with the same series structure. In the actual control task, according to the leg installation position, GPG can complete gait planning according to the workspace model of one leg on each side.

Figure 2. The prototype of a quadruped robot equipped with a depth camera (left) and the model of its right front leg (right).

Taking its right front leg as an example, we will present how to obtain the original points of the leg workspace. In Fig. 2, the leg has four motors: three make up the shoulder joint and one is the elbow joint. $O_{0}$ is the origin of the leg, which connects to the robot body; $x_{0}$ is the robot’s forward direction, $y_{0}$ is the left direction of the robot, and $z_{0}$ is perpendicular to the fuselage.

Using the DH method, it is easy to get the position and orientation of the leg-end in the base coordinate system $x_{0}$-$y_{0}$-$z_{0}$. When the angles of the leg joints are fixed, the position of the leg-end is determined uniquely in the Cartesian coordinate system. Then the workspace of the leg can be analyzed according to the range of motion of each joint. The ranges of motion of the four joints $q_{1}$, $q_{2}$, $q_{3}$, and $q_{4}$ are:

\begin{equation*} -45^{\circ }\leq q_{1}\leq 45^{\circ };-30^{\circ }\leq q_{2}\leq 30^{\circ };-60^{\circ }\leq q_{3}\leq 30^{\circ };0\leq q_{4}\leq 90^{\circ } \end{equation*}

Traversing the range of motion of all joints with sampling intervals from 0.4 to 0.05 rad yields sets of leg-end locations in Cartesian space, as shown in Fig. 3. These points in Cartesian space are the original points. As the sampling interval decreases, the number of obtained points increases exponentially, and the envelope of the point cloud approaches the real workspace more closely. On the one hand, enough original points are required to effectively approximate the shape of the workspace; on the other hand, too many samples consume a lot of computing resources, so the two must be balanced (in this research, we used a sampling interval of 0.1 rad).
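A minimal sketch of this sampling step is shown below, assuming the joint limits listed above and a generic forward-kinematics routine. The function forward_kinematics is a placeholder for the DH-based model of the leg, not the authors' code.

```python
import numpy as np

# Sweep every joint over its range at a fixed interval and record the leg-end
# position returned by the forward kinematics; the result is the set of
# original (P-cluster) points of the workspace.
def sample_workspace(forward_kinematics, step=0.1):
    q1 = np.arange(np.deg2rad(-45), np.deg2rad(45) + 1e-9, step)
    q2 = np.arange(np.deg2rad(-30), np.deg2rad(30) + 1e-9, step)
    q3 = np.arange(np.deg2rad(-60), np.deg2rad(30) + 1e-9, step)
    q4 = np.arange(0.0,             np.deg2rad(90) + 1e-9, step)
    points = []
    for a in q1:
        for b in q2:
            for c in q3:
                for d in q4:
                    points.append(forward_kinematics(a, b, c, d))  # leg-end (x, y, z)
    return np.asarray(points)
```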

Figure 3. The original data points sampled from the workspace. (a) Sampling interval: 0.4 rad, (b) Sampling interval: 0.2 rad, (c) Sampling interval: 0.1 rad, (d) Sampling interval: 0.05 rad.

Utilizing the forward kinematics model of the robot leg, the original point data of its workspace can be obtained rapidly and in large quantities. However, although the sampling in the joint space is uniform, the corresponding sampling points in Cartesian space are not evenly distributed, and some areas remain largely vacant. This original point cloud with an irregular spatial distribution must be processed before it can serve as training set data for machine learning.

3.2. Interpolation and labeling of the original data

Observing the spatial distribution of the original point cloud of the workspace, there are gaps and holes in the data space, which makes it difficult to use directly for supervised machine learning. In this section, the Interpolation Labeling with Dilation and Erosion (ILDE) algorithm is presented in detail to process the original data.

First, the interpolation method is used to obtain a set of points uniformly distributed in space. We label the original points as P-cluster. The following conventions are employed:

$n_{P}$: the number of the P-cluster points.

CirCube: the circumscribed cube of the P-cluster point cloud.

$V_{\textrm{CC}}$: the volume of CirCube.

EnvGeo: the envelope geometry of the P-cluster point cloud.

$V_{\textrm{EG}}$: the volume of EnvGeo.

$\overline{\rho}_{\textrm{EG}}$: the average density of the P-cluster in EnvGeo, $\overline{\rho}_{\textrm{EG}}={n}_{{P}}/{V}_{\textrm{EG}}$.

Suppose there are $n_{P}$ P-cluster points. The circumscribed cube and the envelope geometry of the P-cluster point cloud are CirCube and EnvGeo (shown in Fig. 4a, c), and new unlabeled points need to be uniformly inserted into CirCube. Calculate the volumes $V_{\textrm{CC}}$ and $V_{\textrm{EG}}$ and the density $\overline{\rho }_{\textrm{EG}}$. Suppose that the density of the inserted unlabeled points is $\rho _{\textrm{CC}}$ and its ratio to $\overline{\rho }_{\textrm{EG}}$ is $k_{\rho }=\rho _{\textrm{CC}}/\overline{\rho }_{\textrm{EG}}$. Then, the number of unlabeled points inserted is:

Figure 4. Schematic diagram of the preliminary interpolation and labeling process. (a) The interpolation of unlabeled points, (b) the statistical data of nu, (c) the initial labeling process, (d) the initial labeling result.

(1) \begin{equation} n_{\textrm{CC}}=k_{\rho }\frac{{V}_{\textrm{CC}}}{{V}_{\textrm{EG}}}{n}_{{P}} \end{equation}

In this research, $k_{\rho }=1$. The distance between an unlabeled point and its neighbors when they are uniformly inserted into CirCube is denoted as $r_{\textrm{CC}}$. The schematic diagram of the interpolation result is shown in Fig. 4a; the points in rose color are the P-cluster points and the hollow points are the inserted unlabeled points. Next, according to EnvGeo, the unlabeled points outside EnvGeo are labeled as N-cluster points. P-cluster points are those that are definitely inside the workspace, while N-cluster points are definitely outside the workspace. In order to determine whether a point is close to the boundary of the workspace, the remaining unlabeled points need to be labeled as follows.

Inspired by the DBSCAN algorithm [Reference Zhang and Zhou30–Reference Chih-Wei and Chih-Jen31], unlabeled points are labeled according to the number of P-cluster points in their neighborhood. Define the neighborhood of each unlabeled point as a sphere with radius $r_{\textrm{CC}}$. Next, let $n_{Pn}$ denote the number of P-cluster points contained in the neighborhood of each unlabeled point. For each value of $n_{Pn}$, the number of corresponding unlabeled points $n_{u}$ is counted, which gives:

(2) \begin{equation} \sum n_{{u}}n_{{Pn}}={n}_{{Ps}} \end{equation}

where $n_{Ps}$ is the total number of P-cluster points counted in the statistics. Unlabeled points with $n_{Pn}=0$ are labeled as cluster F, which means they are almost out of the workspace. Excluding the case of $n_{u}=0$, the statistical data of $n_{u}$ are shown in Fig. 4b. The following function G(x) is applied to split these data according to the proportion of the total number of P-cluster points:

\begin{equation*} {G}\left(x\right)=\frac{\sum _{n_{Pn}=1}^{n_{Pn}=x}n_{u}n_{Pn}}{{n}_{{Ps}}},0\lt {G}\left(x\right)\leq 1 \end{equation*}

Figure 5. Schematic diagram and real implementation result using the ILDE algorithm to redistribute different clusters of points. (a) Data after the initial labeling process, (b) data after the 1st dilation process, (c) data after the 1st erosion process, (d) data after the 2nd dilation process.

Here $x\in \mathbb{Z}$ denotes where the data are split. Find $x_{1}$ satisfying $\{x=x_{1}|G(x)=1/8\}$; the unlabeled points with $n_{Pn}\leq x_{1}$ are labeled as cluster C, which means they may be in the workspace. Then, find $x_{2}$ satisfying $\{x=x_{2}|1-G(x)=1/8\}$; the unlabeled points with $n_{Pn}\geq x_{2}$ are labeled as cluster A, which means they are in the workspace with high probability. The remaining unlabeled points are labeled as cluster B; they should be in the workspace. In our study, $x_{1}=8$ and $x_{2}=55$. This labeling process is shown in Fig. 4c. Based on the possibility that a point is in the workspace, all clusters are ranked from high to low as P, A, B, C, F, N. The schematic diagram of this clustering result is shown in Fig. 4d; we use rose, yellow, green, blue, gray, and black to distinguish the different clusters.
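The initial labeling step can be sketched in a few lines of Python, assuming the N-cluster points outside EnvGeo have already been separated out. The function name and the use of a k-d tree for neighborhood queries are implementation choices for illustration, not the authors' code.

```python
import numpy as np
from scipy.spatial import cKDTree

# Count the P-cluster points within r_CC of every remaining unlabeled point,
# then split the counts with the cumulative ratio G(x): roughly 1/8 of the
# P-cluster mass goes to cluster C and 1/8 to cluster A (thresholds x1, x2).
def initial_labels(unlabeled, p_cluster, r_cc):
    tree = cKDTree(p_cluster)
    n_pn = np.array([len(tree.query_ball_point(u, r_cc)) for u in unlabeled])

    labels = np.full(len(unlabeled), "B", dtype=object)
    labels[n_pn == 0] = "F"                        # almost out of the workspace

    counts = np.bincount(n_pn)                     # n_u for each value of n_Pn
    mass = counts * np.arange(len(counts))         # n_u * n_Pn
    g = np.cumsum(mass) / mass.sum()               # G(x)
    x1 = int(np.searchsorted(g, 1.0 / 8.0))        # G(x1) ~ 1/8
    x2 = int(np.searchsorted(g, 7.0 / 8.0))        # 1 - G(x2) ~ 1/8
    labels[(n_pn >= 1) & (n_pn <= x1)] = "C"
    labels[n_pn >= x2] = "A"
    return labels
```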

By observing Fig. 4d, it is found that some points that are not in the workspace may be labeled as being in the workspace, so additional processing is needed to optimize the data distribution of each cluster. In this study, the points of clusters A, B, C, and F are expected to have the following spatial characteristics: A-cluster points are distributed in the interior of the workspace, at a certain distance from the boundary; C-cluster points are distributed in the workspace but very close to the boundary; B-cluster points are located in the area between the A- and C-cluster points; F-cluster points are mainly distributed on the boundary of the workspace or in the gaps between the workspace and EnvGeo. To achieve this redistribution, this study proposes the following steps using dilation and erosion operations:

  1. The first dilation process operates on points belonging to clusters B, C, and F. If the neighborhood of a point contains only points from clusters of higher rank and no points from clusters of lower rank, then the point is promoted to the cluster one rank higher. For example, if a C-cluster point’s neighborhood contains only points from clusters P, B, and C, and no points from cluster F or N, the point is promoted to cluster B. However, if its neighborhood contains points from cluster F or N, or only points from its own cluster, it is not operated on. The original cluster distribution diagram and real data are shown in Fig. 5a. After the first dilation, the result is shown in Fig. 5b.

  2. The first erosion process operates on points from clusters A, B, and C. If the neighborhood of a point contains points from cluster F or N, it is demoted to the cluster one rank lower. The erosion result is shown in Fig. 5c.

  3. The second dilation process operates on points from clusters B and C. The operation rules are the same as in the first dilation. The dilation result is shown in Fig. 5d.

According to Figs. 5 and 6, the number of points in each cluster changes throughout the process, and after the second dilation is completed their spatial distribution is greatly improved. It is worth noting that, for a particular spatial distribution in the workspace, different numbers of dilation or erosion operations can be applied to specific clusters as appropriate. The pseudo-code of the proposed ILDE algorithm is shown in Appendix I.
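As a complement to the pseudo-code in Appendix I, the following minimal Python sketch illustrates one dilation pass of the rule described above; the cluster ordering and the neighbours data structure (precomputed lists of neighbour indices within r_CC) are assumptions made for illustration.

```python
ORDER = ["P", "A", "B", "C", "F", "N"]            # ranked from most to least likely inside
RANK = {c: i for i, c in enumerate(ORDER)}

# Promote a point by one rank if its neighbourhood holds no lower-ranked point
# and at least one strictly higher-ranked point (a neighbourhood of only
# same-cluster points leaves the point unchanged).
def dilate_once(labels, neighbours, operate_on=("B", "C", "F")):
    new_labels = list(labels)
    for i, lbl in enumerate(labels):
        if lbl not in operate_on:
            continue
        nb = [RANK[labels[j]] for j in neighbours[i]]
        if nb and max(nb) <= RANK[lbl] and min(nb) < RANK[lbl]:
            new_labels[i] = ORDER[RANK[lbl] - 1]   # one rank higher
    return new_labels
```

The erosion pass works symmetrically: a point from clusters A, B, or C whose neighbourhood contains an F- or N-cluster point is demoted by one rank.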

Figure 6. The number of points in each cluster in different processes of the ILDE algorithm.

Next, P-cluster points are removed from the data due to their inhomogeneous spatial distribution. For all points in clusters A, B, C, F, and N, we take their spatial locations as features and synthesize all points as training data for supervised learning.

4. Training of the DT and RF workspace model for GPG

According to Section 2, the supervised machine learning workspace models in GPG should help it to quickly predict the location relation between an arbitrary spatial point and the leg workspace. Thus, two rules should be followed to train these models. First, as a basic function, the model should achieve high-speed multi-class classification tasks. Furthermore, to provide extra information for smarter control strategies, the model can give the predicted probability of each predicted outcome.

Based on the above two points, it is necessary to find suitable options from some commonly used supervised learning algorithms for multi-class classification. The characteristics of these algorithms are shown in Table I. K-nearest Neighbor (KNN) [Reference Zhang and Zhou30] is an algorithm whose principle is easy to understand. It determines the label of the target by examining the labels of the K training objects nearest to the target. At appropriate values of K, the KNN algorithm achieves high accuracy, but it requires the computation of target neighborhoods, spatial distances, and classification statistics, which are computationally complex and therefore not suitable for large-scale data prediction. The core of the support vector machine (SVM) [Reference Chih-Wei and Chih-Jen31] is to find the best classification plane (i.e., hyperplane) in the sample space and separate the training samples. It has fast and stable prediction capabilities, but classical SVM algorithms are only suitable for binary classification, and multi-class classification problems require combining multiple SVMs depending on the number of predicted classes. For multi-class SVM models, it is difficult to find optimal parameters, and tuning them requires a large amount of time. The Naive Bayes model (NBM) [Reference Liu, Song, Nanayakkara, Seneviratne and Althoefer32] is a probability-based classification method with strong prediction accuracy for text content or state analysis tasks, which is not suitable for the type of task proposed in this paper. The DT [Reference Ester, Kriegel, Sander and Xu33] is an easy-to-understand multi-class classification algorithm. The tree structure is generated hierarchically, and the decision rule divides the feature space by straight lines parallel to the axes. The computational time complexity is modest, and the accuracy can be high if the hyperparameters are set reasonably. Its disadvantages are that it is prone to overfitting and that the information gain is biased toward features with more possible values. The parallel prediction by multiple DT models is called the random forest model [Reference Xu, Ester, Kriegel and Sander34], which has strong prediction accuracy, excellent anti-noise ability, and is not easy to overfit. However, the required training and prediction times increase when there is an excessively large number of tree models in the random forest.

Table I. Characteristics comparison of supervised machine learning algorithms commonly used for classification tasks.

Based on this comparison and the training data with a uniform spatial distribution processed in the previous section, a CART (Classification and Regression Tree) DT model is chosen preferentially in this study. It is suitable for high-dimensional data processing and has the advantages of high accuracy, rapid speed, and easy generation of stable classification rules. Then, to satisfy the second rule, an RF model is trained. RF can give probabilistic predictions, which is exactly what is needed. For both models, the training environment in this paper is as follows: Python with the sklearn library; the computer processor is an Intel Core i7-6600U (2.6 GHz, 2 cores and 4 threads) with 8 GB of memory. The parameters of the DecisionTreeClassifier and RandomForestClassifier functions are shown in Tables II and III. It took 422 ms to train the DT model with 226,320 samples after stratified sampling, while the RF model took 83.06 s.
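A minimal training sketch in the same sklearn setting is given below. The feature matrix and label vector are placeholders for the processed training set (point coordinates and cluster labels), and the hyperparameter values shown are illustrative; the exact settings used in this study are those listed in Tables II and III.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# X: (n_samples, 3) point coordinates, y: cluster labels A/B/C/F/N
X, y = features, labels                       # placeholders for the ILDE output
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=0)   # stratified sampling

dt = DecisionTreeClassifier(criterion="gini").fit(X_tr, y_tr)     # CART tree
rf = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)     # parallel DTs
```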

Table II. The parameters of the DecisionTreeClassifier function.

Table III. The parameters of the RandomForestClassifier function.

To consider algorithms that may have potential applications, we also trained a multi-class SVM model and a KNN model using the same training set (the NBM is not suitable for this task). The confusion matrices and evaluation reports for those models are shown in Fig. 7, Table IV, and Table V. The training of the SVM and KNN models is based on the training results of the DT model: the accuracy of their prediction results is made as close as possible to that of the DT model, and their other performances are then compared. As can be seen in Fig. 7, the color blocks of the confusion matrix of the SVM model are irregularly distributed, showing that its prediction results are not ideal. For the KNN, DT, and RF models, the diagonal blocks of the confusion matrices have darker colors, which implies that they perform well in Recall and Precision. In Tables IV and V, the f1-score can be regarded as a harmonic mean of Precision and Recall, the macro average is the average over all predicted categories, and the weighted average is the average weighted by the number of samples. The SVM training process is complicated, and it is difficult to obtain qualified prediction results. The KNN, DT, and RF models perform adequately on these criteria; however, the KNN performs slightly worse for clusters B and C. In addition, the SVM, KNN, DT, and RF models derived from Python are 2.74 MB, 13.5 MB, 1.87 MB, and 86.8 MB in size, respectively. The RF, SVM, and KNN models take 0.113081 s, 0.000629 s, and 0.000461 s to predict a sample, while the DT model requires only 0.0001141 s. The performance of DT and RF is sufficient for GPG optimization.
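The evaluation can be reproduced with standard sklearn metrics, as sketched below; `model`, `X_te`, and `y_te` are assumed to be the fitted classifier and held-out test split from the training sketch above, and the measured times naturally depend on the hardware listed earlier.

```python
import time
from sklearn.metrics import classification_report, confusion_matrix

# Confusion matrix and precision/recall/f1 report (cf. Fig. 7 and Tables IV-V),
# plus a crude per-sample prediction time estimate.
def evaluate(model, X_te, y_te):
    y_pred = model.predict(X_te)
    print(confusion_matrix(y_te, y_pred))
    print(classification_report(y_te, y_pred, digits=4))

    t0 = time.perf_counter()
    model.predict(X_te)                         # batch prediction
    per_sample = (time.perf_counter() - t0) / len(X_te)
    print(f"prediction time per sample: {per_sample:.7f} s")
```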

Figure 7. Confusion matrix of the different supervised machine learning models. (a) SVM, (b) KNN, (c) DT, (d) RF.

Table IV. Evaluation report of the SVM and KNN model.

Table V. Evaluation report of the DT and RF model.

In the GPG, the basic working procedure of DT and RF is as follows. As shown in Fig. 1, when a set of spatial points (such as the environment point cloud) is passed into the decision module, the DT model is first invoked to make a rapid prediction. If a point is predicted to be in cluster A or B, it is considered to be in the interior of the workspace; if it is predicted to be in cluster C, it is considered to be extremely close to the workspace boundary; if it is predicted to be in cluster F or N, it is considered to be outside the workspace. In this paper, only points in clusters A and B are considered reachable points in the workspace, while points in clusters C, F, and N are not reachable. With this step, a preliminary prediction can be made efficiently. The prediction results can be sent directly back to the trajectory generation module as valid data to generate a trajectory.

However, if the spatial points transmitted to the decision module are critical points, such as footholds or control points of leg-end trajectories, it may also be necessary to provide prediction probabilities for them. The RF model is then used to predict them again. For example, for a point predicted to be in cluster B, it is desirable to know whether it is closer to the A-cluster or the C-cluster, so that its spatial distribution can be judged more accurately. From the RF prediction results, the probability of belonging to each cluster can be obtained for a point. Denoting these probabilities as $P_{\textrm{A}}$, $P_{\textrm{B}}$, $P_{\textrm{C}}$, $P_{\textrm{F}}$, and $P_{\textrm{N}}$, we have $P_{\textrm{A}}+P_{\textrm{B}}+P_{\textrm{C}}+P_{\textrm{F}}+P_{\textrm{N}}=1$. If $P_{\textrm{B}}\gt P_{\textrm{A}}\gt P_{\textrm{C}}$, the B-cluster point is considered to be close to the A-cluster, while if $P_{\textrm{B}}\gt P_{\textrm{C}}\gt P_{\textrm{A}}$, the point is close to the workspace’s boundary. In later experiments, this prediction result can be used to warn the robot. Of course, for the A- and B-cluster points, a secondary validation using the PID model is also required.
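This probability comparison maps directly onto sklearn's predict_proba output, as the sketch below shows; `rf` is assumed to be the fitted RandomForestClassifier from Section 4 and `foothold` is a placeholder (x, y, z) point.

```python
# Probabilistic check of a critical point: the ordering of P_A, P_B, P_C tells
# whether a B-cluster point lies nearer the interior or the workspace boundary.
proba = rf.predict_proba([foothold])[0]      # foothold = (x, y, z), placeholder
p = dict(zip(rf.classes_, proba))            # e.g. {"A": P_A, "B": P_B, ...}

if p["B"] > p["A"] > p["C"]:
    status = "B-cluster, close to the interior (A side)"
elif p["B"] > p["C"] > p["A"]:
    status = "B-cluster, close to the boundary - issue a hint/warning"
```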

5. The PID inverse kinematics model of the leg in the GPG

As mentioned in Section 2, the PID method is used to construct the inverse kinematics model of the legs, avoiding the disadvantages of geometric methods such as multiple solutions, solution discontinuity, and prolonged computation time. Because PID control can achieve fast and accurate solutions by flexibly adjusting the proportional control coefficients, a PID model is used to transform the points from Cartesian space to joint space to implement leg motion. Moreover, it can further validate the spatial location of the points processed by the DT model. The details are as follows:

The main idea of the PID inverse kinematics model is that, when the target position and attitude of the leg-end are given, the angles of the leg joints are continuously adjusted to make the current leg-end pose coincide with the desired leg-end pose, thereby obtaining the inverse kinematics solution of the leg joints [Reference Luh, Walker and Paul37]. The block diagram of the control model is shown in Fig. 8. Taking the leg-end of the robot’s leg as the study target, $\dot{\boldsymbol{P}}_{{d}}$ is its desired linear velocity, $\boldsymbol{P}_{{d}}$ is its desired position, $\dot{\boldsymbol{P}}_{{c}}$ is its current linear velocity, and $\boldsymbol{P}_{{c}}$ is its current position. $\boldsymbol{\omega }_{{d}}$ is its desired angular velocity around a selected axis, and $\boldsymbol{\omega }_{{c}}$ is its current angular velocity. $\dot{\boldsymbol{P}}_{{d}}, \boldsymbol{P}_{{d}}, \dot{\boldsymbol{P}}_{{c}}, \boldsymbol{P}_{{c}}, \boldsymbol{\omega }_{{d}}$ and $\boldsymbol{\omega }_{\textrm{c}}$ are all $3\times 1$ vectors. $\boldsymbol{A}_{{d}}\in \textrm{SO}(3)$ (the special orthogonal group) is the desired orientation matrix of the leg-end, while $\boldsymbol{A}_{{c}}\in \textrm{SO}(3)$ is its current orientation matrix. The $3\times 3$ matrices $\boldsymbol{A}_{{d}}$ and $\boldsymbol{A}_{{c}}$ are written as follows:

(4) \begin{equation} \boldsymbol{A}_{{d}}=\left(\boldsymbol{s}_{{d}},\,\,\boldsymbol{n}_{{d}},\,\,\boldsymbol{a}_{{d}}\right) \end{equation}
(5) \begin{equation} \boldsymbol{A}_{{c}}=\left(\boldsymbol{s}_{{c}},\,\,\boldsymbol{n}_{{c}},\,\,\boldsymbol{a}_{{c}}\right) \end{equation}

where $\boldsymbol{s}_{i}$, $\boldsymbol{n}_{i}$, $\boldsymbol{a}_{i}$ ($i = c, d$) are all $3\times 1$ vectors.

Figure 8. The block diagram of the PID model-based inverse kinematics control.

We define the position error $\boldsymbol{\varepsilon }_{{P}}$ and the orientation error $\boldsymbol{\varepsilon }_{{O}}$ as:

(6) \begin{equation} \boldsymbol{\varepsilon }_{{P}}=\boldsymbol{P}_{{d}}-\boldsymbol{P}_{{c}} \end{equation}
(7) \begin{equation} \boldsymbol{\varepsilon }_{{O}}=\frac{1}{2}\left(\boldsymbol{s}_{{c}}\times \boldsymbol{s}_{{d}}+\boldsymbol{n}_{{c}}\times \boldsymbol{n}_{{d}}+\boldsymbol{a}_{{c}}\times \boldsymbol{a}_{{d}}\right) \end{equation}

For this control model, the conditions indicating completion of the inverse kinematics solution are as follows:

(8) \begin{equation} \dot{\boldsymbol{\varepsilon }}_{{P}}+K_{{P}}\boldsymbol{\varepsilon }_{{P}}=0,\quad K_{{P}}\gt 0 \end{equation}
(9) \begin{equation} \dot{\boldsymbol{\varepsilon }}_{{O}}+K_{{O}}\boldsymbol{\varepsilon }_{{O}}=0,\quad K_{{O}}\gt 0 \end{equation}

where the differential of rotation error is solved by:

(10) \begin{equation} \dot{\boldsymbol{\varepsilon }}_{{O}}=\boldsymbol{L}^{\rm{T}}\boldsymbol{\omega }_{{d}}-\boldsymbol{L}\boldsymbol{\omega }_{{c}} \end{equation}
(11) \begin{equation} \boldsymbol{L}=-\frac{1}{2}\left(\hat{\boldsymbol{s}}_{{c}}\hat{\boldsymbol{s}}_{{d}}+\hat{\boldsymbol{n}}_{{c}}\hat{\boldsymbol{n}}_{{d}}+\hat{\boldsymbol{a}}_{{c}}\hat{\boldsymbol{a}}_{{d}}\right) \end{equation}

where an arbitrary $3\times 1$ vector $\boldsymbol{\sigma }=(\sigma _{1},\ \sigma _{2},\ \sigma _{3})^{\rm{T}}$ can be converted to an antisymmetric matrix $\hat{\boldsymbol{\sigma }}$ as follows:

(12) \begin{equation} \hat{\boldsymbol{\sigma }}=\left[\begin{array}{{ccc}} 0 & -\sigma _{3} & \sigma _{2}\\ \sigma _{3} & 0 & -\sigma _{1}\\ -\sigma _{2} & \sigma _{1} & 0 \end{array}\right] \end{equation}

$\boldsymbol{J}^{\textbf{+}}$ is the pseudo-inverse of the Jacobian matrix of the robot leg, which relates the leg-end velocity to the joint velocities:

(13) \begin{equation} \dot{\boldsymbol{q}}=\boldsymbol{J}^{+}\dot{\boldsymbol{X}}=\boldsymbol{J}^{+}\left[\dot{\boldsymbol{P}},\,\,\boldsymbol{\omega }\right]^{\textrm{T}} \end{equation}

In gait planning, the trajectory generation module matches the orientation information of the leg-end for all trajectory points and footholds. Combined with the positions of these points, the PID inverse kinematics model can then accurately solve for the angle of each leg joint. However, since this model employs successive approximation via loop calculations, it takes a long time and is not suitable for large-scale calculations. Fortunately, in the optimized GPG, only a small number of points filtered by the decision module are sent to it for further computation.
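A minimal closed-loop sketch of this iteration, following Eqs. (6)-(13), is given below. The forward_kinematics and jacobian routines, the gains, and the convergence tolerances are placeholders chosen for illustration, not the values used in the paper.

```python
import numpy as np

# Iterate the joint angles until the leg-end pose converges to the target
# (P_d, A_d): position error from Eq. (6), orientation error from Eq. (7),
# proportional correction mapped to joint space through the pseudo-inverse
# Jacobian as in Eq. (13).
def pid_ik(q, P_d, A_d, forward_kinematics, jacobian,
           K_p=10.0, K_o=10.0, dt=1e-3, iters=1000, tol=1e-5):
    for _ in range(iters):
        P_c, A_c = forward_kinematics(q)                  # current position / orientation
        eps_p = P_d - P_c                                 # Eq. (6)
        eps_o = 0.5 * sum(np.cross(A_c[:, k], A_d[:, k]) for k in range(3))  # Eq. (7)
        if np.linalg.norm(eps_p) < tol and np.linalg.norm(eps_o) < tol:
            break
        x_dot = np.hstack((K_p * eps_p, K_o * eps_o))     # proportional correction
        q = q + np.linalg.pinv(jacobian(q)) @ x_dot * dt  # Eq. (13)
    return q
```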

6. Experiment and simulation of the optimized GPG

6.1. Experiment of a real-time single-leg trajectory planning

Consider the case of a quadrupedal robot that changes its velocity during successive strides. In this case, the gait trajectory of the leg changes with the robot’s locomotion speed. The GPG must ensure rapid and stable gait planning, which also tests the performance of the DT and RF models. Therefore, we designed the following single-leg experiment to test and observe the performance of the GPG in real-time gait trajectory generation.

As depicted in Fig. 9, our experimental hardware comprised a computer housing a virtual model of the GPG, a servo controller, a specialized single-leg platform, and a motion capture system. In this experiment, the GPG generates foot trajectories corresponding to different simulated gait speeds. It updates the leg-end trajectory, based on a Bezier curve, at each gait cycle, and the computer then transmits the control signal to the servo controller so that the leg can alter the stride velocity during consecutive movements. While the leg moves according to the gait trajectories generated by the GPG, the motion capture camera recognizes the position of the leg-end and helps verify the speed and stability of the system under real-time instructions. The following is a detailed account of the experimental procedure.

Figure 9. The schematic illustration of the experimental scenario.

During the first gait cycle, the GPG generates an initial Bezier trajectory (shown in Fig. 10), and as shown in Fig. 11, a Bezier leg-end trajectory is composed of a support phase and a swing phase [Reference Hyun, Seok, Lee and Kim11]. The support phase curve is an arc, while the swing phase is determined by 12 control points (all control points are in the workspace of the leg and lie on the same plane). For a swing phase $S^{\textrm{swing}}$, the corresponding curve can be parameterized by:

(14) \begin{equation} P^{\text{swing}}\left(t\right)=P^{\text{swing}}\left(S^{\text{swing}}\left(t\right)\right)=\sum _{k=0}^{n}p_{k}B_{k}^{n}\left(S^{\text{swing}}\left(t\right)\right) \end{equation}
(15) \begin{equation} v\left(t\right)^{\text{swing}}=\frac{\textrm{d}P}{\textrm{d}S^{\text{swing}}}\frac{\textrm{d}S^{\text{swing}}}{\textrm{d}t}=\frac{1}{\hat{T}_{\text{swing}}}\frac{\textrm{d}P}{\textrm{d}S^{\text{swing}}} \end{equation}

where $P^{\text{swing}}$ is the swing position, $v(t)^{\text{swing}}$ is the swing speed, $B_{k}^{n}(S^{\text{swing}}(t))$ is the Bernstein polynomial of degree n (in this experiment n = 11), $p_{k}\in \mathcal{R}^{2}$ is the kth control point with $k\in \{0,1,\ldots,11\}$, and $\hat{T}_{\text{swing}}$ is the leg swing period. After the trajectory equation is obtained, the spatial points on the trajectory can be obtained by interpolation according to the required accuracy.
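The swing curve of Eq. (14) can be evaluated with a few lines of Python, as sketched below; the control-point values are placeholders, and only the degree n = 11 and the normalized phase parameter follow the setup above.

```python
import numpy as np
from math import comb

# Evaluate the Bezier swing curve of Eq. (14) at a normalized phase s in [0, 1]
# using the Bernstein basis of degree n = len(control_points) - 1 (here 11).
def bezier_swing(control_points, s):
    n = len(control_points) - 1
    return sum(comb(n, k) * s**k * (1 - s)**(n - k) * p
               for k, p in enumerate(control_points))

# Usage: sample the swing phase at 100 points.
# cps = np.array([...])   # 12 control points inside the leg workspace (placeholder)
# swing = np.array([bezier_swing(cps, s) for s in np.linspace(0, 1, 100)])
```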

Figure 10. The logical block diagram of the experiment.

Figure 11. The leg-end’s trajectory based on Bezier curve at different speeds.

While the leg is executing the current gait, the GPG generates a new foot trajectory for the next gait cycle. As shown in Fig. 11, in this experiment we gradually slow down the leg motion by gradually increasing the distance between $p_{1}$ and $p_{10}$. The decision module in the GPG then uses the DT model to quickly determine whether all points on the trajectory are in the workspace, while the RF model determines whether the two control points $p_{1}$ and $p_{10}$ are also in the workspace. If the decision module predicts that all those points are in the workspace (as cluster A or B), the GPG directly updates the trajectory for the next gait cycle; if the decision module predicts that a control point is next to the boundary of the workspace (as cluster B but near cluster C), it sends a hint or a warning signal to the gait generation module and then updates the trajectory. The gait generation module can adjust the gait update parameters (such as the update frequency) based on warnings. If the decision module predicts that there are spatial points outside or on the boundary of the workspace (as cluster C, F, or N), the GPG retains the previous trajectory and stops updating the gait.

During the experiment, the positions of the control points, the real and ideal Bezier trajectories, the duration of each gait cycle, and the warning signals are recorded. The experimental procedure and results are shown in Figs. 12 and 13. In the 1st to 4th cycles, the GPG predicted that $p_{1}$ and $p_{10}$ were both in the workspace, so it directly updated the trajectory. In the 5th to 10th cycles, $p_{1}$ was predicted to be a B-cluster point, and the RF model predicted that $p_{1}$ was close to the boundary, so $p_{1}$ was marked and a hint signal was sent to the gait generation module. Similarly, from the 11th to the 13th cycle, $p_{10}$ was marked. In the 14th cycle, the decision module determined that $p_{1}$ had already reached the workspace boundary, so it released a warning signal. In the 15th cycle, $p_{1}$ and $p_{10}$ were predicted to be out of the workspace, so the GPG stopped updating the gait and adopted the trajectory used in the 14th cycle. From the 14th to the 19th cycle, the gait trajectories are therefore the same. Apart from experimental error caused by the communication delay between modules, the accuracy of the servos, noise, and the error of the motion capture system, the leg-end’s real trajectory recorded by the motion capture system almost coincided with the generated Bezier curve. These results demonstrate the rapidity and stability of the GPG in real-time computation.

Figure 12. The partial sequence diagram of the experiment. (a) t = 17.5 s, (b) t = 19.2 s, (c) t = 28.0 s, (d) t = 32.7 s, (e) t = 37.5 s, (f) t = 43.6 s, (g) t = 46.4 s, (h) t = 55.1 s.

Figure 13. The gait trajectories of the leg-end in the experiment. (a) The ideal trajectory, (b) the real trajectory.

6.2. Simulation of the locomotion of a virtual quadrupedal robot

The simulation is carried out with the virtual quadruped robot prototype in Fig. 2. It is mainly used to test the prediction speed and accuracy of our GPG when faced with large-scale data in continuous tasks. We designed the following four groups of simulations:

Group 1: The following task scenario is built: the quadruped robot walks on unknown ground and the GPG uses the CPG pattern to achieve a stable gait. In front of the robot’s moving direction, an obstacle (a hollow) is set in the ground. The robot is equipped with a depth sensor, but it does not process depth information while walking. The robot’s leg-end trajectory uses a simple quadratic swing trajectory and a straight-line support trajectory (shown in Fig. 15a). When it encounters the obstacle, it does not take evasive action. The simulation process is shown in Fig. 16a.

Group 2: With the same scenario as in Group 1, in Group 2 the robot uses the depth sensor to detect ground conditions. Based on the information collected from the depth sensor, the GPG is instructed to plan a safe and stable gait. The logic diagram of the simulation is shown in Fig. 14. First, the GPG generates an initial trajectory, while the depth sensor obtains the point cloud information of the foothold area on the ground. In Fig. 2, the robot frame is $x_{r}$-$y_{r}$-$z_{r}$, the depth sensor frame is $x_{\textrm{DS}}$-$y_{\textrm{DS}}$-$z_{\textrm{DS}}$, the leg-end frame is $x_{t}$-$y_{t}$-$z_{t}$, and the leg base frame is $x_{o}$-$y_{o}$-$z_{o}$. Spatial points can be transformed between these frames through the corresponding transformation matrices. This simulation mainly uses:

(16) \begin{equation} \boldsymbol{P}_{\textrm{Leg}}^{\text{cloud}}=\boldsymbol{T}_{\textrm{Leg}}^{\text{Robot}}\boldsymbol{P}_{\text{Robot}}^{\text{cloud}}=\boldsymbol{T}_{\textrm{Leg}}^{\text{Robot}}\boldsymbol{T}_{\text{Robot}}^{\textrm{DS}}\boldsymbol{P}_{\textrm{DS}}^{\text{cloud}} \end{equation}

where $\boldsymbol{P}_{\textrm{DS}}^{\text{cloud}}, \boldsymbol{P}_{\text{Robot}}^{\text{cloud}}, \boldsymbol{P}_{\textrm{Leg}}^{\text{cloud}}$ are the positions of the point cloud in the depth camera, robot, and leg frames, respectively; $\boldsymbol{T}_{\text{Robot}}^{\textrm{DS}}$ is the transformation matrix from the depth camera to the robot; and $\boldsymbol{T}_{\textrm{Leg}}^{\text{Robot}}$ is the transformation matrix from the robot to the leg. When the point cloud of the environment is transferred to the GPG, the GPG directly uses the PID inverse kinematics model to compute whether a pre-selected foothold area is reliable (i.e., whether the Cartesian positions of the point cloud are reachable by the leg-end). For all the points in the area, the PID model should find their inverse solutions within a control period. If an obstacle is encountered, the GPG determines an adjusted foothold point based on the geometry of the current foothold area and generates a fresh leg-end trajectory based on its computed results (shown in Fig. 15b). The simulation process is shown in Fig. 16b.
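The chained transformation of Eq. (16) amounts to multiplying homogeneous coordinates by the two transforms, as in the short sketch below; the 4x4 matrices are assumed to come from the robot's calibration and kinematics, and the function name is illustrative.

```python
import numpy as np

# Eq. (16): map the depth-sensor point cloud (N x 3) into the leg base frame
# via the sensor-to-robot and robot-to-leg homogeneous transforms (4 x 4).
def cloud_to_leg_frame(cloud_ds, T_robot_ds, T_leg_robot):
    homog = np.hstack((cloud_ds, np.ones((len(cloud_ds), 1))))   # N x 4
    return (T_leg_robot @ T_robot_ds @ homog.T).T[:, :3]
```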

Figure 14. The logical block diagram of the Group 2, Group 3, and Group 4 simulation.

Figure 15. The leg-end’s trajectories in simulation. (a) Trajectory on normal ground, (b) adjusted trajectory when encountering obstacle.

Figure 16. The sequence diagram of the simulation. (a) Group 1, (b) Group 2, (c) Group 3, (d) Group 4.

Group 3: This is the control group for Group 2. Since the performance of the KNN model in Section 4 has potential applications, it needs to be tested. In this group, it uses the KNN model to help the GPG predict whether the pre-selected foothold area is reliable. If the area is reliable (most of the points in the area are in the workspace), the GPG will perform the original trajectory; if the pre-selected foothold area is unreliable (most of the points in the area belong to cluster C, F, or N), the GPG will reselect the foothold according to the point cloud and readjust the leg-end’s trajectory. The simulation process is shown in Fig. 16c.

Group 4: In this group, to verify the method proposed in this paper, the virtual robot calls the decision module in the optimized GPG and uses the DT model to predict whether the pre-selected foothold area is reliable (the discrimination criterion is the same as in Group 3). This group is compared with Group 1 to verify that the decision module can adjust the trajectory in real time. The simulation process is shown in Fig. 16d.

The simulation environments for all groups were built using CoppeliaSim. The computer processor is an Intel Core i5-12400F CPU (4.4 GHz, 6 cores and 12 threads). The GPG, motion controller, and PID model were programmed in MATLAB. During the simulation, time is counted, point clouds detected by the depth sensor are collected, and the actual walking gait is recorded. The above simulation is repeated 15 times; the corresponding results are shown in Figs. 17, 18, and 19.

Figure 17. The sequence diagram of the simulation. (a) Group 1, (b) Group 2, (c) Group 3, (d) Group 4.

Figure 18. The boxplot of time taken by GPG to process the point clouds. (a) Using the PID inverse kinematics model in Group 2, (b) using the KNN model in Group 3, (c) using the decision module (with DT model) in Group 4.

Figure 19. The point cloud data processing results by GPG. (a) Using the PID inverse kinematics model in Group 2, (b) using the KNN model in Group 3, (c) using the decision module (with DT model) in Group 4.

From Figs. 16 and 17, it can be found that the GPG is able to stop the robot in front of the hollow using the KNN, DT, and PID models. However, it is clear that in Group 2, the speed at which the PID model processes the point cloud data severely affects the locomotion continuity of the quadrupedal robot. In contrast to Group 2, the KNN model used in Group 3 is considerably faster but still has some latency. It is important to note that the DT model in the decision module used in Group 4 did not affect the robot’s normal walk. In Fig. 18, we calculate in detail the time taken by Groups 2, 3, and 4 to process the point cloud data with 1024 points. The prediction speed of the DT model in the optimized GPG is about 100 times faster than the processing speed of the PID model and about 4 times faster than that of the KNN model. In addition, in Fig. 19a, it is found that using the PID model can only divide the point cloud data of the ground into reachable and unreachable points. In contrast, using the KNN and DT models can quickly divide the points into multiple clusters according to their spatial distribution characteristics (as shown in Fig. 19b, c). However, the DT model produces predictions with a more distinct hierarchy of spatial distributions for the different clusters. In this respect, the DT model can provide reliable additional information for gait planning.

Overall, with the DT and RF workspace models, the optimized GPG performs well in both the simulations and the experiment.

7. Discussion

Combining the functionality of the optimized GPG based on the DT and RF models, a SWOT analysis and discussion is carried out in this study.

Strengths: (1) The ILDE data labeling algorithm proposed in this paper is easy to understand and widely adaptable to various legged robots. (2) The DT model has low time complexity and is good at rapidly processing a large amount of data. It does not affect the task processing efficiency of the robot’s GPG and facilitates the optimization of tasks such as real-time gait planning. (3) The DT and RF models have excellent multi-output prediction ability, which provides richer prediction information and a basis for extending other functions of the optimized GPG. (4) The DT and RF models are modest in size and only need to be built once, making them easy to reuse in embedded systems.

Weaknesses: (1) Prediction error increases if too many clusters are predicted. (2) When the number of samples in each cluster is imbalanced, the predictions of the DT model will be biased toward the clusters with more samples, which causes errors. (3) When there are too many points in the original data set, using the proposed ILDE algorithm to generate the training set consumes additional computing resources and time.

Opportunities: (1) Leg mechanisms with different joint numbers and sizes can adopt the proposed method to generate their own workspace models described by supervised learning algorithms. (2) A large number of legged robots currently use structured-light or laser sensors, and the proposed method is suitable for processing point cloud data with a large number of spatial points. (3) Bionic robots require real-time, high-speed gait or leg-end trajectory planning. (4) Robot-embedded control systems require the algorithm code itself to be lightweight.

Threats: (1) The updating and maintenance of the workspace model need to be implemented in an external computer. (2) The optimization method in this paper is suitable for rapid screening of a large number of spatial points and rapid warning of leg-end trajectory changes, but it is not suitable for accurate determination of workspace boundary points.

First, we analyze the weaknesses and threats of the method. If a large amount of point cloud data is initially processed, the error in the prediction result of the workspace model is within an acceptable range. If the DT model is used for tasks that change foot trajectory points, the strategy proposed in this paper of using the PID algorithm for secondary verification of points predicted as clusters A and B can effectively improve system safety. Running the ILDE algorithm to generate the training set requires significant computational resources, so external computers with stronger performance can be used to meet this demand. As for the workspace needing to be maintained and updated externally, the update frequency is low, which is acceptable in practical applications. Moreover, our proposed method also has obvious advantages: it focuses on solving the rough gait planning problem and adopts classical supervised learning algorithms to optimize the GPG of the robot in an innovative way. The models are fast, their sizes are modest, and their accuracy meets the needs of preliminary data processing in the GPG.

8. Conclusion

In this paper, we propose an optimized GPG based on DT and RF models of the robot’s leg workspace. The framework of the GPG and the modules associated with it in the robot system are illustrated. To train the leg workspace model described by DT and RF used in the GPG, the original data collection method is introduced, and an ILDE data processing algorithm is presented. Among the labeled points, clusters A, B, C, F, and N are used to describe the position relation with respect to the workspace. The performance of the trained DT and RF models is evaluated. Based on distributional features, these models can easily predict the location relation between a spatial point and the leg workspace. Similar multi-class classification supervised learning models, such as SVM and KNN models, are also trained and compared with them. The DT model takes only 0.00011 s to process a sample, while the RF model can give the prediction probability. Moreover, the DT model is only 1.87 MB in size, which is friendly for embedded control. Finally, the optimized GPG is tested in a real-time single-leg trajectory planning experiment and an unknown terrain emergency handling simulation of a virtual quadrupedal robot. According to the experiment and simulation, the GPG performs well in assisting tasks that require large-scale data processing, such as leg-end trajectory planning and environmental point cloud analysis. Compared to using the PID model alone, our proposed optimized GPG with the decision module is about 100 times faster at processing the environment point cloud. All the results and the SWOT analysis demonstrate the potential applications of the optimized GPG for quadrupedal robot control.

Author contributions

Y.W. designed the study, collected the data, and wrote the manuscript. S.G. and Z.Y. analyzed the data and provided critical revisions to the manuscript. P.W. assisted with data collection. L.N. and M.S. assisted with manuscript preparation.

Financial support

This work was supported by the National Natural Science Foundation of China (Grant No. 52275004).

Competing interests

The authors declare that no competing interests exist.

Ethical approval

Not applicable.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/S0263574723001261

Appendix I

References

Nguyen, Q., Powell, M. J., Katz, B., Carlo, J. D. and Kim, S., "Optimized Jumping on the MIT Cheetah 3 Robot," In: 2019 International Conference on Robotics and Automation (ICRA) (2019) pp. 7448–7454. doi: 10.1109/ICRA.2019.8794449.
Katz, B., Carlo, J. D. and Kim, S., "Mini Cheetah: A Platform for Pushing the Limits of Dynamic Quadruped Control," In: 2019 International Conference on Robotics and Automation (ICRA) (2019) pp. 6295–6301. doi: 10.1109/ICRA.2019.8793865.
Arm, P., Zenkl, R., Barton, P., Beglinger, L., Dietsche, A., Ferrazzini, L., Hampp, E., Hinder, J., Huber, C., Schaufelberger, D., Schmitt, F., Sun, B., Stolz, B., Kolvenbach, H. and Hutter, M., "SpaceBok: A Dynamic Legged Robot for Space Exploration," In: 2019 International Conference on Robotics and Automation (ICRA) (2019) pp. 6288–6294. doi: 10.1109/ICRA.2019.8794136.
Hutter, M., Gehring, C., Lauber, A., Gunther, F., Bellicoso, C. D., Tsounis, V., Fankhauser, P., Diethelm, R., Bachmann, S., Bloesch, M., Kolvenbach, H., Bjelonic, M., Isler, L. and Meyer, K., "ANYmal - toward legged robots for harsh environments," Adv. Robot. 31(17), 918–931 (2017). doi: 10.1080/01691864.2017.1378591.
Buchanan, R., Wellhausen, L., Bjelonic, M., Bandyopadhyay, T., Kottege, N. and Hutter, M., "Perceptive whole-body planning for multilegged robots in confined spaces," J. Field Robot. 38(1), 68–84 (2020). doi: 10.1002/rob.21974.
Fukuoka, Y., Kimura, H., Hada, Y. and Takase, K., "Adaptive Dynamic Walking of a Quadruped Robot 'Tekken' on Irregular Terrain Using a Neural System Model," In: 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422) (2003) pp. 2037–2042. doi: 10.1109/ROBOT.2003.1241893.
Kimura, H. and Fukuoka, Y., "Biologically Inspired Adaptive Dynamic Walking in Outdoor Environment Using a Self-Contained Quadruped Robot: 'Tekken2'," In: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566) (2004) pp. 986–991. doi: 10.1109/IROS.2004.1389481.
Raibert, M. H. and Tello, E. R., "Legged robots that balance," IEEE Expert 1(4), 89 (1986). doi: 10.1109/MEX.1986.4307016.
Yang, J.-M., "Two-phase discontinuous gaits for quadruped walking machines with a failed leg," Robot. Auton. Syst. 56(9), 728–737 (2008). doi: 10.1016/j.robot.2008.01.002.
Omori, Y., Kojio, Y., Ishikawa, T., Kojima, K., Sugai, F., Kakiuchi, Y., Okada, K. and Inaba, M., "Autonomous Safe Locomotion System for Bipedal Robot Applying Vision and Sole Reaction Force to Footstep Planning," In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2019) pp. 4891–4898. doi: 10.1109/IROS40897.2019.8968196.
Hyun, D. J., Seok, S., Lee, J. and Kim, S., "High speed trot-running: Implementation of a hierarchical controller using proprioceptive impedance control on the MIT Cheetah," Int. J. Robot. Res. 33(11), 1417–1445 (2014). doi: 10.1177/0278364914532150.
Zeng, X., Zhang, S., Zhang, H., Li, X., Zhou, H. and Fu, Y., "Leg trajectory planning for quadruped robots with high-speed trot gait," Appl. Sci. 9(7), 1508 (2019). doi: 10.3390/app9071508.
Fankhauser, P., Bjelonic, M., Bellicoso, C. D., Miki, T. and Hutter, M., "Robust Rough-Terrain Locomotion with a Quadrupedal Robot," In: 2018 IEEE International Conference on Robotics and Automation (ICRA) (2018) pp. 5761–5768. doi: 10.1109/ICRA.2018.8460731.
Tieck, J. C. V., Rutschke, J., Kaiser, J., Schulze, M., Buettner, T., Reichard, D., Roennau, A. and Dillmann, R., "Combining Spiking Motor Primitives with a Behaviour-Based Architecture to Model Locomotion for Six-Legged Robots," In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2019) pp. 4161–4168. doi: 10.1109/IROS40897.2019.8968128.
Chen, Z. and Gao, F., "Time-optimal trajectory planning method for six-legged robots under actuator constraints," Proc. Inst. Mech. Eng. C J. Mech. Eng. Sci. 233(14), 4990–5002 (2019). doi: 10.1177/0954406219833077.
Kim, Y., Son, B. and Lee, D., "Learning multiple gaits of quadruped robot using hierarchical reinforcement learning," arXiv e-prints (2021). doi: 10.48550/arXiv.2112.04741.
Shao, Y., Jin, Y., Liu, X., He, W., Wang, H. and Yang, W., "Learning free gait transition for quadruped robots via phase-guided controller," IEEE Robot. Autom. Lett. 7(2), 1230–1237 (2022). doi: 10.1109/LRA.2021.3136645.
Li, X., Gao, H., Li, J., Wang, Y. and Guo, Y., "Hierarchically planning static gait for quadruped robot walking on rough terrain," J. Robot. 2019, 1–12 (2019). doi: 10.1155/2019/3153195.
Takei, Y., Morishita, K., Tazawa, R., Katsuya, K. and Saito, K., "Non-programmed gait generation of quadruped robot using pulse-type hardware neuron models," Artif. Life Robot. 26(1), 109–115 (2021). doi: 10.1007/s10015-020-00637-z.
Gupta, K. C., "On the nature of robot workspace," Int. J. Robot. Res. 5(2), 112–121 (1986). doi: 10.1177/027836498600500212.
Kumar, A. and Waldron, K. J., "The workspaces of a mechanical manipulator," J. Mech. Des. 103(3), 665–672 (1981). doi: 10.1115/1.3254968.
Ricard, R. and Gosselin, C. M., "On the determination of the workspace of complex planar robotic manipulators," J. Mech. Des. 120(2), 269–278 (1998). doi: 10.1115/1.2826968.
Snyman, J. A., du Plessis, L. J. and Duffy, J., "An optimization approach to the determination of the boundaries of manipulator workspaces," J. Mech. Des. 122(4), 447–456 (2000). doi: 10.1115/1.1289388.
Cao, Y., Qi, S., Lu, K., Zang, Y. and Yang, G., "An Integrated Method for Workspace Computation of Robot Manipulator," In: 2009 International Joint Conference on Computational Sciences and Optimization (2009) pp. 309–312. doi: 10.1109/CSO.2009.161.
Benbrahim, H. and Franklin, J. A., "Biped dynamic walking using reinforcement learning," Robot. Auton. Syst. 22(3), 283–302 (1997). doi: 10.1016/S0921-8890(97)00043-2.
Wang, S., Chaovalitwongse, W. and Babuska, R., "Machine learning algorithms in bipedal robot control," IEEE Trans. Syst. Man Cybern. C Appl. Rev. 42(5), 728–743 (2012). doi: 10.1109/TSMCC.2012.2186565.
Hwangbo, J., Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V. and Hutter, M., "Learning agile and dynamic motor skills for legged robots," Sci. Robot. 4(26) (2019). doi: 10.1126/scirobotics.aau5872.
Pratt, J., Chew, C.-M., Torres, A., Dilworth, P. and Pratt, G., "Virtual model control: An intuitive approach for bipedal locomotion," Int. J. Robot. Res. 20(2), 129–143 (2001). doi: 10.1177/02783640122067309.
Song, Y., Zuo, J., Wu, J., Liu, Z. and Li, Z., "Robot Perceptual Classification Method Based on Mixed Features of Decision Tree and Random Forest," In: 2021 IEEE 2nd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE) (2021) pp. 919–922. doi: 10.1109/ICBAIE52039.2021.9389973.
Zhang, M.-L. and Zhou, Z.-H., "ML-KNN: A lazy learning approach to multi-label learning," Pattern Recogn. 40(7), 2038–2048 (2007). doi: 10.1016/j.patcog.2006.12.019.
Chih-Wei, H. and Chih-Jen, L., "A comparison of methods for multiclass support vector machines," IEEE Trans. Neural Netw. 13(2), 415–425 (2002). doi: 10.1109/72.991427.
Liu, H., Song, X., Nanayakkara, T., Seneviratne, L. D. and Althoefer, K., "A Computationally Fast Algorithm for Local Contact Shape and Pose Classification Using a Tactile Array Sensor," In: 2012 IEEE International Conference on Robotics and Automation (2012) pp. 1410–1415. doi: 10.1109/ICRA.2012.6224872.
Ester, M., Kriegel, H.-P., Sander, J. and Xu, X., "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise," In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (AAAI Press, Portland, OR, 1996) pp. 226–231.
Xu, X., Ester, M., Kriegel, H.-P. and Sander, J., "A Distribution-Based Clustering Algorithm for Mining in Large Spatial Databases," In: Proceedings 14th International Conference on Data Engineering (1998) pp. 324–331. doi: 10.1109/ICDE.1998.655795.
Breiman, L., Classification and Regression Trees (Routledge, Boca Raton, 1984). doi: 10.1201/9781315139470.
Breiman, L., "Random forests," Mach. Learn. 45(1), 5–32 (2001). doi: 10.1023/A:1010933404324.
Luh, J., Walker, M. and Paul, R., "Resolved-acceleration control of mechanical manipulators," IEEE Trans. Autom. Control 25(3), 468–474 (1980). doi: 10.1109/TAC.1980.1102367.
List of figures and tables

Figure 1. The partial robot system framework with the GPG.
Figure 2. The prototype of a quadruped robot equipped with a depth camera (left) and the model of its right front leg (right).
Figure 3. The original data points sampled from the workspace. (a) Sampling interval: 0.4 rad, (b) sampling interval: 0.2 rad, (c) sampling interval: 0.1 rad, (d) sampling interval: 0.05 rad.
Figure 4. Schematic diagram of the preliminary interpolation and labeling process. (a) The interpolation of unlabeled points, (b) the statistical data of nu, (c) the initial labeling process, (d) the initial labeling result.
Figure 5. Schematic diagram and real implementation result of using the ILDE algorithm to redistribute different clusters of points. (a) Data after the initial labeling process, (b) data after the 1st dilation process, (c) data after the 1st erosion process, (d) data after the 2nd dilation process.
Figure 6. The number of points in each cluster in different processes of the ILDE algorithm.
Table I. Characteristics comparison of supervised machine learning algorithms commonly used for classification tasks.
Table II. The parameters of the DecisionTreeClassifier function.
Table III. The parameters of the RandomForestClassifier function.
Figure 7. Confusion matrix of the different supervised machine learning models. (a) SVM, (b) KNN, (c) DT, (d) RF.
Table IV. Evaluation report of the SVM and KNN models.
Table V. Evaluation report of the DT and RF models.
Figure 8. The block diagram of the PID model-based inverse kinematics control.
Figure 9. The schematic illustration of the experimental scenario.
Figure 10. The logical block diagram of the experiment.
Figure 11. The leg-end's trajectory based on a Bezier curve at different speeds.
Figure 12. The partial sequence diagram of the experiment. (a) t = 17.5 s, (b) t = 19.2 s, (c) t = 28.0 s, (d) t = 32.7 s, (e) t = 37.5 s, (f) t = 43.6 s, (g) t = 46.4 s, (h) t = 55.1 s.
Figure 13. The gait trajectories of the leg-end in the experiment. (a) The ideal trajectory, (b) the real trajectory.
Figure 14. The logical block diagram of the Group 2, Group 3, and Group 4 simulation.
Figure 15. The leg-end's trajectories in simulation. (a) Trajectory on normal ground, (b) adjusted trajectory when encountering an obstacle.
Figure 16. The sequence diagram of the simulation. (a) Group 1, (b) Group 2, (c) Group 3, (d) Group 4.
Figure 17. The sequence diagram of the simulation. (a) Group 1, (b) Group 2, (c) Group 3, (d) Group 4.
Figure 18. The boxplot of the time taken by the GPG to process the point clouds. (a) Using the PID inverse kinematics model in Group 2, (b) using the KNN model in Group 3, (c) using the decision module (with DT model) in Group 4.
Figure 19. The point cloud data processing results by the GPG. (a) Using the PID inverse kinematics model in Group 2, (b) using the KNN model in Group 3, (c) using the decision module (with DT model) in Group 4.
