
Optimal motion planning by reinforcement learning in autonomous mobile vehicles

Published online by Cambridge University Press:  19 May 2011

M. Gómez*
Affiliation:
Departamento de Automática, Escuela Politécnica Superior, Universidad de Alcalá, Campus Universitario, 28871 Alcalá de Henares, Madrid, Spain
R. V. González
Affiliation:
Departamento de Automática, Escuela Politécnica Superior, Universidad de Alcalá, Campus Universitario, 28871 Alcalá de Henares, Madrid, Spain
T. Martínez-Marín
Affiliation:
Departamento de Física, Ingeniería de Sistemas y Teoría de la Señal, Universidad de Alicante, 03080 Alicante, Spain
D. Meziat
Affiliation:
Departamento de Automática, Escuela Politécnica Superior, Universidad de Alcalá, Campus Universitario, 28871 Alcalá de Henares, Madrid, Spain
S. Sánchez
Affiliation:
Departamento de Automática, Escuela Politécnica Superior, Universidad de Alcalá, Campus Universitario, 28871 Alcalá de Henares, Madrid, Spain
*Corresponding author. E-mail: mgomez@aut.uah.es

Summary

This work implements and tests, under real conditions, a new algorithm that combines cell-mapping techniques with reinforcement learning methods to obtain the optimal motion planning of a vehicle subject to kinematic, dynamic and obstacle constraints. The algorithm extends the control adjoining cell mapping technique so that the dynamics of the vehicle are learned instead of being taken from its analytical state equations. It applies a transformation of the cell-to-cell mapping to reduce the time spent in the learning stage. Results from real experiments are reported to show the satisfactory performance of the algorithm.
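The general idea of learning a cell-to-cell map from experience and then planning over it can be illustrated with a minimal sketch. This is not the authors' algorithm: the 1-D state space, the `plant` function, the cell count, the controls and the step-count cost are all hypothetical stand-ins, and the planner is plain Bellman sweeps over the learned model. The only element borrowed from the summary is that transitions are estimated from samples, with each control held until the state leaves its current cell, rather than derived from state equations.

```python
import random
from collections import Counter

N_CELLS = 10            # hypothetical 1-D state space: cells [0,1), [1,2), ...
ACTIONS = (-1, 1)       # hypothetical controls: drive left / drive right
GOAL = N_CELLS - 1      # hypothetical target cell


def plant(x, u):
    """Toy stand-in for the vehicle; the learner only observes its samples."""
    return min(max(x + 0.5 * u, 0.0), N_CELLS - 1e-9)


def learn_model(trials=30, max_steps=10, seed=0):
    """Estimate a cell-to-cell map from experience instead of state equations."""
    rng = random.Random(seed)
    model = {}
    for c in range(N_CELLS):
        for u in ACTIONS:
            nexts, costs = Counter(), []
            for _ in range(trials):
                x, k = c + rng.random(), 0
                # Hold the control until the state leaves the current cell.
                while int(x) == c and k < max_steps:
                    x, k = plant(x, u), k + 1
                if int(x) != c:
                    nexts[int(x)] += 1
                    costs.append(k)
            if nexts:  # pairs that never leave the cell (walls) are skipped
                best_next = nexts.most_common(1)[0][0]
                model[(c, u)] = (best_next, sum(costs) / len(costs))
    return model


def plan(model, sweeps=50):
    """Bellman backups over the learned model; cost is mean elapsed steps."""
    q = {key: 0.0 for key in model}

    def value(c):
        if c == GOAL:
            return 0.0
        return min(q[(c, u)] for u in ACTIONS if (c, u) in q)

    for _ in range(sweeps):
        for (c, u), (nxt, cost) in model.items():
            q[(c, u)] = cost + value(nxt)

    # Greedy policy: cheapest known control in every non-goal cell.
    return {
        c: min((u for u in ACTIONS if (c, u) in q), key=lambda u: q[(c, u)])
        for c in range(N_CELLS)
        if c != GOAL
    }


model = learn_model()
policy = plan(model)

# Follow the greedy policy through the learned map, starting at cell 0.
path, cell = [0], 0
while cell != GOAL:
    cell = model[(cell, policy[cell])][0]
    path.append(cell)
print(path)
```

Because each control is held until the state crosses a cell boundary, every learned transition in this toy example leads to a neighbouring cell, which keeps the learned map small. A real vehicle would need a multi-dimensional state (position, heading, velocity) and obstacle cells, none of which are modelled here.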

Type
Articles
Copyright
Copyright © Cambridge University Press 2011

