BIBLIOGRAPHY
[53] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir
Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper
with convolutions. In Proc. of the IEEE Conference on Computer Vision and Pattern Recog-
nition, pages 1–9, 2015. DOI: 10.1109/cvpr.2015.7298594 9
[54] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network
training by reducing internal covariate shift. ArXiv Preprint ArXiv:1502.03167, 2015.
10
[55] Quentin J. M. Huys, Anthony Cruickshank, and Peggy Seriès. Reward-based learning,
model-based and model-free. In Encyclopedia of Computational Neuroscience, pages 2634–
2641, Springer, 2015. DOI: 10.1007/978-1-4614-6675-8_674 11
[56] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction, MIT Press,
Cambridge, MA, 1998. DOI: 10.1109/tnn.1998.712192 11, 13, 20
[57] Marco Wiering and Martijn Van Otterlo. Reinforcement learning. Adaptation, Learning,
and Optimization, 12:51, 2012. DOI: 10.1007/978-3-642-27645-3
[58] Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, and Anil Anthony Bharath.
Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine, 34(6):26–
38, 2017. DOI: 10.1109/msp.2017.2743240 11
[59] Vijay R. Konda and John N. Tsitsiklis. On actor-critic algorithms. SIAM Journal on Control
and Optimization, 42(4):1143–1166, 2003. DOI: 10.1137/s0363012901385691 12
[60] Christopher John Cornish Hellaby Watkins. Learning from delayed rewards. Ph.D. thesis,
King’s College, Cambridge, 1989. 12
[61] Christopher J. C. H. Watkins and Peter Dayan. Q-learning. Machine Learning, 8(3–
4):279–292, 1992. DOI: 10.1007/bf00992698
[62] Steven J. Bradtke, B. Erik Ydstie, and Andrew G. Barto. Adaptive linear quadratic control
using policy iteration. In Proc. of the American Control Conference, vol. 3, pages 3475–3479,
1994. DOI: 10.1109/acc.1994.735224 12
[63] Gavin A. Rummery and Mahesan Niranjan. On-Line Q-Learning Using Connectionist
Systems, vol. 37, University of Cambridge, Department of Engineering, Cambridge, UK,
1994. 12
[64] Ivo Grondman, Lucian Busoniu, Gabriel A. D. Lopes, and Robert Babuska. A survey of
actor-critic reinforcement learning: Standard and natural policy gradients. IEEE Transac-
tions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(6):1291–1307,
2012. DOI: 10.1109/tsmcc.2012.2218595 12
[65] Leemon Baird. Residual algorithms: Reinforcement learning with function approxima-
tion. In Machine Learning Proceedings, pages 30–37, Elsevier, 1995. DOI: 10.1016/b978-
1-55860-377-6.50013-x
[66] Geoffrey J. Gordon. Stable function approximation in dynamic programming. In Machine
Learning Proceedings, pages 261–268, Elsevier, 1995. DOI: 10.1016/b978-1-55860-377-
6.50040-2
[67] John N. Tsitsiklis and Benjamin Van Roy. Feature-based methods for large scale dy-
namic programming. Machine Learning, 22(1–3):59–94, 1996. DOI: 10.1007/978-0-585-
33656-5_5 12
[68] Ronald J. Williams. Reinforcement-Learning Connectionist Systems. College of Computer
Science, Northeastern University, 1987. 12
[69] R. Williams. A class of gradient-estimation algorithms for reinforcement learning in neu-
ral networks. In Proc. of the International Conference on Neural Networks, pages II–601,
1987.
[70] Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist
reinforcement learning. Machine Learning, 8(3–4):229–256, 1992. DOI: 10.1007/978-1-
4615-3618-5_2 12
[71] Richard S. Sutton, David A. McAllester, Satinder P. Singh, and Yishay Mansour. Policy
gradient methods for reinforcement learning with function approximation. In Advances in
Neural Information Processing Systems, pages 1057–1063, 2000. 12
[72] Martin Riedmiller, Jan Peters, and Stefan Schaal. Evaluation of policy gradient methods
and variants on the cart-pole benchmark. In IEEE International Symposium on Approximate
Dynamic Programming and Reinforcement Learning (ADPRL), pages 254–261, 2007.
DOI: 10.1109/adprl.2007.368196 12
[73] Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lil-
licrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for
deep reinforcement learning. In International Conference on Machine Learning, pages 1928–
1937, 2016. 12
[74] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval
Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement
learning. ArXiv Preprint ArXiv:1509.02971, 2015. 12
[75] Andrew G. Barto and Michael T. Rosenstein. Supervised actor-critic reinforcement learning.
In Handbook of Learning and Approximate Dynamic Programming, 2:359, 2004. 12
[76] Vijay R. Konda and John N. Tsitsiklis. Actor-critic algorithms. In Advances in Neural In-
formation Processing Systems, pages 1008–1014, 2000. DOI: 10.1137/s0363012901385691
12
[77] Michael Nielsen. Neural Networks and Deep Learning. Determination Press, 2015. 13
[78] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning, MIT Press, 2016.
13
[79] Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong. Mathematics for Machine
Learning. Cambridge University Press, 2019. 13
[80] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learn-
ing: Data Mining, Inference, and Prediction, Springer Series in Statistics, Springer, New
York, 2009. DOI: 10.1007/978-0-387-21606-5 13
[81] Francois Chollet. Deep Learning with Python. Manning Publications Co., 2017. 13
[82] Aurélien Géron. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts,
Tools, and Techniques to Build Intelligent Systems. O’Reilly Media, Inc., 2017. 13
[83] Dean A. Pomerleau. ALVINN: An autonomous land vehicle in a neural network. In Advances
in Neural Information Processing Systems 1, pages 305–313, 1989. 15, 17, 24
[84] D. Pomerleau. Neural network vision for robot driving. Intelligent Unmanned Ground
Vehicles, pages 1–22, 1997. DOI: 10.1007/978-1-4615-6325-9_4 15, 17
[85] Gening Yu and I. K. Sethi. Road-following with continuous learning. In Proc. of the Intelligent
Vehicles Symposium, Detroit, MI, 1995. DOI: 10.1109/ivs.1995.528317 16, 17
[86] Urs Muller, Jan Ben, Eric Cosatto, Beat Flepp, and Yann LeCun. Off-road obstacle avoid-
ance through end-to-end learning. In Advances in Neural Information Processing Systems,
pages 739–746, 2006. 16, 17
[87] Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp,
Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, et al.
End to end learning for self-driving cars. ArXiv Preprint ArXiv:1604.07316, 2016. 16,
17, 37
[88] Hesham M. Eraqi, Mohamed N. Moustafa, and Jens Honer. End-to-end deep learn-
ing for steering autonomous vehicles considering temporal dependencies. ArXiv Preprint
ArXiv:1710.03804, 2017. 16, 17
[89] Ardalan Vahidi and Azim Eskandarian. Research advances in intelligent collision avoid-
ance and adaptive cruise control. IEEE Transactions on Intelligent Transportation Systems,
4(3):143–153, 2003. DOI: 10.1109/tits.2003.821292 18
[90] Seungwuk Moon, Ilki Moon, and Kyongsu Yi. Design, tuning, and evaluation of a full-
range adaptive cruise control system with collision avoidance. Control Engineering Practice,
17(4):442–455, 2009. DOI: 10.1016/j.conengprac.2008.09.006 18
[91] Qi Sun. Cooperative adaptive cruise control performance analysis. Ph.D. thesis, Ecole
Centrale de Lille, 2016. 18
[92] Hassan K. Khalil. Nonlinear Systems, 2nd ed., Prentice Hall, NJ, 1996. 18
[93] Dan Wang and Jie Huang. Neural network-based adaptive dynamic surface control for a
class of uncertain nonlinear systems in strict-feedback form. IEEE Transactions on Neural
Networks, 16(1):195–202, 2005. DOI: 10.1109/tnn.2004.839354 18
[94] Xiaohui Dai, Chi-Kwong Li, and Ahmad B. Rad. An approach to tune fuzzy controllers
based on reinforcement learning for autonomous vehicle control. IEEE Transactions on
Intelligent Transportation Systems, 6(3):285–293, 2005. DOI: 10.1109/fuzz.2003.1209417
18, 21
[95] Zhenhua Huang, Xin Xu, Haibo He, Jun Tan, and Zhenping Sun. Parameter-
ized batch reinforcement learning for longitudinal control of autonomous land vehicles.
IEEE Transactions on Systems, Man, and Cybernetics: Systems, pages 1–12, 2017. DOI:
10.1109/tsmc.2017.2712561 19, 21
[96] Xin Chen, Yong Zhai, Chao Lu, Jianwei Gong, and Gang Wang. A learning model
for personalized adaptive cruise control. In Intelligent Vehicles Symposium (IV), IEEE,
pages 379–384, 2017. DOI: 10.1109/ivs.2017.7995748 19, 21
[97] Dongbin Zhao, Zhongpu Xia, and Qichao Zhang. Model-free optimal con-
trol based intelligent cruise control with hardware-in-the-loop demonstration (re-
search frontier). IEEE Computational Intelligence Magazine, 12(2):56–69, 2017. DOI:
10.1109/mci.2017.2670463 19, 21
[98] Hyunmin Chae, Chang Mook Kang, ByeoungDo Kim, Jaekyum Kim, Chung Choo
Chung, and Jun Won Choi. Autonomous braking system via deep reinforcement learn-
ing. In IEEE 20th International Conference on Intelligent Transportation Systems (ITSC),
pages 1–6, 2017. DOI: 10.1109/itsc.2017.8317839 19, 21, 24
[99] Euro NCAP. European new car assessment programme: Test protocol—AEB VRU sys-
tems, 2015. 19
[100] Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. Reinforcement
learning: A survey. Journal of Artificial Intelligence Research, 4:237–285, 1996. DOI:
10.1613/jair.301 20
[101] Dongbin Zhao, Zhaohui Hu, Zhongpu Xia, Cesare Alippi, Yuanheng Zhu, and Ding
Wang. Full-range adaptive cruise control based on supervised adaptive dynamic program-
ming. Neurocomputing, 125:57–67, February 2014. DOI: 10.1016/j.neucom.2012.09.034
20
[102] Bin Wang, Dongbin Zhao, Chengdong Li, and Yujie Dai. Design and implementation
of an adaptive cruise control system based on supervised actor-critic learning. 5th Inter-
national Conference on Information Science and Technology (ICIST), pages 243–248, 2015.
DOI: 10.1109/icist.2015.7288976 20
[103] Rui Zheng, Chunming Liu, and Qi Guo. A decision-making method for autonomous
vehicles based on simulation and reinforcement learning. International Conference on Ma-
chine Learning and Cybernetics, pages 362–369, 2013. DOI: 10.1109/icmlc.2013.6890495
20
[104] Shai Shalev-Shwartz, Nir Ben-Zrihem, Aviad Cohen, and Amnon Shashua. Long-term
planning by short-term prediction. ArXiv Preprint ArXiv:1602.01580, 2016. 22, 26
[105] Wei Xia, Huiyun Li, and Baopu Li. A control strategy of autonomous vehicles based on
deep reinforcement learning. In 9th International Symposium on Computational Intelligence
and Design (ISCID), vol. 2, pages 198–201, IEEE, 2016. DOI: 10.1109/iscid.2016.2054
22, 26
[106] Ahmad El Sallab, Mohammed Abdou, Etienne Perot, and Senthil Yogamani.
End-to-end deep reinforcement learning for lane keeping assist. ArXiv Preprint
ArXiv:1612.04340, 2016. 22, 26
[107] Jiakai Zhang and Kyunghyun Cho. Query-efficient imitation learning for end-to-end
autonomous driving. ArXiv Preprint ArXiv:1605.06450, 2016. 22, 26
[108] Stéphane Ross, Geoffrey Gordon, and Drew Bagnell. A reduction of imitation learning
and structured prediction to no-regret online learning. In Proc. of the 14th International
Conference on Artificial Intelligence and Statistics, pages 627–635, 2011. 22
[109] Yunpeng Pan, Ching-An Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos
Theodorou, and Byron Boots. Agile autonomous driving using end-to-end deep
imitation learning. Proc. of Robotics: Science and Systems, Pittsburgh, PA, 2018. DOI:
10.15607/rss.2018.xiv.056 23, 26
[110] Dequan Wang, Coline Devin, Qi-Zhi Cai, Fisher Yu, and Trevor Darrell. Deep object
centric policies for autonomous driving. ArXiv Preprint ArXiv:1811.05432, 2018. 23, 26
[111] Horia Porav and Paul Newman. Imminent collision mitigation with reinforcement learn-
ing and vision. In 21st International Conference on Intelligent Transportation Systems (ITSC),
pages 958–964, IEEE, 2018. DOI: 10.1109/itsc.2018.8569222 23, 26