262 Handbook of Big Data
Korattikara, A., Chen, Y., and Welling, M. (2014). Austerity in MCMC land: Cutting
the Metropolis-Hastings budget. In Proceedings of the 31st International Conference on
Machine Learning, pp. 181–189.
Krakowski, K. A., Mahony, R. E., Williamson, R. C., and Warmuth, M. K. (2007). A
geometric view of non-linear online stochastic gradient descent. Author Website.
Kulis, B. and Bartlett, P. L. (2010). Implicit online learning. In Proceedings of the 27th
International Conference on Machine Learning, pp. 575–582.
Lai, T. L. and Robbins, H. (1979). Adaptive design and stochastic approximation. Annals
of Statistics, 7:1196–1221.
Lange, K. (1995). A gradient algorithm locally equivalent to the EM algorithm. Journal of
the Royal Statistical Society. Series B (Methodological), 57(2):425–437.
Le Cun, Y. and Bottou, L. (2004). Large scale online learning. Advances in Neural
Information Processing Systems, 16:217–224.
Lehmann, E. L. and Casella, G. (1998). Theory of Point Estimation, volume 31. Springer,
New York.
Lehmann, E. L. and Casella, G. (2003). Theory of Point Estimation, 2nd edition. Springer,
New York.
Lions, P.-L. and Mercier, B. (1979). Splitting algorithms for the sum of two nonlinear
operators. SIAM Journal on Numerical Analysis, 16(6):964–979.
Liu, Z., Almhana, J., Choulakian, V., and McGorman, R. (2006). Online EM algorithm for
mixture with application to internet traffic modeling. Computational Statistics & Data
Analysis, 50(4):1052–1071.
Ljung, L., Pflug, G., and Walk, H. (1992). Stochastic Approximation and Optimization of
Random Systems, volume 17. Springer Basel AG, Basel, Switzerland.
Moulines, E. and Bach, F. R. (2011). Non-asymptotic analysis of stochastic approximation
algorithms for machine learning. In Advances in Neural Information Processing Systems,
pp. 451–459.
Murata, N. (1998). A statistical study of online learning. In Online Learning and Neural
Networks. Cambridge University Press, Cambridge.
Nagumo, J.-I. and Noda, A. (1967). A learning method for system identification. IEEE
Transactions on Automatic Control, 12(3):282–287.
National Research Council (2013). Frontiers in Massive Data Analysis. National Academies
Press, Washington, DC.
Neal, R. (2011). MCMC using Hamiltonian dynamics. In Handbook of Markov Chain Monte
Carlo, volume 2, pp. 113–162. Chapman & Hall/CRC Press, Boca Raton, FL.
Neal, R. M. and Hinton, G. E. (1998). A view of the EM algorithm that justifies incremental,
sparse, and other variants. In Jordan, M. I. (ed.), Learning in Graphical Models, pp. 355–
368. Springer, Cambridge, MA.
Nemirovski, A., Juditsky, A., Lan, G., and Shapiro, A. (2009). Robust stochastic
approximation approach to stochastic programming. SIAM Journal on Optimization,
19(4):1574–1609.