15.3 Adaptive Learning Rate Algorithms

Below is the outline of this section. For the full text, please refer to the official edition of 《智能之门》, published by Higher Education Press.

15.3.1 AdaGrad
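The full derivation is in the print edition; as a standalone illustration, here is a minimal scalar sketch of the AdaGrad rule from Duchi et al. [1], which divides the learning rate by the root of the accumulated squared gradients. Variable names and hyperparameter values are illustrative, not taken from the book.

```python
import math

def adagrad_step(theta, grad, r, eta=0.5, eps=1e-7):
    """One AdaGrad update on a scalar parameter.

    r accumulates the sum of all squared gradients, so the effective
    step size eta / sqrt(r) only ever shrinks.
    """
    r = r + grad * grad
    theta = theta - eta / (math.sqrt(r) + eps) * grad
    return theta, r

# Minimize f(x) = x^2 (gradient 2x), starting from x = 5.
x, r = 5.0, 0.0
for _ in range(100):
    x, r = adagrad_step(x, 2.0 * x, r)
```

Because r never decays, the step size decreases monotonically, which is AdaGrad's main weakness on long training runs and the motivation for the next two methods.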

15.3.2 AdaDelta
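AdaDelta [2] replaces AdaGrad's ever-growing accumulator with two exponential moving averages and needs no global learning rate. A minimal scalar sketch, with illustrative hyperparameters:

```python
import math

def adadelta_step(theta, grad, eg2, edx2, rho=0.9, eps=1e-6):
    """One AdaDelta update; note there is no global learning rate.

    eg2  -- running average of squared gradients
    edx2 -- running average of squared parameter updates
    """
    eg2 = rho * eg2 + (1.0 - rho) * grad * grad
    delta = -math.sqrt(edx2 + eps) / math.sqrt(eg2 + eps) * grad
    edx2 = rho * edx2 + (1.0 - rho) * delta * delta
    return theta + delta, eg2, edx2

# Minimize f(x) = x^2; steps start tiny and grow as edx2 warms up.
x, eg2, edx2 = 5.0, 0.0, 0.0
for _ in range(200):
    x, eg2, edx2 = adadelta_step(x, 2.0 * x, eg2, edx2)
```

The ratio of the two running averages keeps the update in the same units as the parameter, which is the paper's argument for dropping the learning rate entirely.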

15.3.3 Root Mean Square Propagation (RMSProp)
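RMSProp [3] keeps AdaGrad's per-step normalization but makes the squared-gradient accumulator an exponential moving average, so old gradients decay instead of piling up forever. A minimal scalar sketch with illustrative hyperparameters:

```python
import math

def rmsprop_step(theta, grad, r, eta=0.1, rho=0.9, eps=1e-7):
    """One RMSProp update: exponentially decayed average of squared
    gradients in r, used to normalize the step."""
    r = rho * r + (1.0 - rho) * grad * grad
    theta = theta - eta / (math.sqrt(r) + eps) * grad
    return theta, r

# Minimize f(x) = x^2, starting from x = 5.
x, r = 5.0, 0.0
for _ in range(100):
    x, r = rmsprop_step(x, 2.0 * x, r)
```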

15.3.4 Adam - Adaptive Moment Estimation
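Adam [4] combines a first-moment (momentum-like) average of the gradient with an RMSProp-style second-moment average, and corrects both for their zero initialization. A minimal scalar sketch, with the paper's default decay rates and an illustrative learning rate:

```python
import math

def adam_step(theta, grad, m, v, t, eta=0.1,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: EMAs of the gradient (m) and of the squared
    gradient (v), each divided by a bias-correction term."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)   # bias-corrected second moment
    theta = theta - eta * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2; t starts at 1 so the corrections are defined.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 101):
    x, m, v = adam_step(x, 2.0 * x, m, v, t)
```

Without the bias correction, m and v would be biased toward zero early in training and the first steps would be too small relative to later ones.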

Code Location

ch15, Level3

References

[1] Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul), 2121-2159.

[2] Zeiler, M. D. (2012). ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701.

[3] Tieleman, T., & Hinton, G. (2012). Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning, 4(2), 26-31.

[4] Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.