15.3 Adaptive Learning Rate Algorithms
Below is the outline of this section; for the full text, please refer to the authorized edition of 《智能之门》, Higher Education Press.
15.3.1 AdaGrad
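The full derivation is in the print edition; as a minimal sketch of AdaGrad's update rule [1] (accumulate all past squared gradients, then divide each step by their square root), assuming a scalar parameter and a hand-coded gradient:

```python
def adagrad_step(x, grad, r, lr=0.5, eps=1e-7):
    """One AdaGrad update: r accumulates every past squared gradient,
    so the effective learning rate lr / sqrt(r) only ever shrinks."""
    r = r + grad * grad
    x = x - lr * grad / (r ** 0.5 + eps)
    return x, r

# Toy example (not from the book): minimize f(x) = x^2, gradient 2x.
x, r = 3.0, 0.0
for _ in range(100):
    x, r = adagrad_step(x, 2 * x, r)
```

The monotonically growing accumulator `r` is AdaGrad's known weakness: on long runs the step size decays toward zero, which motivates the decaying averages of the next two methods.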
15.3.2 AdaDelta
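Again only the outline appears here; a sketch of AdaDelta's update rule [2], which replaces AdaGrad's global sum with two decaying RMS averages and needs no global learning rate (variable names and the toy objective are illustrative, not the book's):

```python
def adadelta_step(x, grad, s_g, s_dx, rho=0.95, eps=1e-6):
    """One AdaDelta update: the step size is the ratio of the running
    RMS of past updates to the running RMS of past gradients."""
    s_g = rho * s_g + (1 - rho) * grad * grad            # RMS of gradients
    dx = -(((s_dx + eps) ** 0.5) / ((s_g + eps) ** 0.5)) * grad
    s_dx = rho * s_dx + (1 - rho) * dx * dx              # RMS of updates
    return x + dx, s_g, s_dx

# Toy example: minimize f(x) = x^2. AdaDelta starts with tiny,
# eps-sized steps and accelerates as the update statistics build up.
x, s_g, s_dx = 3.0, 0.0, 0.0
for _ in range(2000):
    x, s_g, s_dx = adadelta_step(x, 2 * x, s_g, s_dx)
```

Note the role of `eps` in the numerator: with `s_dx` initialized to zero it bootstraps the very first steps, which is why AdaDelta's early progress is slow.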
15.3.3 RMSProp (Root Mean Square Propagation)
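A sketch of the RMSProp update rule [3], which keeps AdaGrad's per-parameter scaling but lets the squared-gradient history decay so the effective step size can recover (the toy objective and hyperparameters are illustrative assumptions):

```python
def rmsprop_step(x, grad, s, lr=0.1, rho=0.9, eps=1e-7):
    """One RMSProp update: s is an exponential moving average of
    squared gradients, not an unbounded sum as in AdaGrad."""
    s = rho * s + (1 - rho) * grad * grad
    x = x - lr * grad / (s ** 0.5 + eps)
    return x, s

# Toy example: minimize f(x) = x^2, gradient 2x.
x, s = 3.0, 0.0
for _ in range(100):
    x, s = rmsprop_step(x, 2 * x, s)
```

Because `s` forgets old gradients at rate `rho`, a flat region no longer permanently shrinks the step size, at the cost of reintroducing `lr` as a hyperparameter to tune.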
15.3.4 Adam (Adaptive Moment Estimation)
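A sketch of the Adam update rule [4], combining a momentum-style first moment with RMSProp-style second-moment scaling, plus the paper's bias correction for the zero-initialized moments (defaults follow the paper; the demo learning rate and objective are illustrative):

```python
def adam_step(x, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update at step t (t starts at 1): m is the first
    moment (momentum), v the second moment (squared-gradient EMA)."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    x = x - lr * m_hat / (v_hat ** 0.5 + eps)
    return x, m, v

# Toy example: minimize f(x) = x^2 with a larger-than-default lr.
x, m, v = 3.0, 0.0, 0.0
for t in range(1, 1001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.01)
```

The bias correction matters most early on: without it, `m` and `v` start near zero and the first steps would be artificially small.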
Code Location
ch15, Level3
References
[1] Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul), 2121-2159.
[2] Zeiler, M. D. (2012). ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701.
[3] Tieleman, T., & Hinton, G. (2012). Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning, 4(2), 26-31.
[4] Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.