
Academic Events
Adam-family Methods with Decoupled Weight Decay in Deep Learning
Speaker:
Nachuan Xiao, Doctor, National University of Singapore
Host:
Xin Liu, Professor
Title:
Adam-family Methods with Decoupled Weight Decay in Deep Learning
Time and venue:
9:00-10:00, November 28 (Tuesday), Tencent Meeting: 662-870-148
Abstract:

In this talk, we investigate the convergence properties of a wide class of Adam-family methods for minimizing quadratically regularized nonsmooth nonconvex optimization problems, especially in the context of training nonsmooth neural networks with weight decay. Motivated by the AdamW method, we propose a novel framework for Adam-family methods with decoupled weight decay. Under mild assumptions, we prove that our framework converges with non-diminishing stepsizes for updating the variables. More importantly, we show that our proposed framework asymptotically approximates the SGD method, which offers an explanation for the empirical observation that decoupled weight decay enhances the generalization performance of Adam-family methods. As a practical application of the framework, we propose a novel Adam-family method named Adam with Decoupled Weight Decay (AdamD) and establish its convergence properties under mild conditions. Numerical experiments demonstrate that AdamD outperforms Adam and is comparable to AdamW in terms of both generalization performance and efficiency.
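
For readers unfamiliar with the term, the sketch below contrasts coupled (L2-regularized) weight decay with decoupled, AdamW-style weight decay in a single Adam step. It is a minimal NumPy illustration of the standard Adam and AdamW updates, not the AdamD method or the framework presented in the talk; the function name `adam_step` and all default hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=True):
    """One Adam update with coupled (L2) or decoupled (AdamW-style) weight decay.

    Hypothetical helper for illustration only; it is not AdamD.
    t is the 1-based iteration counter used for bias correction.
    """
    if not decoupled:
        # Coupled: the decay term is folded into the gradient, so it is
        # later rescaled by the adaptive second-moment estimate.
        grad = grad + weight_decay * x
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    step = m_hat / (np.sqrt(v_hat) + eps)
    if decoupled:
        # Decoupled: the decay term bypasses the adaptive rescaling
        # and shrinks the weights directly.
        step = step + weight_decay * x
    return x - lr * step, m, v

# Zero loss gradient, so only the weight decay term moves the weights.
x0 = np.array([1.0, -2.0])
zeros = np.zeros(2)
x_dec, _, _ = adam_step(x0, zeros, zeros, zeros, t=1, lr=0.1,
                        weight_decay=0.1, decoupled=True)
x_cpl, _, _ = adam_step(x0, zeros, zeros, zeros, t=1, lr=0.1,
                        weight_decay=0.1, decoupled=False)
print(x_dec, x_cpl)
```

With a zero loss gradient, the decoupled variant shrinks every coordinate by the same factor `1 - lr * weight_decay`, while the coupled variant passes the decay term through Adam's normalization, so the size of the step no longer scales with the size of the weight.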