arxiv:2110.10828

AdamD: Improved bias-correction in Adam

Published on Oct 22, 2021

Abstract

A modified bias-correction scheme for the Adam optimizer yields smaller gradient updates early in training by retaining the second-moment bias correction while removing the first-moment bias correction.

AI-generated summary

Here I present a small update to the bias-correction term in the Adam optimizer that has the advantage of making smaller gradient updates in the first several steps of training. With the default bias correction, Adam may actually make larger-than-requested gradient updates early in training. By including only the well-justified bias correction of the second-moment gradient estimate, v_t, and excluding the bias correction of the first-moment estimate, m_t, we attain these more desirable gradient update properties in the first series of steps. The sensitivity of the default Adam implementation to the hyperparameters β_1 and β_2 may be due in part to the originally proposed bias-correction procedure and its behavior in early steps.
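To make the difference concrete, the following is a minimal sketch for a single scalar parameter, assuming the update takes the form θ_t = θ_{t-1} − α·m_t / (√v̂_t + ε) with v̂_t = v_t / (1 − β_2^t) and no correction applied to m_t. This is one reading of the abstract, not the paper's reference implementation; the function name `update_magnitudes` and the `correct_first_moment` flag are hypothetical and introduced only for illustration.

```python
import math


def update_magnitudes(grads, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
                      correct_first_moment=True):
    """Return the absolute update size at each step for a scalar parameter.

    correct_first_moment=True  -> standard Adam (both moments bias-corrected).
    correct_first_moment=False -> assumed AdamD-style update (only v_t corrected).
    """
    m = 0.0  # first-moment estimate m_t
    v = 0.0  # second-moment estimate v_t
    sizes = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # The second-moment bias correction is kept in both variants.
        v_hat = v / (1.0 - beta2 ** t)
        # The first-moment correction is the only thing that differs.
        m_used = m / (1.0 - beta1 ** t) if correct_first_moment else m
        sizes.append(abs(lr * m_used / (math.sqrt(v_hat) + eps)))
    return sizes


if __name__ == "__main__":
    grads = [1.0] * 5  # constant gradient, chosen only for illustration
    adam = update_magnitudes(grads, correct_first_moment=True)
    adamd = update_magnitudes(grads, correct_first_moment=False)
    for t, (a, d) in enumerate(zip(adam, adamd), start=1):
        print(f"step {t}: Adam {a:.6f}  AdamD-style {d:.6f}")
```

With a constant gradient, standard Adam takes a step of roughly the full learning rate α from the first iteration, whereas the uncorrected m_t scales the step by (1 − β_1^t), so the update starts near (1 − β_1)·α and grows toward α over the first several steps.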
