
Adam with model exponential moving average is effective for nonconvex optimization
