lmflow.optim.adan#

Classes#

Adan

Implements a PyTorch variant of Adan.

Functions#

_single_tensor_adan(params, grads, exp_avgs, ...)

_multi_tensor_adan(params, grads, exp_avgs, ...)

Module Contents#

class lmflow.optim.adan.Adan(params, lr=0.001, betas=(0.98, 0.92, 0.99), eps=1e-08, weight_decay=0.0, max_grad_norm=0.0, no_prox=False, foreach: bool = True)[source]#

Bases: torch.optim.optimizer.Optimizer

Implements a PyTorch variant of Adan.

Adan was proposed in Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models. https://arxiv.org/abs/2208.06677

__setstate__(state)[source]#
restart_opt()[source]#
step()[source]#

Performs a single optimization step.
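Each call to step() applies the Adan recurrences from the paper to every parameter. A minimal, dependency-free sketch of one such update for a single scalar parameter (all names here are illustrative; it omits the bias correction, gradient clipping, and the no_prox variant that the real implementation handles):

```python
def adan_step(p, g, g_prev, m, v, n, lr=0.001,
              betas=(0.98, 0.92, 0.99), eps=1e-8, weight_decay=0.0):
    """One simplified Adan update for a single scalar parameter.

    Illustrative only: omits bias correction, gradient clipping, and
    the no_prox branch of the actual optimizer.
    """
    beta1, beta2, beta3 = betas
    diff = g - g_prev                    # gradient difference g_k - g_{k-1}
    m = beta1 * m + (1 - beta1) * g      # EMA of gradients
    v = beta2 * v + (1 - beta2) * diff   # EMA of gradient differences
    u = g + beta2 * diff                 # Nesterov-style combined gradient
    n = beta3 * n + (1 - beta3) * u * u  # EMA of squared combined gradient
    step = (m + beta2 * v) / (n ** 0.5 + eps)
    p = (p - lr * step) / (1 + lr * weight_decay)  # proximal weight decay
    return p, m, v, n

# Minimise f(x) = x**2 (gradient 2x) for a few steps:
x, m, v, n, g_prev = 5.0, 0.0, 0.0, 0.0, 0.0
for _ in range(20):
    g = 2 * x
    x, m, v, n = adan_step(x, g, g_prev, m, v, n, lr=0.1)
    g_prev = g
print(0.0 < x < 5.0)  # the iterate has moved toward the minimum
```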

lmflow.optim.adan._single_tensor_adan(params: List[torch.Tensor], grads: List[torch.Tensor], exp_avgs: List[torch.Tensor], exp_avg_sqs: List[torch.Tensor], exp_avg_diffs: List[torch.Tensor], pre_grads: List[torch.Tensor], *, beta1: float, beta2: float, beta3: float, bias_correction1: float, bias_correction2: float, bias_correction3_sqrt: float, lr: float, weight_decay: float, eps: float, no_prox: bool, clip_global_grad_norm: torch.Tensor)[source]#
lmflow.optim.adan._multi_tensor_adan(params: List[torch.Tensor], grads: List[torch.Tensor], exp_avgs: List[torch.Tensor], exp_avg_sqs: List[torch.Tensor], exp_avg_diffs: List[torch.Tensor], pre_grads: List[torch.Tensor], *, beta1: float, beta2: float, beta3: float, bias_correction1: float, bias_correction2: float, bias_correction3_sqrt: float, lr: float, weight_decay: float, eps: float, no_prox: bool, clip_global_grad_norm: torch.Tensor)[source]#
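Both private helpers take the same arguments and perform the same update; they differ only in execution strategy, selected by the foreach flag: _single_tensor_adan loops over the parameter list one tensor at a time, while _multi_tensor_adan batches the arithmetic across the whole list (via fused multi-tensor operations), which cuts Python-loop and kernel-launch overhead. The structural difference can be sketched in plain Python, with the full Adan update reduced to a placeholder (all names here are illustrative, not the module's internals):

```python
from typing import List

def _update(p: float, g: float, lr: float) -> float:
    # stand-in for the full Adan update on one parameter
    return p - lr * g

def single_tensor_style(params: List[float], grads: List[float],
                        lr: float) -> List[float]:
    # one pass per parameter, mirroring _single_tensor_adan's per-tensor loop
    return [_update(p, g, lr) for p, g in zip(params, grads)]

def multi_tensor_style(params: List[float], grads: List[float],
                       lr: float) -> List[float]:
    # one batched pass over the whole list, emulating the fused
    # multi-tensor (_foreach_-style) calls the foreach path would issue
    scaled = [lr * g for g in grads]                 # scale all grads at once
    return [p - s for p, s in zip(params, scaled)]  # subtract all at once

# Both strategies produce identical results:
params, grads = [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]
assert single_tensor_style(params, grads, 0.1) == multi_tensor_style(params, grads, 0.1)
```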