lmflow.optim.adamw_schedule_free#

Classes#

AdamWScheduleFree

Schedule-Free AdamW

Module Contents#

class lmflow.optim.adamw_schedule_free.AdamWScheduleFree(params, lr=0.0025, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, warmup_steps=0, r=0.0, weight_lr_power=2.0, foreach=hasattr(torch, '_foreach_mul_'))[source]#

Bases: torch.optim.Optimizer

Schedule-Free AdamW. As the name suggests, no scheduler is needed with this optimizer. To add warmup, rather than using a learning rate schedule, simply set the warmup_steps parameter.

This optimizer requires that .train() and .eval() be called before the beginning of training and evaluation respectively. The optimizer should also be placed in eval mode when saving checkpoints.
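A minimal usage sketch of the required train/eval switching, assuming LMFlow is installed; the toy model, synthetic data, and checkpoint path below are placeholders, not part of the library:

import torch
from lmflow.optim.adamw_schedule_free import AdamWScheduleFree

# Toy model and synthetic data purely for illustration.
model = torch.nn.Linear(10, 1)
data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

optimizer = AdamWScheduleFree(model.parameters(), lr=0.0025, warmup_steps=100)

for epoch in range(3):
    optimizer.train()   # optimizer must be in train mode before training starts
    model.train()
    for x, y in data:
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()

    optimizer.eval()    # switch to eval mode before evaluating or saving checkpoints
    model.eval()
    torch.save(model.state_dict(), "checkpoint.pt")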

eval()[source]#
train()[source]#
step(closure=None)[source]#

Performs a single optimization step.

Arguments:

closure (callable, optional): A closure that reevaluates the model and returns the loss.
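The closure argument follows the standard PyTorch optimizer convention; a brief sketch, reusing the toy model and optimizer from the example above with illustrative input and target tensors:

inputs, targets = torch.randn(4, 10), torch.randn(4, 1)

def closure():
    # Recompute the loss and gradients so the optimizer can re-evaluate the model.
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    return loss

loss = optimizer.step(closure)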