lmflow.optim.sgd_schedule_free#
Classes#
SGDScheduleFree | Schedule-Free SGD
Module Contents#
- class lmflow.optim.sgd_schedule_free.SGDScheduleFree(params, lr=1.0, momentum=0.9, weight_decay=0, warmup_steps=0, r=0.0, weight_lr_power=2, foreach=hasattr(torch, '_foreach_mul_'))[source]#
Bases: torch.optim.Optimizer
Schedule-Free SGD. As the name suggests, no scheduler is needed with this optimizer. To add warmup, set the warmup_steps parameter instead of using a learning rate schedule.
This optimizer requires that .train() and .eval() be called before the beginning of training and evaluation, respectively. The optimizer should also be placed in eval mode when saving checkpoints.
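A minimal usage sketch is shown below, illustrating the .train()/.eval() protocol and the warmup_steps parameter described above. The model, data, and checkpoint path are illustrative placeholders and not part of lmflow.

```python
import torch
from lmflow.optim.sgd_schedule_free import SGDScheduleFree

# Placeholder model and hyperparameters for illustration only.
model = torch.nn.Linear(10, 2)
optimizer = SGDScheduleFree(
    model.parameters(),
    lr=1.0,
    momentum=0.9,
    warmup_steps=100,  # warmup is configured here instead of via an LR scheduler
)

# Put the optimizer in train mode before the training loop begins.
optimizer.train()
for step in range(1000):
    inputs = torch.randn(8, 10)
    targets = torch.randint(0, 2, (8,))
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Switch to eval mode before evaluation and before saving checkpoints.
optimizer.eval()
with torch.no_grad():
    val_inputs = torch.randn(8, 10)
    val_targets = torch.randint(0, 2, (8,))
    val_loss = torch.nn.functional.cross_entropy(model(val_inputs), val_targets)

torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    "checkpoint.pt",  # placeholder path
)
```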