lmflow.optim.sophia
Classes
SophiaG | Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training.
Module Contents
- class lmflow.optim.sophia.SophiaG(params, lr=0.0001, betas=(0.965, 0.99), rho=0.04, weight_decay=0.1, *, maximize: bool = False, capturable: bool = False)
Bases: torch.optim.optimizer.Optimizer
Implements the optimizer described in "Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training". Code from: Liuhong99/Sophia
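To make the roles of `lr`, `betas`, `rho`, and `weight_decay` more concrete, here is a minimal pure-Python sketch of a Sophia-style update for a single scalar parameter. It is an illustration based on the general description of the Sophia algorithm, not the code in `lmflow.optim.sophia`. The function names `sophia_step` and `update_hessian_ema`, the `eps` constant, and the scalar formulation are assumptions made for this example, and the actual implementation may scale the denominator differently (for example by batch size).

```python
def sophia_step(theta, grad, m, h, lr=1e-4, beta1=0.965, rho=0.04,
                weight_decay=0.1, eps=1e-15):
    """One Sophia-style update for a single scalar parameter (hypothetical helper).

    theta: current parameter value
    grad:  current gradient
    m:     running average of gradients (first moment)
    h:     running average of a diagonal Hessian estimate
    """
    # Exponential moving average of the gradient, controlled by betas[0].
    m = beta1 * m + (1.0 - beta1) * grad
    # Decoupled weight decay, applied directly to the parameter.
    theta = theta * (1.0 - lr * weight_decay)
    # Curvature-scaled step size, clipped so it never exceeds 1.
    ratio = min(abs(m) / (rho * h + eps), 1.0)
    # Sign of the momentum term: +1, 0, or -1.
    sign = (m > 0) - (m < 0)
    theta = theta - lr * sign * ratio
    return theta, m


def update_hessian_ema(h, hess_estimate, beta2=0.99):
    """Moving average of the Hessian estimate, controlled by betas[1] (hypothetical helper)."""
    return beta2 * h + (1.0 - beta2) * hess_estimate


# With no curvature information (h = 0) the step is clipped to its maximum size.
theta_flat, m_flat = sophia_step(theta=1.0, grad=1.0, m=0.0, h=0.0)
# With a large curvature estimate the same gradient produces a much smaller step.
theta_sharp, m_sharp = sophia_step(theta=1.0, grad=1.0, m=0.0, h=100.0)
```

In this sketch a larger `rho` shrinks the step in every direction where the clip is not active, while the clip itself bounds the per-coordinate change by `lr` regardless of how small the Hessian estimate is.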