lmflow.optim.sophia

Classes

SophiaG

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training.

Module Contents

class lmflow.optim.sophia.SophiaG(params, lr=0.0001, betas=(0.965, 0.99), rho=0.04, weight_decay=0.1, *, maximize: bool = False, capturable: bool = False)

Bases: torch.optim.optimizer.Optimizer

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training. Code from: Liuhong99/Sophia

__setstate__(state)
update_hessian()
step(closure=None, bs=5120)
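
The listing above gives only signatures, so here is a minimal pure-Python sketch of the kind of update that `update_hessian()` and `step()` are expected to perform, assuming the class follows the clipped update rule from the Sophia paper. It works on a single scalar parameter and is not the lmflow implementation: the helper names `update_hessian_ema` and `sophia_step` and the `eps` value are assumptions made for illustration, while the default values are taken from the signatures above.

```python
# Illustrative sketch of a Sophia-G style update for ONE scalar parameter.
# NOT the lmflow code: function names and `eps` are assumptions.


def update_hessian_ema(hessian, grad, beta2=0.99):
    """Exponential moving average of squared gradients.

    Assumed role of update_hessian(): maintain a diagonal Hessian estimate.
    """
    return beta2 * hessian + (1.0 - beta2) * grad * grad


def sophia_step(param, grad, exp_avg, hessian, lr=1e-4, beta1=0.965,
                rho=0.04, weight_decay=0.1, bs=5120, eps=1e-15):
    """Return (new_param, new_exp_avg) after one clipped update."""
    # Momentum: exponential moving average of the gradient.
    exp_avg = beta1 * exp_avg + (1.0 - beta1) * grad
    # Decoupled weight decay applied directly to the parameter.
    param = param * (1.0 - lr * weight_decay)
    # Momentum divided by the scaled Hessian estimate, clipped to at most 1.
    ratio = min(abs(exp_avg) / (rho * bs * hessian + eps), 1.0)
    # Sign of the momentum term: -1, 0, or 1.
    sign = (exp_avg > 0) - (exp_avg < 0)
    return param - lr * sign * ratio, exp_avg


if __name__ == "__main__":
    h = update_hessian_ema(0.0, grad=2.0)
    p, m = sophia_step(param=1.0, grad=1.0, exp_avg=0.0, hessian=h)
    print(h, p, m)
```

When the Hessian estimate is near zero, the ratio saturates at 1 and the step reduces to a sign update of size `lr`. A larger estimate shrinks the step. In the Sophia paper the Hessian estimate is refreshed only every few iterations, which is presumably why `update_hessian()` is exposed separately from `step()`.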