lmflow.pipeline.finetuner#

The Finetuner class simplifies finetuning a language model: it runs the finetuning process on a TunableModel instance with a given dataset.

Attributes#

Classes#

Finetuner

Initializes the Finetuner class with the given arguments.

Module Contents#

lmflow.pipeline.finetuner.logger[source]#
class lmflow.pipeline.finetuner.Finetuner(model_args: lmflow.args.ModelArguments, data_args: lmflow.args.DatasetArguments, finetuner_args: lmflow.args.FinetunerArguments, *args, **kwargs)[source]#

Bases: lmflow.pipeline.base_tuner.BaseTuner

Initializes the Finetuner class with the given arguments.

Parameters:
model_args : ModelArguments object.

Contains the arguments required to load the model.

data_args : DatasetArguments object.

Contains the arguments required to load the dataset.

finetuner_args : FinetunerArguments object.

Contains the arguments required to perform finetuning.

args : Optional.

Positional arguments.

kwargs : Optional.

Keyword arguments.

model_args[source]#
data_args[source]#
finetuner_args[source]#
last_checkpoint = None[source]#
group_text(tokenized_datasets, model_max_length)[source]#

Groups texts together to form blocks of maximum length model_max_length and returns the processed data as a dictionary.
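The grouping step described above can be illustrated with a minimal, standalone sketch. This is not LMFlow's implementation: the function name group_texts_sketch, the block_size parameter, and the choice to drop the trailing remainder are assumptions modeled on the common "concatenate, then chunk" pattern for causal language modeling. How group_text derives its block size from model_max_length is not shown here.

```python
from itertools import chain


def group_texts_sketch(examples, block_size):
    """Concatenate tokenized sequences and split them into fixed-size blocks.

    `examples` maps a column name (e.g. "input_ids") to a list of token lists.
    Hypothetical helper for illustration only, not part of LMFlow.
    """
    if not examples:
        return {}

    # Flatten every column into one long token stream.
    concatenated = {
        key: list(chain.from_iterable(sequences))
        for key, sequences in examples.items()
    }

    # Assumption: tokens that do not fill a complete block are dropped.
    total_length = len(next(iter(concatenated.values())))
    total_length = (total_length // block_size) * block_size

    # Cut each column into consecutive blocks of `block_size` tokens.
    return {
        key: [tokens[i:i + block_size] for i in range(0, total_length, block_size)]
        for key, tokens in concatenated.items()
    }


batch = {
    "input_ids": [[1, 2, 3], [4, 5], [6, 7, 8, 9]],
    "attention_mask": [[1, 1, 1], [1, 1], [1, 1, 1, 1]],
}
blocks = group_texts_sketch(batch, block_size=4)
print(blocks["input_ids"])  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```

In the sketch, the ninth token is discarded because it cannot fill a complete block; an implementation could instead pad or keep the short tail.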

create_customized_optimizer(base_trainer_class, model_args)[source]#
tune(model: lmflow.models.hf_decoder_model.HFDecoderModel | lmflow.models.hf_text_regression_model.HFTextRegressionModel | lmflow.models.hf_encoder_decoder_model.HFEncoderDecoderModel, dataset: lmflow.datasets.dataset.Dataset, transform_dataset_in_place=True, data_collator=None)[source]#

Performs tuning for a model.

Parameters:
model : TunableModel object.

The TunableModel to tune.

dataset : Dataset object.

The dataset used to train the model.
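A hedged usage sketch of how the constructor and tune might fit together. The Finetuner and tune signatures follow this page; the wrapper function run_finetuning is hypothetical, and the assumption that Dataset and HFDecoderModel can be constructed directly from data_args and model_args is not confirmed here, so check those classes' own documentation before relying on it.

```python
def run_finetuning(model_args, data_args, finetuner_args):
    """Finetune a decoder model with LMFlow's Finetuner (illustrative sketch).

    Expects already-populated ModelArguments, DatasetArguments and
    FinetunerArguments objects from lmflow.args.
    """
    # Imports are deferred so this module loads even without LMFlow installed.
    from lmflow.datasets.dataset import Dataset
    from lmflow.models.hf_decoder_model import HFDecoderModel
    from lmflow.pipeline.finetuner import Finetuner

    # Assumption: Dataset accepts the DatasetArguments object directly.
    dataset = Dataset(data_args)
    # Assumption: HFDecoderModel accepts the ModelArguments object directly.
    model = HFDecoderModel(model_args)

    # Signature as documented on this page.
    finetuner = Finetuner(model_args, data_args, finetuner_args)

    # transform_dataset_in_place and data_collator keep their defaults.
    return finetuner.tune(model=model, dataset=dataset)
```

The sketch passes through whatever tune returns, since the return value is not documented on this page.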