lmflow.pipeline.inferencer#
The Inferencer class simplifies the process of model inference.
Classes#
- Inferencer: Initializes the Inferencer class with given arguments.
- SpeculativeInferencer: Speculative decoding inferencer. Ref: [arXiv:2211.17192v2](https://arxiv.org/abs/2211.17192)
- ToolInferencer: Initializes the ToolInferencer class with given arguments.
Module Contents#
- class lmflow.pipeline.inferencer.Inferencer(model_args, data_args, inferencer_args)[source]#
Bases:
lmflow.pipeline.base_pipeline.BasePipeline
Initializes the Inferencer class with given arguments.
- Parameters:
- model_args : ModelArguments object.
Contains the arguments required to load the model.
- data_args : DatasetArguments object.
Contains the arguments required to load the dataset.
- inferencer_args : InferencerArguments object.
Contains the arguments required to perform inference.
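A hypothetical construction sketch. It assumes the argument dataclasses live in lmflow.args, as in recent LMFlow releases; the field values shown are illustrative, not required:

```python
from lmflow.args import ModelArguments, DatasetArguments, InferencerArguments
from lmflow.pipeline.inferencer import Inferencer

model_args = ModelArguments(model_name_or_path="gpt2")  # illustrative model
data_args = DatasetArguments(dataset_path=None)
inferencer_args = InferencerArguments()

inferencer = Inferencer(model_args, data_args, inferencer_args)
```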
- create_dataloader(dataset: lmflow.datasets.dataset.Dataset)[source]#
Batch the dataset ("batchlize" it) and format it as a dataloader.
- Args:
dataset (Dataset): the dataset object.
- Output:
dataloader (batchlize): the dataloader object.
dataset_size (int): the length of the dataset.
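"Batchlize" here amounts to chunking the examples into fixed-size batches; a minimal sketch of the idea, not the library's exact helper:

```python
def batchlize(examples: list, batch_size: int) -> list:
    """Chunk a list of examples into consecutive batches."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]

dataloader = batchlize(list(range(10)), batch_size=4)
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
dataset_size = 10
```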
- inference(model, dataset: lmflow.datasets.dataset.Dataset, max_new_tokens: int = 100, temperature: float = 0.0, prompt_structure: str = '{input}', remove_image_flag: bool = False, chatbot_type: str = 'mini_gpt')[source]#
Perform inference with a model.
- Parameters:
- model : TunableModel object.
The TunableModel to perform inference with.
- dataset : Dataset object.
The dataset to run inference on.
- Returns:
- output_dataset : Dataset object.
The dataset containing the generated outputs.
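For reference, a hypothetical end-to-end call. It assumes `model` is loaded via lmflow.models.auto_model.AutoModel.get_model and `dataset` via lmflow.datasets.dataset.Dataset; both entry points may differ across LMFlow versions:

```python
from lmflow.models.auto_model import AutoModel
from lmflow.datasets.dataset import Dataset

model = AutoModel.get_model(model_args)  # assumed loader entry point
dataset = Dataset(data_args)

output_dataset = inferencer.inference(
    model,
    dataset,
    max_new_tokens=100,
    temperature=0.0,                               # 0.0 -> deterministic decoding
    prompt_structure="Question: {input} Answer:",  # {input} is filled per example
)
```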
- class lmflow.pipeline.inferencer.SpeculativeInferencer(model_args, draft_model_args, data_args, inferencer_args)[source]#
Bases:
Inferencer
Speculative decoding inferencer. Ref: [arXiv:2211.17192v2](https://arxiv.org/abs/2211.17192)
- Parameters:
- model_args : ModelArguments object.
Contains the arguments required to load the target model.
- draft_model_args : ModelArguments object.
Contains the arguments required to load the draft model.
- data_args : DatasetArguments object.
Contains the arguments required to load the dataset.
- inferencer_args : InferencerArguments object.
Contains the arguments required to perform inference.
- static score_to_prob(scores: torch.Tensor, temperature: float = 0.0, top_p: float = 1.0) torch.Tensor [source]#
Convert raw scores (logits, not yet softmaxed) to probabilities, with support for temperature scaling, top-p (nucleus) filtering, and argmax.
- Parameters:
- scores : torch.Tensor
Raw input scores (logits).
- temperature : float, optional
Temperature for controlling randomness. Higher values make the distribution more uniform; lower values make it peakier. When temperature <= 1e-6, argmax is used instead of sampling. Defaults to 0.0.
- top_p : float, optional
Top-p sampling parameter controlling the cumulative probability threshold. Defaults to 1.0 (no threshold).
- Returns:
- torch.Tensor
Probability distribution after the adjustments.
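A minimal sketch of the standard temperature + nucleus (top-p) transform this method describes; it illustrates the technique and is not LMFlow's exact implementation:

```python
import torch

def score_to_prob(scores: torch.Tensor, temperature: float = 0.0, top_p: float = 1.0) -> torch.Tensor:
    if temperature <= 1e-6:
        # Degenerate temperature: return a one-hot (argmax) distribution.
        one_hot = torch.zeros_like(scores)
        one_hot.scatter_(-1, scores.argmax(dim=-1, keepdim=True), 1.0)
        return one_hot
    probs = torch.softmax(scores / temperature, dim=-1)
    if top_p < 1.0:
        sorted_probs, sorted_idx = probs.sort(dim=-1, descending=True)
        cumulative = sorted_probs.cumsum(dim=-1)
        # Zero out tokens outside the nucleus; always keep the top-1 token.
        sorted_probs[cumulative - sorted_probs > top_p] = 0.0
        probs = torch.zeros_like(probs).scatter_(-1, sorted_idx, sorted_probs)
        probs = probs / probs.sum(dim=-1, keepdim=True)  # renormalize
    return probs
```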
- static sample(prob: torch.Tensor, num_samples: int = 1) Dict [source]#
Sample from a tensor of probabilities.
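In PyTorch this is typically torch.multinomial; a minimal sketch (the exact Dict layout the method returns is an assumption):

```python
import torch

probs = torch.tensor([[0.1, 0.6, 0.3]])
sampled = torch.multinomial(probs, num_samples=1)    # e.g. tensor([[1]])
result = {"sampled_token": sampled,
          "sampled_prob": probs.gather(-1, sampled)}  # assumed Dict layout
```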
- static predict_next_token(model: lmflow.models.hf_decoder_model.HFDecoderModel, input_ids: torch.Tensor, num_new_tokens: int = 1)[source]#
Predict the next num_new_tokens tokens given input_ids.
- autoregressive_sampling(input_ids: torch.Tensor, model: lmflow.models.hf_decoder_model.HFDecoderModel, temperature: float = 0.0, num_new_tokens: int = 5) Dict [source]#
Standard autoregressive sampling, the baseline that speculative decoding accelerates. Ref: [arXiv:2211.17192v2](https://arxiv.org/abs/2211.17192), Section 2.2
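A minimal sketch of the autoregressive loop, where `model` is a hypothetical callable returning a next-token distribution of shape (1, vocab_size); LMFlow's method operates on an HFDecoderModel instead:

```python
import torch

def autoregressive_sampling(input_ids: torch.Tensor, model, num_new_tokens: int = 5) -> torch.Tensor:
    for _ in range(num_new_tokens):
        probs = model(input_ids)                   # next-token distribution
        next_token = torch.multinomial(probs, 1)   # sample one token per step
        input_ids = torch.cat([input_ids, next_token], dim=-1)
    return input_ids
```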
- inference(model: lmflow.models.hf_decoder_model.HFDecoderModel, draft_model: lmflow.models.hf_decoder_model.HFDecoderModel, input: str, temperature: float = 0.0, gamma: int = 5, max_new_tokens: int = 100)[source]#
Perform speculative inference with a target model and a draft model.
- Parameters:
- model : HFDecoderModel object.
The target model, used to verify the tokens proposed by the draft model.
- draft_model : HFDecoderModel object.
The draft model, which provides cheap approximations of the target model's outputs.
- input : str.
The input text (i.e., the prompt) for the model.
- temperature : float.
Temperature for sampling; 0.0 means argmax decoding.
- gamma : int.
The number of tokens generated by the draft model in each iteration.
- max_new_tokens : int.
The maximum number of tokens to be generated by the target model.
- Returns:
- output : str.
The output text generated by the model.
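The method follows the draft-then-verify loop of arXiv:2211.17192. A conceptual sketch of one iteration, not LMFlow's implementation: `draft` and `target` are hypothetical callables returning next-token distributions of shape (1, vocab):

```python
import torch

def speculative_step(input_ids: torch.Tensor, draft, target, gamma: int = 5) -> torch.Tensor:
    # 1) The draft model proposes `gamma` tokens autoregressively (cheap).
    proposed, draft_dists = [], []
    ids = input_ids
    for _ in range(gamma):
        q = draft(ids)                   # (1, vocab) distribution
        tok = torch.multinomial(q, 1)
        proposed.append(tok)
        draft_dists.append(q)
        ids = torch.cat([ids, tok], dim=-1)
    # 2) The target model verifies each proposal, accepting token t with
    #    probability min(1, p(t) / q(t)); on rejection, resample from the
    #    residual distribution max(0, p - q) and stop.
    accepted = input_ids
    for tok, q in zip(proposed, draft_dists):
        p = target(accepted)             # (1, vocab) distribution
        ratio = (p[0, tok] / q[0, tok]).item()
        if torch.rand(1).item() < min(1.0, ratio):
            accepted = torch.cat([accepted, tok], dim=-1)
        else:
            residual = torch.clamp(p - q, min=0.0)
            resampled = torch.multinomial(residual / residual.sum(), 1)
            accepted = torch.cat([accepted, resampled], dim=-1)
            break
    # The full algorithm also samples one bonus token from the target model
    # when all gamma proposals are accepted; omitted here for brevity.
    return accepted
```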
- class lmflow.pipeline.inferencer.ToolInferencer(model_args, data_args, inferencer_args)[source]#
Bases:
Inferencer
Initializes the ToolInferencer class with given arguments.
- Parameters:
- model_args : ModelArguments object.
Contains the arguments required to load the model.
- data_args : DatasetArguments object.
Contains the arguments required to load the dataset.
- inferencer_args : InferencerArguments object.
Contains the arguments required to perform inference.
- inference(model: lmflow.models.hf_decoder_model.HFDecoderModel, input: str, max_new_tokens: int = 1024)[source]#
Perform inference with a model.
- Parameters:
- model : HFDecoderModel object.
The model to perform inference with.
- input : str.
The input text (i.e., the prompt) for the model.
- max_new_tokens : int.
The maximum number of tokens to be generated by the model.
- Returns:
- output : str.
The output text generated by the model.
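A hypothetical call sketch; the prompt wording is illustrative, and `model` is assumed to be an HFDecoderModel loaded elsewhere:

```python
toolinf = ToolInferencer(model_args, data_args, inferencer_args)
output = toolinf.inference(
    model,
    input="Write Python code that prints the current UTC time.",
    max_new_tokens=1024,
)
print(output)
```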