lmflow.datasets.multi_modal_dataset#

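This module defines `CustomMultiModalDataset`, a dataset class for multimodal (text and image) data, together with a data collator and preprocessing helper functions.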

Classes#

`CustomMultiModalDataset`	Dataset for Multi Modal data
`DataCollatorForSupervisedDataset`	Collate examples for supervised fine-tuning.

Functions#

`preprocess_multimodal_llava`(sources, data_args)
`tokenizer_image_token`(prompt, tokenizer[, ...])
`preprocess_llama_from_llava_plain`(sources, tokenizer[, ...])	This function just adds the image in front of the text.
`preprocess_llama_from_llava_v1`(sources, tokenizer[, ...])	This function adds the prompt and then puts the image after the prompt.

Module Contents#

class lmflow.datasets.multi_modal_dataset.CustomMultiModalDataset(dataset_path: str, data_args: lmflow.args.DatasetArguments)[source]#

Bases: torch.utils.data.Dataset

Dataset for Multi Modal data

data_dict[source]#

data_args[source]#

image_folder[source]#

__len__()[source]#

register_tokenizer(tokenizer, image_processor=None)[source]#

__getitem__(i)[source]#

lmflow.datasets.multi_modal_dataset.preprocess_multimodal_llava(sources, data_args)[source]#

lmflow.datasets.multi_modal_dataset.tokenizer_image_token(prompt, tokenizer, image_token_index=IMAGE_TOKEN_INDEX, return_tensors=None)[source]#

lmflow.datasets.multi_modal_dataset.preprocess_llama_from_llava_plain(sources, tokenizer: transformers.PreTrainedTokenizer, has_image: bool = False)[source]#

This function just adds the image in front of the text and does not add any prompt.

Args:

sources: The input data with text and image.
tokenizer: The tokenizer to process text.
has_image: Whether the input data has an image.

Returns: The input_ids and labels for the model.
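The "image in front of the text, no prompt" layout can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: `toy_tokenize` and `preprocess_plain_sketch` are hypothetical names, the numeric values of `IMAGE_TOKEN_INDEX` and `IGNORE_INDEX` are assumptions, and excluding the image position from the loss is likewise an assumption about how the labels are built.

```python
IGNORE_INDEX = -100       # assumed label value that the loss ignores
IMAGE_TOKEN_INDEX = -200  # assumed placeholder id standing in for the image


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one positive id per whitespace-separated word."""
    return [sum(ord(c) for c in word) % 1000 + 1 for word in text.split()]


def preprocess_plain_sketch(caption):
    # The image placeholder goes first, followed by the caption tokens.
    # No prompt or system message is added.
    input_ids = [IMAGE_TOKEN_INDEX] + toy_tokenize(caption)
    # Labels mirror input_ids, except the image position is excluded from the loss.
    labels = [IGNORE_INDEX] + input_ids[1:]
    return {"input_ids": input_ids, "labels": labels}


example = preprocess_plain_sketch("a cat on a mat")
```

In this sketch the model is trained to predict only the caption tokens, since the image slot carries no text target.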

lmflow.datasets.multi_modal_dataset.preprocess_llama_from_llava_v1(sources, tokenizer: transformers.PreTrainedTokenizer, has_image: bool = False)[source]#

This function adds the prompt and then puts the image after the prompt, so it needs additional code to generate the target labels.

Args:

sources: The input data with text and image.
tokenizer: The tokenizer to process text.
has_image: Whether the input data has an image.

Returns: The input_ids and labels for the model.
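Why a prompt requires extra label handling can be sketched as follows. This is an illustration under stated assumptions, not the function's actual code: `toy_tokenize` and `preprocess_v1_sketch` are hypothetical, the constant values are assumed, and masking everything before the answer is one common way to build targets for supervised fine-tuning.

```python
IGNORE_INDEX = -100       # assumed label value that the loss ignores
IMAGE_TOKEN_INDEX = -200  # assumed placeholder id standing in for the image


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one positive id per whitespace-separated word."""
    return [sum(ord(c) for c in word) % 1000 + 1 for word in text.split()]


def preprocess_v1_sketch(prompt, question, answer):
    # The prompt comes first, the image placeholder follows it, then the question.
    context_ids = toy_tokenize(prompt) + [IMAGE_TOKEN_INDEX] + toy_tokenize(question)
    answer_ids = toy_tokenize(answer)
    input_ids = context_ids + answer_ids
    # Only the answer contributes to the loss; prompt, image, and question are masked.
    labels = [IGNORE_INDEX] * len(context_ids) + answer_ids
    return {"input_ids": input_ids, "labels": labels}


example = preprocess_v1_sketch(
    prompt="You are a helpful assistant",
    question="What is shown here",
    answer="A cat on a mat",
)
```

Compared with the plain variant, the mask here has to cover a variable-length context, which is why the target labels cannot be a simple shifted copy of the inputs.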

class lmflow.datasets.multi_modal_dataset.DataCollatorForSupervisedDataset[source]#

Collate examples for supervised fine-tuning.

tokenizer: transformers.PreTrainedTokenizer[source]#

__call__(instances)[source]#
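A collator for supervised fine-tuning typically pads variable-length examples to a common length. The sketch below shows that idea with plain Python lists; it is an assumption about the general technique, not the behavior of this class, which most likely operates on tensors. `collate_sketch`, `pad_token_id`, and the `attention_mask` key are hypothetical names chosen for illustration.

```python
IGNORE_INDEX = -100  # assumed label value that the loss ignores


def collate_sketch(instances, pad_token_id=0):
    """Pad each instance to the longest sequence in the batch (illustrative only)."""
    max_len = max(len(inst["input_ids"]) for inst in instances)
    batch = {"input_ids": [], "labels": [], "attention_mask": []}
    for inst in instances:
        n_real = len(inst["input_ids"])
        n_pad = max_len - n_real
        # Inputs are padded with the pad id, labels with IGNORE_INDEX
        # so that padded positions never contribute to the loss.
        batch["input_ids"].append(inst["input_ids"] + [pad_token_id] * n_pad)
        batch["labels"].append(inst["labels"] + [IGNORE_INDEX] * n_pad)
        # The mask marks real tokens with 1 and padding with 0.
        batch["attention_mask"].append([1] * n_real + [0] * n_pad)
    return batch


batch = collate_sketch([
    {"input_ids": [5, 6, 7], "labels": [IGNORE_INDEX, 6, 7]},
    {"input_ids": [8], "labels": [8]},
])
```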
