lmflow.utils.flash_attention.bloom_flash_attention#
Functions#
- forward(self, hidden_states, residual, alibi, attention_mask, layer_past=None, head_mask=None, use_cache=False, output_attentions=False)
Module Contents#
- lmflow.utils.flash_attention.bloom_flash_attention.forward(self, hidden_states: torch.Tensor, residual: torch.Tensor, alibi: torch.Tensor, attention_mask: torch.Tensor, layer_past: Tuple[torch.Tensor, torch.Tensor] | None = None, head_mask: torch.Tensor | None = None, use_cache: bool = False, output_attentions: bool = False)[source]#
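This `forward` matches the signature of `BloomAttention.forward` in Hugging Face Transformers, so it can be swapped in as a drop-in replacement before model weights are loaded. A minimal sketch of that monkey-patching pattern is below; the stand-in `BloomAttention` class and the helper name `replace_bloom_attn_with_flash_attn` are illustrative, not the actual LMFlow API.

```python
# Sketch of the monkey-patch pattern used to install a flash-attention
# forward on an attention class. `BloomAttention` here is a stand-in for
# transformers.models.bloom.modeling_bloom.BloomAttention, and the helper
# name is hypothetical.

class BloomAttention:
    """Stand-in for the Transformers Bloom attention layer."""

    def forward(self, hidden_states):
        return "standard attention"


def flash_forward(self, hidden_states):
    """Stand-in for lmflow.utils.flash_attention.bloom_flash_attention.forward."""
    return "flash attention"


def replace_bloom_attn_with_flash_attn():
    # Rebind the class-level forward so every instance created afterwards
    # (and existing instances) dispatches to the flash-attention version.
    BloomAttention.forward = flash_forward


replace_bloom_attn_with_flash_attn()
print(BloomAttention().forward(None))  # prints "flash attention"
```

The patch must run before the model is instantiated in frameworks that cache bound methods, which is why LMFlow-style integrations apply it at import/setup time.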