Qwen 'padding_side = right' problem

You are attempting to perform batched generation with padding_side='right', this may lead to unexpected behaviour for Flash Attention version of Qwen2. Make sure to call `tokenizer.padding_side = 'left'` before tokenizing the input.

This is the error I get, even though `padding_side` is already set to 'left'.
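
For reference, here is a minimal sketch of the intended setup (the checkpoint name is just an example; use whichever Qwen2 model you load), with left padding set before tokenizing the batch:

    from transformers import AutoTokenizer

    # Checkpoint name is an example, not prescriptive.
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B-Instruct")
    tokenizer.padding_side = "left"  # must be set BEFORE tokenizing the batch

    batch = tokenizer(
        ["Hello, how are you?", "Write a haiku about autumn."],
        return_tensors="pt",
        padding=True,
    )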

Temp solution (if `tokenizer.padding_side = 'left'` didn't work):
=============> find `transformers/models/qwen2/modeling_qwen2.py`
and change or comment out the check around line ~622:

    if self.config._attn_implementation == "flash_attention_2":
        if attention_mask is not None and past_key_values is not None:
            is_padding_right = attention_mask[:, -1].sum().item() != input_tensor.size()[0]
            if is_padding_right:
                pass  # skip the check instead of raising (note: 'continue' would be a SyntaxError here, there is no loop)
                # raise ValueError(
                #     "You are attempting to perform batched generation with padding_side='right'"
                #     " this may lead to unexpected behaviour for Flash Attention version of Qwen2. Make sure to "
                #     " call `tokenizer.padding_side  = 'left'` before tokenizing the input. "
                # )
        if attention_mask is not None and 0.0 in attention_mask:
            return attention_mask
        return None
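
A less invasive alternative, if you'd rather not edit the installed package: load the model without Flash Attention, since this check only runs when `_attn_implementation == "flash_attention_2"`. A sketch under that assumption (checkpoint name is again an example):

    from transformers import AutoModelForCausalLM

    # "sdpa" (or "eager") avoids the flash_attention_2 branch that
    # raises the padding_side error.
    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2-7B-Instruct",
        attn_implementation="sdpa",
    )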