Clarification on Batch mapping

When batched=True, both examples['final'] and examples['raw'] will be lists of batch_size elements. You may have to use something like:

def tokenizer_func(examples):
    # Build one prompt per raw text in the batch, then tokenize the whole batch at once
    return tokenizer(
        [generate_prompt(raw_text) for raw_text in examples['raw']],
        truncation=True, padding=True, max_length=128, return_tensors="pt"
    )
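
To tie it together, here is a minimal sketch of how the function would be applied with batched mapping. It assumes a datasets.Dataset named dataset with 'raw' and 'final' columns, and that tokenizer and generate_prompt are already defined; the column names and variable names are only illustrative.

from datasets import Dataset

# Toy dataset with the same column layout as in the question (illustrative only)
dataset = Dataset.from_dict({
    "raw": ["first raw text", "second raw text"],
    "final": ["first target", "second target"],
})

# With batched=True, map passes a dict of column name -> list of batch_size values,
# which is exactly the shape tokenizer_func above expects
tokenized = dataset.map(tokenizer_func, batched=True, batch_size=1000)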