There seems to be a type change: the trainer class fails to iterate over the dataset.
  File "/home/transformers_datasets_test/model/trainer.py", line 88, in encode
    return tokenizer(batch["text"], truncation=True, max_length=tokenizer.model_max_length)
KeyError: 'text'
When I inspect batch instead of batch["text"], I get an empty dict {}, which is not expected.
However:
train_dataset["train"][:3]
works as expected outside the trainer class.
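
One possible explanation, offered as an assumption because the post does not show how `encode` is attached to the dataset: if `encode` is registered as an on-the-fly transform (for example via `set_transform`) and the trainer is left at its default `remove_unused_columns=True`, the trainer drops columns that the model's forward signature does not accept before the transform runs. `encode` would then receive a batch with no `text` key, which matches the empty dict seen above. The sketch below only simulates that ordering in plain Python; `LazyDataset`, `drop_unused_columns`, and `fake_tokenizer` are hypothetical stand-ins, not library APIs.

```python
# Plain-Python simulation; none of these names come from transformers or datasets.

def fake_tokenizer(texts):
    # Hypothetical stand-in for a real tokenizer: one list of "ids" per text.
    return {"input_ids": [[len(word) for word in t.split()] for t in texts]}


def encode(batch):
    # Same shape as the failing function: it requires a "text" column.
    return fake_tokenizer(batch["text"])


class LazyDataset:
    """Applies `transform` at access time, like an on-the-fly transform."""

    def __init__(self, columns, transform=None):
        self.columns = columns
        self.transform = transform

    def drop_unused_columns(self, accepted):
        # Mimics pruning columns that the model's forward() would not accept.
        kept = {k: v for k, v in self.columns.items() if k in accepted}
        return LazyDataset(kept, self.transform)

    def __getitem__(self, index):
        batch = {k: v[index] for k, v in self.columns.items()}
        return self.transform(batch) if self.transform else batch


raw = LazyDataset({"text": ["hello world", "a b c", "foo"]}, transform=encode)

# Outside the trainer: the raw column is still present, so encode works.
outside = raw[:3]

# Inside the trainer (assumed): "text" is pruned before the transform runs.
pruned = raw.drop_unused_columns(accepted={"input_ids", "attention_mask", "labels"})
try:
    pruned[:3]
    error = None
except KeyError as exc:
    error = exc  # encode received {} and failed on the missing "text" key
```

If this is what is happening, passing `remove_unused_columns=False` in `TrainingArguments` should keep the raw columns available to the transform. Printing `batch.keys()` at the top of `encode` is a quick way to confirm it.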