Vision Transformer issue with broadcasting shapes


I am using the image transformers from HF, but although my images are RGB, I keep getting the following error:

operands could not be broadcast together with shapes (224,224) (3,) (224,224)

Any idea how I can fix this dynamically? Should I do this in a collate function?


Could you provide a code snippet to reproduce your error?

Are you sure that you called image.convert("RGB") on every image?
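To illustrate why the conversion matters, here is a minimal sketch using only NumPy. It assumes a channel-last normalization step with per-channel statistics of shape (3,); the `normalize` helper and the 0.5 values are placeholders for illustration, not the library's actual code. A (224,224) array has no channel axis, so it cannot broadcast against (3,), whereas a (224,224,3) array can. Replicating the single channel three times is roughly what converting a grayscale image to RGB does.

```python
import numpy as np

# Per-channel statistics, shape (3,). Placeholder values for illustration.
mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.5, 0.5, 0.5])


def normalize(image):
    """Hypothetical channel-last normalization: expects shape (H, W, 3)."""
    return (image - mean) / std


rgb = np.zeros((224, 224, 3))  # three channels: broadcasts against (3,)
gray = np.zeros((224, 224))    # grayscale: no channel axis at all

normalized_rgb = normalize(rgb)

try:
    normalize(gray)
    gray_failed = False
except ValueError as err:
    # NumPy reports that the operands could not be broadcast together
    gray_failed = True
    print(err)

# Replicate the single channel three times so the shapes line up
gray_as_rgb = np.stack([gray] * 3, axis=-1)
normalized_gray = normalize(gray_as_rgb)
print(normalized_gray.shape)
```

If the error appears for only some batches, a few grayscale images mixed into an otherwise RGB dataset would be one plausible cause.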

I added the following portion:

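from datasets import load_dataset

data = load_dataset("venetis/VMMRdb_make_model")

def transforms(examples):
    # Convert every image in the batch to 3-channel RGB
    examples["image"] = [image.convert("RGB") for image in examples["image"]]
    return examples

data = data.map(transforms, batched=True)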

and it seems to produce the following error:
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f0b3a27dd70>
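This error usually means that Pillow could not recognize the bytes as any known image format, which suggests that at least one file in the dataset is corrupt or is not an image. A possible way to track such files down is sketched below. It assumes you can get at the raw bytes of each example (for instance by loading the column with decoding disabled, which is an assumption about your setup and is not shown here); `is_decodable` is a hypothetical helper name.

```python
import io

from PIL import Image, UnidentifiedImageError


def is_decodable(raw_bytes):
    """Return True if Pillow can open the bytes and convert them to RGB."""
    try:
        with Image.open(io.BytesIO(raw_bytes)) as img:
            img.convert("RGB")  # forces a full decode of the pixel data
        return True
    except (UnidentifiedImageError, OSError):
        return False


# Build a small valid PNG in memory to stand in for a good example
buffer = io.BytesIO()
Image.new("L", (8, 8)).save(buffer, format="PNG")
good_bytes = buffer.getvalue()

# Arbitrary bytes standing in for a corrupt file
bad_bytes = b"this is not an image"

print(is_decodable(good_bytes))
print(is_decodable(bad_bytes))
```

A check like this could be used to filter out the undecodable examples before applying the RGB conversion.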