I am working on a vision use case. I have the processor:
processor = ViTImageProcessor.from_pretrained('google/vit-base-patch16-224')
Then I process my dataset as follows:
def apply_processor(example):
    # Convert to RGB (some images may be grayscale/RGBA), then drop the batch dim
    example['pixel_values'] = processor(example['image'].convert("RGB"), return_tensors="pt").pixel_values.squeeze()
    return example
processed_dataset = pet_dataset.map(apply_processor)
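For reference, this is the lazy set_transform variant I am weighing against map (a rough sketch; I am assuming pet_dataset also has a 'label' column):

def transform(batch):
    # set_transform passes a batch (a dict of lists) and runs lazily on access
    inputs = processor([img.convert("RGB") for img in batch['image']], return_tensors="pt")
    inputs['label'] = batch['label']
    return inputs

pet_dataset.set_transform(transform)  # nothing is precomputed or cached to disk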
Given this, should I also pass tokenizer=processor to transformers.Trainer? If not, which is the better option: preprocessing the dataset myself with map/set_transform, or passing tokenizer=processor?
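To make the question concrete, here is roughly what I mean by the tokenizer=processor option (a sketch only; num_labels=37 assumes the Oxford-IIIT Pets classes, and output_dir is a placeholder):

from transformers import ViTForImageClassification, Trainer, TrainingArguments

model = ViTForImageClassification.from_pretrained(
    'google/vit-base-patch16-224',
    num_labels=37,                 # assuming the 37 Oxford-IIIT Pets breeds
    ignore_mismatched_sizes=True,  # replace the 1000-class ImageNet head
)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./vit-pets"),
    train_dataset=processed_dataset,
    tokenizer=processor,           # <- the argument I am asking about
)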
Thanks in advance!