The image processor, tokenizer, and model are defined as follows:
from transformers import (
    AutoImageProcessor,
    VisionEncoderDecoderModel,
    XLMRobertaTokenizerFast,
)

image_processor = AutoImageProcessor.from_pretrained("microsoft/swinv2-tiny-patch4-window8-256")
tokenizer = XLMRobertaTokenizerFast.from_pretrained("FacebookAI/xlm-roberta-base")
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "microsoft/swinv2-tiny-patch4-window8-256",
    "FacebookAI/xlm-roberta-base",
)
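For context, I assume the two preprocessing components could also be wrapped into a single processor object. Here is a minimal sketch, assuming `TrOCRProcessor` accepts an arbitrary image-processor/tokenizer pair (it is a generic container for the two, not something the docs prescribe for this Swin + XLM-R combination):

```python
from transformers import (
    AutoImageProcessor,
    TrOCRProcessor,
    XLMRobertaTokenizerFast,
)

image_processor = AutoImageProcessor.from_pretrained("microsoft/swinv2-tiny-patch4-window8-256")
tokenizer = XLMRobertaTokenizerFast.from_pretrained("FacebookAI/xlm-roberta-base")

# Wrap both components in one object. Using TrOCRProcessor here is an
# assumption on my part; it simply stores an image processor plus a tokenizer.
processor = TrOCRProcessor(image_processor=image_processor, tokenizer=tokenizer)
```

Is passing an object like this `processor` what `processing_class` expects, or should it be the tokenizer alone?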
What should the processing_class argument be for this setup?
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    processing_class=?,
    compute_metrics=compute_cer,
    data_collator=collate_fn,
)