Hi,
Thank you for your work. I really like the idea behind the Transformers and Accelerate libraries.
I am experimenting with TrOCR fine-tuning. Currently I train the model on multiple GPUs but evaluate it on a single GPU, using the following code:
```python
import torch
from tqdm import tqdm

if accelerator.is_main_process:
    unwrapped_model = accelerator.unwrap_model(model).to(accelerator.device)
    unwrapped_model.eval()
    valid_cer = 0.0
    with torch.no_grad():
        for batch in tqdm(eval_dataloader):
            # Autoregressive decoding runs on the main process only
            outputs = unwrapped_model.generate(batch["pixel_values"].to(accelerator.device))
            # compute_cer is my own helper that decodes the ids and computes the character error rate
            cer = compute_cer(pred_ids=outputs, label_ids=batch["labels"])
            valid_cer += cer
    accelerator.print("Validation CER:", valid_cer / len(eval_dataloader))
```
Is it possible to call the generate method on a parallelized model, so that evaluation runs on all GPUs as well?
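For reference, this is roughly what I had in mind. It is only an untested sketch, assuming that generate can be called on the unwrapped model inside every process, and that the variable-length outputs can be padded and gathered with pad_across_processes and gather_for_metrics; processor and compute_cer are names from my own setup above:

```python
# Sketch of multi-GPU evaluation (untested). Each process runs generation on its
# shard of eval_dataloader, then results are gathered for the metric.
model.eval()
valid_cer = 0.0
num_batches = 0
for batch in tqdm(eval_dataloader, disable=not accelerator.is_local_main_process):
    with torch.no_grad():
        outputs = accelerator.unwrap_model(model).generate(batch["pixel_values"])
    # Generated sequences have different lengths per process, so pad before gathering.
    # processor.tokenizer.pad_token_id is the TrOCR pad id from my setup (assumption).
    outputs = accelerator.pad_across_processes(
        outputs, dim=1, pad_index=processor.tokenizer.pad_token_id
    )
    labels = accelerator.pad_across_processes(batch["labels"], dim=1, pad_index=-100)
    # gather_for_metrics drops the samples duplicated by the distributed sampler
    outputs, labels = accelerator.gather_for_metrics((outputs, labels))
    valid_cer += compute_cer(pred_ids=outputs, label_ids=labels)
    num_batches += 1
accelerator.print("Validation CER:", valid_cer / num_batches)
```

Would something like this be the recommended pattern, or is there a better supported way to do it?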