Yep, this works fine as long as we only have a few sentences to process, but in my case, with about 20,000 of them, I quickly run out of memory if I try to pass all the sentence encodings to model()
at once. I could write a for loop around the forward pass to process one sentence at a time, but that doesn't look very performant. The "right" way, I guess, is to run inference on mid-sized batches, which is what Trainer.predict()
does under the hood - so I was being lazy and tried to take advantage of that rather than writing the batching logic myself
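For what it's worth, the manual batching loop isn't much code. Here's a minimal sketch of the kind of thing Trainer.predict() handles for you, assuming a model, a tokenizer, and a list called sentences already exist (all names here are illustrative, and batch_size is just something to tune until it fits in memory):

```python
import torch

batch_size = 32  # tune to whatever fits in memory
all_logits = []

model.eval()
with torch.no_grad():  # no gradients needed for inference
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i : i + batch_size]
        # Tokenize only this batch; padding keeps tensor shapes uniform
        enc = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
        enc = {k: v.to(model.device) for k, v in enc.items()}
        out = model(**enc)
        all_logits.append(out.logits.cpu())  # move results off the GPU as we go

logits = torch.cat(all_logits)
```

Moving each batch's logits back to the CPU before concatenating is what keeps GPU memory flat regardless of how many sentences there are.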