While training a LayoutLM V2 model with a QA head, we noticed that the evaluation loop stops using the GPU and takes hours to complete a single pass.
Any ideas what could be happening here?
Evaluation for question answering takes a long time because the post-processing step (going from the model's predictions to spans of text in the contexts) does not run on the GPU and is slow. That is why GPU usage drops during that phase. If this evaluation runs during training, you should use a smaller validation set.
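As a rough sketch of the "smaller validation set" suggestion, one option is to pick a fixed random subset of example indices once and reuse it for every evaluation during training. The helper name `pick_eval_subset`, the cap of 500 examples, and the seed are all assumptions for illustration, not something from your setup.

```python
import random


def pick_eval_subset(num_examples, max_examples=500, seed=0):
    """Return sorted indices of a reproducible random subset.

    Hypothetical helper: the cap and the seed are illustrative defaults.
    """
    if num_examples <= max_examples:
        # Validation set is already small enough, keep everything.
        return list(range(num_examples))
    rng = random.Random(seed)  # fixed seed so every evaluation sees the same subset
    return sorted(rng.sample(range(num_examples), max_examples))


# Example: a validation set of 10,000 examples reduced to 500.
indices = pick_eval_subset(num_examples=10_000, max_examples=500)

# With the Hugging Face `datasets` library, the subset could then be taken with
# `Dataset.select`; `eval_examples` is a hypothetical name for your raw validation split:
#   small_eval_examples = eval_examples.select(indices)
```

It is probably safest to subsample the raw examples before tokenization, so that the features and the examples the post-processing maps them back to stay aligned. The full validation set can still be used for one final evaluation after training.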