Does Trainer prefetch data?

Hi everyone,

I’m pretty new to this. I’m trying to train a transformer model on a GPU using transformers.Trainer.

I’m doing my prototyping at home on a Windows 10 machine with a 4-core CPU and a GTX 1060. I have my data, model, and trainer all set up, and my dataset is an instance of torch.utils.data.Dataset. Based on what I see in Task Manager, it looks like Trainer might not be prefetching data to keep the GPU busy at all times. Here’s what I see during training:

[Task Manager screenshot: GPU utilization graph during training]

As you can see, GPU usage maxes out around 55%, and cycles down to 0% regularly during training.

Iterating through my dataset on its own takes about a tenth of the time Trainer needs for one epoch, so I don’t think data loading is the bottleneck.
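For context, this is roughly how I timed the iteration. DummyDataset here is just a stand-in for my real dataset (any map-style dataset with `__len__` and `__getitem__` works the same way):

```python
import time

class DummyDataset:
    """Stand-in for a real torch.utils.data.Dataset; substitute your own."""

    def __len__(self):
        return 10_000

    def __getitem__(self, idx):
        # A real dataset would do tokenization / tensor conversion here.
        return {"input_ids": [idx] * 128, "labels": idx % 2}

def time_one_pass(dataset):
    """Iterate the whole dataset once and return elapsed seconds."""
    start = time.perf_counter()
    for i in range(len(dataset)):
        _ = dataset[i]
    return time.perf_counter() - start

elapsed = time_one_pass(DummyDataset())
print(f"one full pass: {elapsed:.3f} s")
```

Comparing that number against the wall-clock time of one training epoch is what led me to (wrongly, it turns out) rule out data loading.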

So, any ideas why I am seeing this type of behavior with my GPU? Is Trainer not prefetching the data for each training step, or is it some other issue with my code?

Update: in case anyone else runs into this, I figured it out.

Trainer does prefetch data — it wraps the dataset in a standard PyTorch DataLoader. The problem was that my data loading was slower than my naive benchmark suggested, and it couldn’t keep up with the GPU.

After optimizing my data loading routine, I’m able to keep the GPU busy constantly.
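For anyone looking for the relevant knobs: Trainer exposes its DataLoader settings through TrainingArguments. This is a sketch, not my exact configuration, and the values shown are illustrative:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=16,
    # Load batches in parallel worker processes. Note: on Windows,
    # multiprocessing workers require your training script to be
    # guarded by `if __name__ == "__main__":`.
    dataloader_num_workers=2,
    # Pinned host memory speeds up CPU-to-GPU transfers.
    dataloader_pin_memory=True,
)
```

The other half of the fix was making `__getitem__` itself cheaper (moving one-time preprocessing out of it), since workers can only parallelize the per-item work you give them.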