Trainer does not use multiple GPUs

I have read many discussions saying that if I use the Trainer API, multi-GPU training happens automatically. But in my case it does not.

I ran the PyTorch example run_mlm.py with the bert-base-chinese model and my own train/validation datasets, but GPU utilization stays low while the CPU is fully loaded. How can I fix this so that the GPUs are fully utilized? When I use the Accelerate library with the same parameters, GPU utilization is almost 100%.
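For reference, this is the kind of launch I would expect to use one process per GPU (DistributedDataParallel) rather than a single process driving all GPUs; the GPU count, file names, and output path below are placeholders, not my actual values:

```shell
# Launch run_mlm.py with one worker process per GPU via torchrun.
# --nproc_per_node should match the number of GPUs on the machine.
torchrun --nproc_per_node=2 run_mlm.py \
    --model_name_or_path bert-base-chinese \
    --train_file my_train.txt \
    --validation_file my_valid.txt \
    --do_train \
    --do_eval \
    --output_dir ./mlm-output
```

If the CPU is the bottleneck during preprocessing or data loading, would raising `--preprocessing_num_workers` and `--dataloader_num_workers` (both accepted by run_mlm.py) also help?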
