Regarding the eval batch size for large models

I am training T5-base with an effective batch size of 3264 (train_batch_size_per_gpu=8, gradient_accumulation_steps=51, n_gpus=8) on 8 V100 GPUs. Training runs fine without any memory errors, but during the evaluation step (with eval_batch_size_per_gpu=8) I get an out-of-memory error, even though the eval batch size is the same as the train batch size. It's weird that I hit the memory error only during evaluation.

I also tried halving it to eval_batch_size_per_gpu=4, but I still get the memory error. Can someone help me figure out what's going on?
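For context, the evaluation step follows the usual PyTorch pattern sketched below. This is a toy stand-in (a small `nn.Linear` and random tensors, not the actual T5/transformers script), but the loop structure is the same; I believe `model.eval()` and `torch.no_grad()` matter here, since without `no_grad()` the forward pass keeps the autograd graph alive and eval can use more memory than training:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the real model and dataset (assumption: the actual
# script uses T5-base via transformers; shapes here are arbitrary).
model = torch.nn.Linear(16, 4)
loader = DataLoader(TensorDataset(torch.randn(32, 16)), batch_size=8)

model.eval()                       # disable dropout etc. for evaluation
total = 0
with torch.no_grad():              # skip building the autograd graph
    for (batch,) in loader:
        logits = model(batch)      # forward only; activations freed eagerly
        total += batch.size(0)
print(total)                       # number of examples evaluated
```

If `torch.no_grad()` (or `torch.inference_mode()`) is missing, every eval forward pass stores activations for a backward that never happens, which could explain an OOM at the same batch size that trains fine.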