I'm trying to build a training loop based on this script, but I run into the following issue:
50%|████████████████████                    | 77/154 [00:52<00:43, 1.77it/s][INFO|trainer.py:3829] 2024-09-03 02:07:25,225 >>
***** Running Evaluation *****
[INFO|trainer.py:3831] 2024-09-03 02:07:25,225 >> Num examples = 305
[INFO|trainer.py:3834] 2024-09-03 02:07:25,225 >> Batch size = 8
0%| | 0/20 [00:00<?, ?it/s]
After the first epoch, the evaluation loop starts but stays at 0% with no progress and no error message. While this is happening, GPU usage sits at 100% and memory is also being consumed.
Is there a way to enable more detailed debug logging, or has anyone come across this issue before? Thanks.
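For reference, one generic way to see where a silent hang like this is stuck is Python's standard `faulthandler` module, which can print every thread's stack trace at a fixed interval. The sketch below is only an assumption about how it could be wired into a training script: the helper names `enable_hang_diagnostics` and `disable_hang_diagnostics` and the 120-second interval are made up for illustration, and it does not by itself explain the stall. Separately, `transformers.utils.logging.set_verbosity_debug()` raises the library's log level.

```python
import faulthandler
import sys


def enable_hang_diagnostics(timeout_s=120.0, stream=None):
    """Dump the stack of every thread to `stream` each `timeout_s` seconds.

    Hypothetical helper: call it once near the top of the training script.
    `stream` must be a real file object with a file descriptor.
    """
    if stream is None:
        stream = sys.stderr
    # repeat=True keeps dumping at every interval until cancelled, so a
    # stalled evaluation loop shows the same frames again and again.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def disable_hang_diagnostics():
    """Stop the periodic stack dumps (for example after training finishes)."""
    faulthandler.cancel_dump_traceback_later()
```

If the same frames show up in consecutive dumps during evaluation, that is probably the call that is not returning (for example a metric computation or a generation step), which narrows down where to look next.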