CUDA out of memory on NVIDIA A10G + CodeLlama on Hugging Face Spaces


The 2xA10G large hardware already provides 48 GB of VRAM in total, but I still hit out-of-memory errors. How can I fix this?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 896.00 MiB. GPU 0 has a total capacty of 21.99 GiB of which 599.00 MiB is free. Process 252091 has 21.39 GiB memory in use. Of the allocated memory 20.77 GiB is allocated by PyTorch, and 345.35 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
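From the error it looks like everything is being placed on GPU 0 (21.99 GiB) rather than being spread across both A10Gs. For context, below is a minimal sketch of how I understand the model would need to be loaded so it shards over both GPUs and uses the allocator setting the error message suggests. The checkpoint name `codellama/CodeLlama-13b-hf` and the `max_split_size_mb:512` value are assumptions on my part, not what my Space currently does. Is something along these lines the right fix?

import os

# Suggested by the error message to reduce fragmentation; must be set before CUDA is initialized.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:512")

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the actual checkpoint used in the Space may differ.
model_id = "codellama/CodeLlama-13b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision instead of the fp32 default
    device_map="auto",           # let accelerate shard layers across both A10Gs
    low_cpu_mem_usage=True,
)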