Why don't I have access to all the GPU's VRAM?

I run out of memory in my Space very often. When I do, I get an error like this:
RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 7.93 GiB total capacity; 4.04 GiB already allocated; 470.94 MiB free; 4.16 GiB reserved in total by PyTorch)
My question is…why do I have only 4GB of VRAM to play with?
Why is the rest of the memory reserved?
I have access to all 12GB of T4 VRAM on Google Colab and never run out of memory there.
What am I doing wrong?
Am I being stupid?

Thank you and God Bless

Hi @cumprod, were you using a T4 small instance? Can you share your Space link?

Yep, I’m using a T4 small instance.
https://huggingface.co/spaces/cumprod/xbox

Thanks for looking
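
For anyone reading the numbers in the error message quoted at the top of the thread, here is a minimal sketch that does the arithmetic. It only parses the text of the message; it does not touch the GPU. The helper names (`parse_oom`, `unaccounted_mib`) are made up for illustration and are not part of PyTorch, and the regular expressions assume the message wording shown in this thread, which may differ in other PyTorch versions.

```python
import re

# Unit multipliers relative to MiB, matching the units in the message.
_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# Hypothetical field names mapped to patterns for this message wording.
_FIELDS = {
    "requested": r"Tried to allocate ([\d.]+) (KiB|MiB|GiB)",
    "total": r"([\d.]+) (KiB|MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (KiB|MiB|GiB) already allocated",
    "free": r"([\d.]+) (KiB|MiB|GiB) free",
    "reserved": r"([\d.]+) (KiB|MiB|GiB) reserved in total",
}


def parse_oom(message):
    """Return the sizes named in the OOM message, converted to MiB."""
    sizes = {}
    for name, pattern in _FIELDS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"field not found: {name}")
        value, unit = match.groups()
        sizes[name] = float(value) * _TO_MIB[unit]
    return sizes


def unaccounted_mib(sizes):
    """Memory that is neither free nor reserved by this PyTorch process."""
    return sizes["total"] - sizes["reserved"] - sizes["free"]


message = (
    "RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB "
    "(GPU 0; 7.93 GiB total capacity; 4.04 GiB already allocated; "
    "470.94 MiB free; 4.16 GiB reserved in total by PyTorch)"
)

sizes = parse_oom(message)
outside = unaccounted_mib(sizes)

print(f"total capacity:        {sizes['total']:.2f} MiB")
print(f"reserved by PyTorch:   {sizes['reserved']:.2f} MiB")
print(f"free:                  {sizes['free']:.2f} MiB")
print(f"neither free nor ours: {outside:.2f} MiB")
print(f"request fits in free:  {sizes['requested'] <= sizes['free']}")
```

Read this way, the message reports 7.93 GiB of total capacity, of which 4.04 GiB is already allocated by the process itself, so the 4 GB figure is memory in use rather than a cap. The remainder that is neither free nor reserved by this PyTorch process is held by something else, and the message alone does not say what. On a running instance, `torch.cuda.mem_get_info()` and `nvidia-smi` give a live view of the same numbers.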