Instantiating multiple heads at the same time

I have several different classification heads that all use the same backbone language model (BERT). I’m working on a project that needs several of these models running at once. Unfortunately, I am running into CUDA out-of-memory errors.

Is it possible to “share” the common backbone across these models, so that it is loaded only once and takes up less memory on my GPU?
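
To make the question concrete, here is a minimal PyTorch sketch of the kind of sharing I have in mind: one backbone instance held in memory, with every head reading the same features. Everything named here is an assumption for illustration. `TinyBackbone` is a hypothetical stand-in for BERT so the snippet runs without downloading weights, and `SharedBackboneClassifier` and the head names are made up. This would presumably only work if every head was trained against identical backbone weights (for example, a frozen backbone); if the backbone was fine-tuned separately for each head, the copies differ and cannot simply be merged.

```python
import torch
from torch import nn


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for BERT: token ids -> one feature vector per sequence."""

    def __init__(self, vocab_size=100, hidden_size=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, input_ids):
        # Mean-pool the token embeddings; a real BERT would supply its pooled output here.
        pooled = self.embed(input_ids).mean(dim=1)
        return torch.tanh(self.proj(pooled))


class SharedBackboneClassifier(nn.Module):
    """One backbone instance, many small classification heads (illustrative names)."""

    def __init__(self, backbone, hidden_size, head_sizes):
        super().__init__()
        self.backbone = backbone  # stored once, so its weights occupy GPU memory once
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_size, n_classes) for name, n_classes in head_sizes.items()}
        )

    def forward(self, input_ids):
        # Run the expensive backbone a single time, then reuse the features for every head.
        features = self.backbone(input_ids)
        return {name: head(features) for name, head in self.heads.items()}


device = "cuda" if torch.cuda.is_available() else "cpu"
model = SharedBackboneClassifier(
    TinyBackbone(), hidden_size=32, head_sizes={"sentiment": 2, "topic": 5}
).to(device).eval()

input_ids = torch.randint(0, 100, (4, 16), device=device)  # batch of 4 dummy sequences
with torch.no_grad():  # inference only: skip storing activations for backprop
    logits = model(input_ids)
```

With this layout the only per-task memory cost is each head's linear layer, and a batch passes through the backbone once rather than once per task.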