I’m curious to try either the 3B or even the big 11B T5 model (preferably in the pipeline) for summarization.
Are these two T5 models usable via the pipeline? (My understanding is that they are, but maybe I’m already going wrong at this point.)
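To make the question concrete, here is a minimal sketch of the kind of call I mean. It assumes the `transformers` `pipeline` API and the Hub model id `t5-3b`. The helper name `load_summarizer` and its `device` default are just illustrative, not anything from the library docs.

```python
MODEL_ID = "t5-3b"  # assumption: Hub id of the 3B checkpoint; "t5-11b" would be the 11B one


def load_summarizer(model_id: str = MODEL_ID, device: int = -1):
    """Build a summarization pipeline for the given T5 checkpoint.

    device=-1 keeps the model on CPU, device=0 puts it on the first GPU.
    Hypothetical helper for illustration only.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed or any weights to be downloaded.
    from transformers import pipeline

    return pipeline("summarization", model=model_id, device=device)


# Usage (downloads several GB of weights on first call, so left commented out):
# summarizer = load_summarizer()
# print(summarizer("Some long article text ...", max_length=60, min_length=10))
```

Whether this actually loads then comes down to whether the machine has enough memory, which is the part I’m unsure about.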
If they are, my guess is that my VM configuration isn’t sufficient, because I can only load “t5-large” on either a Google notebook (15GB RAM, 1x Tesla T4) or my Datalore Pro account (16GB RAM, 1x Tesla T4).
Is this enough for the 3B model, or do I need at least 2 GPUs / 32 vCPUs?
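For a rough sense of scale, here is a back-of-the-envelope sketch that estimates the memory needed just to hold the weights. The parameter counts are the nominal ones implied by the model names (assumed, not measured), and the function name is made up for illustration.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 4) -> float:
    """Memory needed to hold the weights alone, in GiB.

    bytes_per_param: 4 for float32, 2 for float16.
    """
    return num_params * bytes_per_param / 1024**3


# Assumption: nominal parameter counts taken from the model names.
NOMINAL_PARAMS = {"t5-3b": 3e9, "t5-11b": 11e9}

for name, n in NOMINAL_PARAMS.items():
    fp32 = weight_memory_gib(n, 4)
    fp16 = weight_memory_gib(n, 2)
    print(f"{name}: ~{fp32:.1f} GiB in float32, ~{fp16:.1f} GiB in float16")
```

This counts the weights only, so it is a lower bound. Loading the checkpoint and running inference need additional memory on top of it.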
I was aware of that notebook, but the only relevant comment I could find was that the Colab TPU doesn’t have enough memory to fine-tune the 11B model. I thought there must be some advice on how much RAM you need to at least run the 3B model.
Actually, I just asked for more vCPUs on GCP and got the 3B model running with 32 vCPUs, if anyone is interested.