Determining if a model will run locally

Basically, it comes down to the file size of the model weights. That size changes depending on whether you quantize and to how many bits, and, if you don't quantize, whether you load the weights in 32-bit or 16-bit precision.
On top of the model itself, VRAM is also consumed in proportion to the length of the context being generated.
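
As a minimal sketch of that arithmetic (the function name and the byte-per-parameter figures are my own illustration, not tied to any particular loader), the weight footprint is just the parameter count times the bytes each parameter takes at a given precision:

```python
# Rough weight-size estimate: parameter count x bytes per parameter.
# Ignores quantization block overhead and the extra VRAM used for the
# context during generation.

BYTES_PER_PARAM = {
    "fp32": 4.0,  # unquantized, 32-bit
    "fp16": 2.0,  # unquantized, 16-bit
    "int8": 1.0,  # 8-bit quantization
    "int4": 0.5,  # 4-bit quantization
}

def weight_size_gb(params_billion: float, precision: str) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * BYTES_PER_PARAM[precision]

for precision in BYTES_PER_PARAM:
    print(f"8B model, {precision}: ~{weight_size_gb(8, precision):.1f}GB")
# fp32 ~32GB, fp16 ~16GB, int8 ~8GB, int4 ~4GB
```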

Roughly speaking, assuming 4-bit quantization, a model will run if the VRAM capacity in GB is about the same as the model's parameter count in billions (B). (To run a 4-bit quantized 8B model, you'll need about 8GB: the weights themselves come to roughly 4.5GB, and the rest is used during inference. A 12B model would need about 12GB. This is just a rough guide.)
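
Here is that rule of thumb as a rough sanity check (a sketch under the assumptions above: 4-bit weights at about 0.5GB per billion parameters, with roughly the same amount again kept free for inference; the function name is mine):

```python
# Rule-of-thumb fit check: at 4-bit quantization, a model needs roughly
# as many GB of VRAM as it has billions of parameters (about half for
# the weights, the rest left free for inference).

def fits_in_vram(params_billion: float, vram_gb: float,
                 bytes_per_param: float = 0.5) -> bool:
    weights_gb = params_billion * bytes_per_param  # e.g. 8B @ 4-bit -> ~4GB
    needed_gb = weights_gb * 2                     # rough margin for inference
    return needed_gb <= vram_gb

print(fits_in_vram(8, 8))                          # 8B @ 4-bit on 8GB  -> True
print(fits_in_vram(12, 8))                         # 12B @ 4-bit on 8GB -> False
print(fits_in_vram(0.1, 24, bytes_per_param=4.0))  # RoBERTa in fp32 on 24GB -> True
```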

And since RoBERTa is only about 0.1B parameters (roughly 0.4GB of weights even in 32-bit), you should be fine running it without quantization on 24GB of VRAM.