I am currently trying the new model stabilityai/stablecode-completion-alpha-3b on a free Colab notebook with a GPU (12 GB of system RAM and a 14 GB T4 GPU).
This is the code I am using:
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stablecode-completion-alpha-3b")
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stablecode-completion-alpha-3b",
    trust_remote_code=True,
    load_in_8bit=True,
    device="cuda",
)
As soon as the model starts to load, the system RAM fills up until I get the warning that the RAM has been exhausted and the environment is restarted. I don't understand why the model is being loaded into system RAM instead of GPU RAM, or why all 12 GB of system RAM get used up for a quantized 3B model.
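For reference, my understanding from the transformers quantization docs is that 8-bit loading with bitsandbytes is meant to be combined with `device_map` (not a `device` argument) so that accelerate can place the quantized weights directly on the GPU instead of materializing them in system RAM first. A sketch of what I would try (assumes bitsandbytes and accelerate are installed; the actual load is commented out to avoid the multi-GB download here):

```python
# Sketch of a loading call intended to keep the weights off system RAM.
# Assumption: "device_map" (not "device") is the from_pretrained parameter
# that controls weight placement when load_in_8bit=True.
model_id = "stabilityai/stablecode-completion-alpha-3b"

load_kwargs = dict(
    trust_remote_code=True,
    load_in_8bit=True,    # 8-bit quantization via bitsandbytes
    device_map="auto",    # let accelerate dispatch shards onto the GPU
)

# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)
```

I have not been able to confirm on this notebook whether this avoids the system-RAM spike, so corrections are welcome.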