Hi there!
How can I load falcon-7b in anything that requires less vRAM than bfloat16? When I try 8-bit loading in Colab:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct",
    trust_remote_code=True,
    device_map="auto",
    load_in_8bit=True,
)
```

the session crashes, even though I'm using a GPU with 15 GB of RAM.
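For reference, I was also considering dropping down to 4-bit. Here's a rough sketch of what I'd try (this is just my guess at the setup, assuming the `bitsandbytes` and `accelerate` packages are installed in the Colab session):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization config; compute in bfloat16
# (assumption: bitsandbytes + accelerate are installed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct",
    trust_remote_code=True,
    device_map="auto",
    quantization_config=bnb_config,
)
```

Would this fit in 15 GB, or is there something else eating the memory in my 8-bit attempt?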