Unable to load ALMA-13B model from HF

Garsa3112 · December 13, 2023, 4:41am

I am trying to load the 13B ALMA model by haoranxu from Hugging Face. The download is probably around 55GB, so it takes a while, which is fine. But once the download finishes, the code seems to be stuck in an infinite loop: it does not move forward and nothing else gets loaded.

For reference I am using this code:

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM
from transformers import LlamaTokenizer
#device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load base model and LoRA weights
model = AutoModelForCausalLM.from_pretrained("haoranxu/ALMA-13B", torch_dtype=torch.float16, device_map="auto")
#model = PeftModel.from_pretrained(model, "haoranxu/ALMA-13B-Pretrain-LoRA")
tokenizer = LlamaTokenizer.from_pretrained("haoranxu/ALMA-13B-Pretrain", padding_side='left')

# Add the source sentence into the prompt template
prompt = "Translate this from Chinese to English:\nChinese: 我爱机器翻译。\nEnglish:"
input_ids = tokenizer(prompt, return_tensors="pt", padding=True, max_length=40, truncation=True).input_ids.cuda()

# Translation
with torch.no_grad():
    generated_ids = model.generate(input_ids=input_ids, num_beams=5, max_new_tokens=20, do_sample=True, temperature=0.6, top_p=0.9)
outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
print(outputs)
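
For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. It assumes roughly 13 billion parameters (taken from the "13B" in the model name, not measured), and the helper `weight_footprint_gb` is just a name made up for this illustration. It ignores activations, the KV cache, and any framework overhead.

```python
# Rough estimate of the memory needed for the weights alone.
# Assumption: about 13 billion parameters, inferred from the "13B" in the model name.

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_footprint_gb(num_params: float, dtype: str) -> float:
    """Approximate size of the weights in GB (10**9 bytes) for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    num_params = 13e9  # assumed, not measured
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_footprint_gb(num_params, dtype):.0f} GB")
```

If this estimate is in the right ballpark, a download of around 55GB would be consistent with a checkpoint stored in 32-bit floats, and loading with torch_dtype=torch.float16 would still need on the order of 26 GB for the weights. Whether that is related to the apparent hang is only a guess on my part.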

Any help would be appreciated. Thanks in advance.
