Hello, I am implementing the DeepSeek LLM architecture from scratch, and once I finish I want to load the pretrained weights from deepseek-ai/deepseek-llm-7b-base on Hugging Face into my model.
For reference, this is how the model is normally loaded:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-llm-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
The difference is that I want to use the transformer class I wrote myself instead of AutoModelForCausalLM.from_pretrained. How can I load the same weights into my own class?
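To make the question concrete, here is a rough sketch of what I imagine is needed: renaming the checkpoint's parameter names to match my own class before loading them. Everything in it is an assumption rather than something I have verified. I am guessing the checkpoint uses Llama-style key names such as model.layers.0.self_attn.q_proj.weight, and the target names (tok_emb, blocks, attn, ffn, norm1, norm2, final_norm, out_head) are hypothetical placeholders for whatever my class ends up using.

```python
import re

# Hypothetical rename rules: (pattern for a checkpoint key, replacement).
# Both sides are assumptions. Print the real keys of the checkpoint and of
# my_model.state_dict() and adjust these before relying on them.
RENAME_RULES = [
    (r"^model\.embed_tokens\.", "tok_emb."),
    (r"^model\.layers\.(\d+)\.self_attn\.", r"blocks.\1.attn."),
    (r"^model\.layers\.(\d+)\.mlp\.", r"blocks.\1.ffn."),
    (r"^model\.layers\.(\d+)\.input_layernorm\.", r"blocks.\1.norm1."),
    (r"^model\.layers\.(\d+)\.post_attention_layernorm\.", r"blocks.\1.norm2."),
    (r"^model\.norm\.", "final_norm."),
    (r"^lm_head\.", "out_head."),
]


def rename_key(key):
    """Apply the first matching rename rule to a single parameter name."""
    for pattern, replacement in RENAME_RULES:
        new_key, count = re.subn(pattern, replacement, key)
        if count:
            return new_key
    return key  # keys that match no rule pass through unchanged


def remap_keys(state_dict):
    """Return a new dict with every key renamed; values are left untouched."""
    return {rename_key(key): value for key, value in state_dict.items()}
```

With PyTorch, I would then expect something like my_model.load_state_dict(remap_keys(model.state_dict()), strict=True) to work, where my_model is an instance of my own class and model is the one loaded above. With strict=True, load_state_dict raises an error on any missing or unexpected key, which should reveal names I mapped wrongly. This only handles renaming; the tensor shapes and layer layout in my class would still have to match the checkpoint.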
Here is the GitHub repo: deepseek-ai/DeepSeek-LLM (DeepSeek LLM: Let there be answers)
Thank you.