Using from_pretrained

Hello, I am coding DeepSeek LLM from scratch, and once I finish I want to load the weights from deepseek-ai/deepseek-llm-7b-base on Hugging Face into my model.

i.e.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/deepseek-llm-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

Except I want to use the transformer class I wrote instead of AutoModelForCausalLM.from_pretrained. How can I achieve this?
Here is the GitHub repo: deepseek-ai/DeepSeek-LLM
Thank you.

I think this function is a good fit for loading a sharded model without using from_pretrained.
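In case it helps, here is a minimal sketch of the manual route: read the checkpoint shards into one state dict, rename the keys to match your own class, and call load_state_dict. It rests on several assumptions. The checkpoint is assumed to be already downloaded to a local folder as *.safetensors shards and to use LLaMA-style parameter names (print the keys to check). The names on the right-hand side (tok_emb, blocks, attn, final_norm) are hypothetical placeholders for whatever your class calls its modules.

```python
import re

# Rename rules: (regex on the checkpoint key, replacement for the custom model).
# Left side: assumed LLaMA-style names. Right side: hypothetical names, edit to
# match your own class.
RENAME_RULES = [
    (r"^model\.embed_tokens\.", "tok_emb."),
    (r"^model\.layers\.(\d+)\.", r"blocks.\1."),
    (r"\.self_attn\.", ".attn."),
    (r"^model\.norm\.", "final_norm."),
]


def remap_key(key):
    """Apply every rename rule in order to one parameter name."""
    for pattern, repl in RENAME_RULES:
        key = re.sub(pattern, repl, key)
    return key


def remap_state_dict(state_dict):
    """Return a new dict with renamed keys and the same tensors."""
    return {remap_key(k): v for k, v in state_dict.items()}


def load_into_custom_model(model, checkpoint_dir):
    """Load all *.safetensors shards in checkpoint_dir into a custom nn.Module."""
    # Imported here so the renaming helpers above work without torch installed.
    from pathlib import Path
    from safetensors.torch import load_file

    state_dict = {}
    for shard in sorted(Path(checkpoint_dir).glob("*.safetensors")):
        state_dict.update(load_file(str(shard)))  # tensors are loaded on CPU

    # strict=True raises if any key is missing or unexpected, which quickly
    # shows which rename rules still need work.
    model.load_state_dict(remap_state_dict(state_dict), strict=True)
    return model
```

You would call it as load_into_custom_model(my_model, "/path/to/checkpoint"), where my_model is an instance of your class. Beyond the names, every parameter shape has to match the checkpoint, so the error message from load_state_dict is usually the fastest way to find what differs.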