Why does AutoModelForCausalLM.from_pretrained() work on base models but not instruct models?

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
```

loads the model successfully, but

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
```

results in the following error:

```
  File "train.py", line 59, in <module>
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct", token=access_token)
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory meta-llama/Llama-3.1-8B-Instruct.
```

If you try to load weights that are not in the Hugging Face format, you can get that error, but this repo does appear to contain HF-format safetensors at the root…

Only the `original/` folder in the repo holds the weights in Meta's own (non-HF) format…
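
You can verify that yourself by listing the repo contents. A quick sketch using `list_repo_files` from huggingface_hub (assumes you have accepted the license and are authenticated for the gated repo, e.g. via `huggingface-cli login`):

```python
from huggingface_hub import list_repo_files

# HF-format weights (model-*.safetensors plus model.safetensors.index.json)
# should appear at the repo root; Meta's own checkpoint format lives under original/.
for f in list_repo_files("meta-llama/Llama-3.1-8B-Instruct"):
    print(f)
```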

Weird. Do you also get this error message with Llama-3.1-70B-Instruct?
I would download the model first and then set the appropriate path. That worked for me:

```python
from huggingface_hub import snapshot_download

def download_model_to_cache(model_id: str):
    try:
        # Download the full model snapshot into the local HF cache
        # (local_dir=None keeps the default cache location).
        snapshot_download(repo_id=model_id, local_dir=None)
        print("\n✓ Model successfully downloaded to cache!")
    except Exception as e:
        print(f"\n❌ Error downloading {model_id}: {str(e)}")
        raise
```
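
Usage then looks roughly like this (a sketch; once the snapshot is cached, `from_pretrained()` resolves the same repo id from the local cache):

```python
from transformers import AutoModelForCausalLM

# Pre-populate the local cache, then load by repo id as usual.
download_model_to_cache("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
```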

Same here. I managed to resolve this problem by downloading the model first with `huggingface-cli download xxx` and then explicitly pointing from_pretrained() to the download path (as observed above, you might have to run convert_llama_weights_to_hf.py if the model weights are not in HF format).
In sum, explicitly downloading the model works; I'm just not sure why loading it directly with from_pretrained() fails. See the sketch below.
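
A minimal sketch of that workflow in Python (the local directory name here is just an example):

```python
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM

# Download the repo to an explicit local directory, equivalent to
# `huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ...`.
local_path = snapshot_download(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",
    local_dir="./llama-3.1-8b-instruct",
)

# Point from_pretrained() at the downloaded files instead of the hub id.
model = AutoModelForCausalLM.from_pretrained(local_path)
```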

