Why does AutoModelForCausalLM.from_pretrained() work on base models but not instruct models?

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
```

loads the model successfully, but

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
```

results in the following error:

```
  File "train.py", line 59, in <module>
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct", token=access_token)
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory meta-llama/Llama-3.1-8B-Instruct.
```

If you try to load weights that are not in the Hugging Face format, you can get that error, but this repo does appear to contain HF-format safetensors at the root…

Only the `original/` folder in the repo holds the weights in Meta's own (non-HF) format…
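
You can verify that yourself by listing the repo contents. A quick sketch using `list_repo_files` from huggingface_hub (assumes you have accepted the license and are authenticated for the gated repo, e.g. via `huggingface-cli login`):

```python
from huggingface_hub import list_repo_files

# HF-format weights (model-*.safetensors plus model.safetensors.index.json)
# should appear at the repo root; Meta's own checkpoint format lives under original/.
for f in list_repo_files("meta-llama/Llama-3.1-8B-Instruct"):
    print(f)
```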

Weird. Do you also get this error message with Llama-3.1-70B-Instruct?
I would download the model first and then set the appropriate path. That worked for me:

```python
from huggingface_hub import snapshot_download

def download_model_to_cache(model_id: str):
    try:
        # Download the full model snapshot into the local HF cache
        # (local_dir=None keeps the default cache location).
        snapshot_download(repo_id=model_id, local_dir=None)
        print("\n✓ Model successfully downloaded to cache!")
    except Exception as e:
        print(f"\n❌ Error downloading {model_id}: {str(e)}")
        raise
```
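
Usage then looks roughly like this (a sketch; once the snapshot is cached, `from_pretrained()` resolves the same repo id from the local cache):

```python
from transformers import AutoModelForCausalLM

# Pre-populate the local cache, then load by repo id as usual.
download_model_to_cache("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
```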

Same here. I managed to resolve this problem by downloading the model first with `huggingface-cli download xxx` and then explicitly pointing from_pretrained() to the download path (as observed above, you might have to run convert_llama_weights_to_hf.py if the model weights are not in HF format).
In sum, explicitly downloading the model works; I'm just not sure why loading it directly with from_pretrained() fails. See the sketch below.
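
A minimal sketch of that workflow in Python (the local directory name here is just an example):

```python
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM

# Download the repo to an explicit local directory, equivalent to
# `huggingface-cli download meta-llama/Llama-3.1-8B-Instruct --local-dir ...`.
local_path = snapshot_download(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",
    local_dir="./llama-3.1-8b-instruct",
)

# Point from_pretrained() at the downloaded files instead of the hub id.
model = AutoModelForCausalLM.from_pretrained(local_path)
```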

