Because of some dastardly security block, I’m unable to download a model (specifically distilbert-base-uncased) through my IDE. I’m using simpletransformers (built on top of Hugging Face Transformers, or at least it uses its models). I also tried the from_pretrained method when using Hugging Face directly, but the error is the same:
OSError: Can't load weights for 'distilbert-base-uncased'
From where can I download this pretrained model so that I can load it locally?
# In a Google Colab, install git-lfs
!sudo apt-get install git-lfs
!git lfs install
# Then clone the model repository
!git clone https://huggingface.co/ORGANIZATION_OR_USER/MODEL_NAME

# Load the model from the local clone
from transformers import AutoModel
model = AutoModel.from_pretrained('./MODEL_NAME')
For instance:
# In a Google Colab, install git-lfs
!sudo apt-get install git-lfs
!git lfs install
# Then clone the model repository
!git clone https://huggingface.co/facebook/bart-base

# Load the model from the local clone
from transformers import AutoModel
model = AutoModel.from_pretrained('./bart-base')
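If git is also blocked in your environment, a possible alternative is the huggingface_hub library, which downloads over plain HTTPS. A minimal sketch, assuming huggingface_hub is installed and the machine can reach huggingface.co:

from huggingface_hub import snapshot_download

# Download every file in the repo and return the local folder path
local_path = snapshot_download(repo_id="distilbert-base-uncased")

from transformers import AutoModel
model = AutoModel.from_pretrained(local_path)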
Hi thomwolf,
But how do I download only the PyTorch model? I found that git clone also downloads the TensorFlow weights, which are useless and time-consuming for me.
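One option, sketched here under the assumption that the repo stores its PyTorch weights as pytorch_model.bin, is to skip the LFS download during the clone and then pull only the file you need:

# Clone without downloading any large LFS files (only small pointer stubs)
!GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/facebook/bart-base
# Then fetch just the PyTorch weights
!cd bart-base && git lfs pull --include "pytorch_model.bin"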
How can I specify revision=float16 here? I am trying to download the GPT-J-6B model and it is around 24GB, but with revision=float16 I can get it at 12GB.
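On the Hub, a revision is just a git branch or tag, so you can clone only that branch, or pass revision directly to from_pretrained (the GPT-J model card documents this usage):

# Clone only the float16 branch of the repo
!git clone --branch float16 https://huggingface.co/EleutherAI/gpt-j-6B

# Or let transformers fetch that revision for you
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision="float16")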
I have followed the same steps to download mpt-7B-instruct, but when loading the model I get the following error:
Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack
Any idea why? I have all the files from the repo, but I suspect the from_pretrained method is assuming I’m loading a different type of model and is looking for a different weights file.
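Hard to say for certain, but one common cause is that git-lfs was not active during the clone, so the weight files on disk are small pointer stubs instead of the real checkpoints. A quick check, assuming the clone lives in ./mpt-7B-instruct:

# Files marked with '*' are fully downloaded; '-' means only an LFS pointer is present
!cd mpt-7B-instruct && git lfs ls-files
# Fetch any missing LFS objects
!cd mpt-7B-instruct && git lfs pull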
How can I download models from Hugging Face directly into a specified directory on my local machine, rather than having them download automatically into the cache location?
The model_id parameter can take a folder location, so if you find out where the model has been downloaded to, you can put that in instead of the model_id.
So if your model was downloaded to c:/llama-chat, you would change the line to:
model = AutoModelForCausalLM.from_pretrained("c:/llama-chat", device_map="auto")
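If you want to control where the download lands in the first place, from_pretrained also accepts a cache_dir argument, and huggingface_hub's snapshot_download can place plain files into a folder of your choosing. A sketch, assuming a reasonably recent huggingface_hub; "c:/my-models/distilbert" is just an example path:

from huggingface_hub import snapshot_download

# Download the repo's files directly into the given folder
snapshot_download(repo_id="distilbert-base-uncased", local_dir="c:/my-models/distilbert")

from transformers import AutoModel
model = AutoModel.from_pretrained("c:/my-models/distilbert")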