Hey all,
I’m trying to use declare-lab/flan-alpaca-xl (https://huggingface.co/declare-lab/flan-alpaca-xl) in Transformers with the following code in a locally run Jupyter Notebook:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("declare-lab/flan-alpaca-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("declare-lab/flan-alpaca-xl")
However, I am getting the following error:
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-xl/resolve/main/pytorch_model.bin
OSError: Can't load weights for 'declare-lab/flan-alpaca-xl'. Make sure that:
- 'declare-lab/flan-alpaca-xl' is a correct model identifier listed on 'https://huggingface.co/models'
- or 'declare-lab/flan-alpaca-xl' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.
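To rule out a naming problem on my end, I figure the repo contents can be listed directly with huggingface_hub (a quick sketch; list_repo_files is the only call it needs):

from huggingface_hub import list_repo_files

# Show which weight files actually exist in the repo. A sharded checkpoint
# has pytorch_model.bin.index.json plus pytorch_model-0000X-of-0000Y.bin
# shards instead of a single pytorch_model.bin, which would explain the 404.
print(list_repo_files("declare-lab/flan-alpaca-xl"))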
Trying it with the high-level pipeline helper:
from transformers import pipeline
prompt = "Write an email about an alpaca that likes flan"
model = pipeline(task="text2text-generation", model="declare-lab/flan-alpaca-xl")
model(prompt, max_length=128, do_sample=True)
This also produces the same 404 errors (these logs are from a run against the declare-lab/flan-alpaca-gpt4-xl variant):
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/pytorch_model.bin
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/tf_model.h5
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/pytorch_model.bin
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/tf_model.h5
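As a way to separate downloading from loading, I assume the whole snapshot can be pulled down first with huggingface_hub (untested sketch on my side):

from huggingface_hub import snapshot_download

# Download every file in the repo to the local cache and return the path.
# If this succeeds, the 404 is about transformers asking for a filename
# that doesn't exist in the repo, not about the repo being missing.
local_path = snapshot_download(repo_id="declare-lab/flan-alpaca-xl")
print(local_path)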
I have noticed that the model is sharded in the declare-lab/flan-alpaca-xl repo. This post suggests using the latest transformers; however, despite installing the latest version from pip (4.31.0), transformers is still unable to detect the sharded model.
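Since the error format looks like it comes from an older release, my assumption is that the notebook kernel may be importing a different transformers install than the one pip upgraded. This is the check I would run inside the notebook:

import transformers

# Confirm which version and which install path the kernel actually sees.
# Sharded checkpoints need a reasonably recent transformers (support landed
# around v4.18, if I remember right), so an older import here would explain
# the missing-pytorch_model.bin error.
print(transformers.__version__)
print(transformers.__file__)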
Currently using Python 3.8.11 on a Mac M1.
Any ideas on how to solve this?
Will I need to fork the repo and merge the two shards into a single model file?
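If merging really is needed, my understanding is that no fork is required: once the model loads somewhere (e.g. in an environment with a recent transformers), save_pretrained with a large max_shard_size should re-save it as one file. A rough sketch of what I have in mind (the "20GB" threshold is my guess at something larger than the full checkpoint):

from transformers import AutoModelForSeq2SeqLM

# Load the sharded checkpoint, then re-save it with a shard-size limit
# above the total checkpoint size so everything lands in a single file.
model = AutoModelForSeq2SeqLM.from_pretrained("declare-lab/flan-alpaca-xl")
model.save_pretrained("flan-alpaca-xl-merged", max_shard_size="20GB")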