`flan-alpaca-xl` model does not appear to have a file named `pytorch_model.bin` despite a sharded checkpoint being present

Hey all,

I’m trying to use declare-lab/flan-alpaca-xl (https://huggingface.co/declare-lab/flan-alpaca-xl) in Transformers with the following code in a locally run Jupyter notebook:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("declare-lab/flan-alpaca-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("declare-lab/flan-alpaca-xl")

However, I am getting the following error:

404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-xl/resolve/main/pytorch_model.bin
OSError: Can't load weights for 'declare-lab/flan-alpaca-xl'. Make sure that:

- 'declare-lab/flan-alpaca-xl' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'declare-lab/flan-alpaca-xl' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.

Trying it with the high-level pipeline helper:

from transformers import pipeline

prompt = "Write an email about an alpaca that likes flan"
model = pipeline(task="text2text-generation", model="declare-lab/flan-alpaca-xl")
model(prompt, max_length=128, do_sample=True)

This also produces the same 404 errors:

404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/pytorch_model.bin
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/tf_model.h5
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/pytorch_model.bin
404 Client Error: Not Found for url: https://huggingface.co/declare-lab/flan-alpaca-gpt4-xl/resolve/main/tf_model.h5

I have noticed that the checkpoint in declare-lab/flan-alpaca-xl is sharded. This post suggests using the latest transformers; however, despite installing the latest version from pip (4.31.0), it is still unable to detect the sharded model.
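Since the wording of the traceback looks like it comes from an older transformers release, it may be worth double-checking which version the notebook kernel itself imports, as it can differ from what pip reports when several environments are installed. A quick sketch (the 4.18.0 threshold for sharded-checkpoint support is my understanding, so treat it as an assumption):

```python
# Check which transformers version the notebook kernel actually sees;
# it can differ from what `pip` reports if multiple environments exist.
from importlib.metadata import PackageNotFoundError, version

def supports_sharded_checkpoints(ver: str) -> bool:
    # To my understanding, sharded checkpoints were added in
    # transformers 4.18.0; older versions only look for a single
    # pytorch_model.bin (hence the 404 in the traceback above).
    major, minor = (int(part) for part in ver.split(".")[:2])
    return (major, minor) >= (4, 18)

try:
    installed = version("transformers")
    print(installed, "supports shards:", supports_sharded_checkpoints(installed))
except PackageNotFoundError:
    print("transformers is not installed in this environment")
```

If this prints a version older than what pip shows, the notebook is picking up a different environment than the one that was upgraded.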

I’m currently using Python 3.8.11 on an M1 Mac.

Any ideas on how to solve this?

Will I need to fork the repo and merge the two shards together into one model?
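For context on that last question: my understanding is that a sufficiently new transformers should not need the shards merged, because when pytorch_model.bin is absent it looks for an index file (pytorch_model.bin.index.json) that maps weights to the numbered shard files. A minimal sketch of that detection logic (the helper itself is hypothetical; the filenames follow the Hub convention):

```python
# Sketch of how a loader can tell a single-file checkpoint from a sharded
# one: when pytorch_model.bin is absent, newer transformers looks for
# pytorch_model.bin.index.json, which maps each weight to its shard.
def checkpoint_kind(repo_files):
    if "pytorch_model.bin" in repo_files:
        return "single"
    if "pytorch_model.bin.index.json" in repo_files:
        return "sharded"
    return "missing"

# A flan-alpaca-xl style listing: an index file plus numbered shards.
files = [
    "config.json",
    "pytorch_model.bin.index.json",
    "pytorch_model-00001-of-00002.bin",
    "pytorch_model-00002-of-00002.bin",
]
print(checkpoint_kind(files))  # sharded
```

So if the installed transformers really supported shards, it should never have requested pytorch_model.bin in the first place, which is why the version mismatch above seems like the likely culprit.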