Greetings, I am trying to run the model by following the Installation section of the manual, but I ran into a problem. My steps are described below.
Operating system: Ubuntu 20.04.6 installed on WSL
Updated the packages:
sudo apt update
sudo apt upgrade -y
Installed Python:
sudo apt-get update
sudo apt-get install git curl python3-pip make gcc libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget llvm libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl
curl https://pyenv.run | bash
export PATH="$HOME/.pyenv/bin:$PATH"
eval "$(pyenv init --path)"
echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n  eval "$(pyenv init -)"\nfi' >> ~/.bashrc
pyenv install 3.11.3
pyenv global 3.11.3
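To double-check that pyenv actually switched the default interpreter, I ran this quick check (not part of the manual; with 3.11.3 set globally I expect it to print (3, 11)):

```python
import sys

# Print the active interpreter's major/minor version.
# With `pyenv global 3.11.3` in effect this should be (3, 11).
print(tuple(sys.version_info[:2]))
```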
Created and activated the virtual environment:
python -m venv .env
source .env/bin/activate
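As a sanity check that the venv is really active (again just a diagnostic, not from the manual):

```python
import sys

# Inside an activated venv, sys.prefix points at the venv directory,
# while sys.base_prefix points at the base interpreter it was created from.
in_venv = sys.prefix != sys.base_prefix
print("venv active:", in_venv)
```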
Installed TensorFlow:
pip install --upgrade tensorflow
Installed PyTorch:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
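Since these are the cu118 wheels running under WSL, I also checked whether PyTorch can see the GPU. This is a diagnostic sketch of my own; it only imports torch if the package is actually present:

```python
import importlib.util

# find_spec() looks a package up without importing it, so this is safe
# to run even in an environment where torch is missing.
has_torch = importlib.util.find_spec("torch") is not None
if has_torch:
    import torch
    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
else:
    print("torch is not importable in this environment")
```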
Installed Flax:
pip install --upgrade git+https://github.com/google/flax.git
Installed transformers:
pip install transformers
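After all the installs, I ran this quick check to confirm each package resolves from the venv (a diagnostic I wrote myself; it does not import the heavy packages, it only looks them up):

```python
import importlib.util

# find_spec() locates each package on sys.path without importing it.
packages = ("tensorflow", "torch", "torchvision", "torchaudio", "flax", "transformers")
status = {pkg: importlib.util.find_spec(pkg) is not None for pkg in packages}
for pkg, ok in status.items():
    print(f"{pkg}: {'found' if ok else 'MISSING'}")
```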
Next I try to download and run the model:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")
The line
tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
throws the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/dave/.env/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 711, in from_pretrained
return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dave/.env/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1812, in from_pretrained
return cls._from_pretrained(
^^^^^^^^^^^^^^^^^^^^^
File "/home/dave/.env/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 1975, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/dave/.env/lib/python3.11/site-packages/transformers/models/t5/tokenization_t5_fast.py", line 133, in __init__
super().__init__(
File "/home/dave/.env/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 120, in __init__
raise ValueError(
ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a `tokenizers` library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
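The last line of the error points at sentencepiece, which none of my install steps above mention. Here is a check (my own diagnostic, not from the manual) of whether it is present in the venv:

```python
import importlib.util

# The T5/T0 slow tokenizer is converted to a fast one via sentencepiece;
# check whether the package resolves at all in this environment.
has_sentencepiece = importlib.util.find_spec("sentencepiece") is not None
print("sentencepiece importable:", has_sentencepiece)
```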
What am I doing wrong?