I created a tar file from an Unsloth fine-tuned model (base model: unsloth/gemma-2b-bnb-4bit) using PEFT and pushed it to a GCS bucket. I am downloading the artifacts from the GCS bucket, extracting the files, and saving them locally into a folder. Now, when I try to load the model from this local directory path, I get the error: Exception: expected value at line 1 column 1
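For context, the download-and-extract step looks roughly like this (a sketch; the bucket and blob names are placeholders, using the google-cloud-storage client and the standard tarfile module):

import tarfile
from google.cloud import storage

# Download the archive from GCS (bucket/blob names below are placeholders)
client = storage.Client()
bucket = client.bucket("my-bucket")
bucket.blob("artifacts/model.tar.gz").download_to_filename("model.tar.gz")

# Extract everything into ./model
with tarfile.open("model.tar.gz", "r:gz") as tar:
    tar.extractall(path="./model")

The loading code: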
from unsloth import FastLanguageModel

max_seq_length = 2048  # placeholder: use the same value as during fine-tuning

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = './model',        # local directory extracted from the tar file
    max_seq_length = max_seq_length,
    dtype = None,                  # auto-detect
    load_in_4bit = True,
    local_files_only = True,
)
FastLanguageModel.for_inference(model)  # enable inference mode
How can I load the model from the given directory?
You might have an issue somewhere in this process. The "expected value at line 1 column 1" message comes from a JSON parser hitting an empty or malformed file, so can you please check the sizes of config.json, tokenizer.json, and tokenizer.model:
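Something like this prints the size of each file and whether the JSON files actually parse (a sketch; './model' is the path from the snippet above):

import json
import os

for name in ["config.json", "tokenizer.json", "tokenizer.model"]:
    path = os.path.join("./model", name)
    if not os.path.exists(path):
        print(name, "is missing")
        continue
    print(name, os.path.getsize(path), "bytes")
    # tokenizer.model is a binary SentencePiece file, so only parse the .json files
    if name.endswith(".json"):
        try:
            with open(path) as f:
                json.load(f)
        except ValueError as e:
            print(name, "is not valid JSON:", e)

A zero-byte or truncated file here would produce exactly this parser error.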
A similar issue was reported in the transformers repository (opened 27 Mar 2023, closed 30 May 2023):
### System Info
- `transformers` version: 4.28.0.dev0
- Platform: Linux-5.4.0-144-generic-x86_64-with-glibc2.31
- Python version: 3.9.16
- Huggingface_hub version: 0.13.2
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
### Who can help?
@ArthurZucker
@sgugger
@gante
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
### Reproduction
File "/mnt1/wcp/BEELE/BELLE-main/generate_instruction.py", line 28, in
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 679, in from_pretrained
return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1804, in from_pretrained
return cls._from_pretrained(
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/models/bloom/tokenization_bloom_fast.py", line 118, in init
super().init(
File "/home/appuser/miniconda3/envs/wcppy39/lib/python3.9/site-packages/transformers/tokenization_utils_fast.py", line 111, in init
fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: expected value at line 1 column 1
### Expected behavior
I expect the file to run without this error.
All the model files are of valid size. Interestingly, creating a zip archive and unzipping it back does all the magic. Also, if I simply do git clone <huggingface_model_uri> and then provide the local path when loading the model, it works.
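For anyone who hits the same thing, replacing the tar step with a zip round trip looks roughly like this (a sketch using the standard shutil module; paths are placeholders):

import shutil

# Re-archive the saved model directory as a zip...
shutil.make_archive("model_backup", "zip", "./model")
# ...and unpack it into a fresh directory, then load the model from there
shutil.unpack_archive("model_backup.zip", "./model_restored")

That this works while the tar flow fails points at the original tar creation or extraction leaving one of the JSON files empty or corrupted.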