Hi, everyone! I have been trying to upload a quantized model to the Hugging Face Hub using the push_to_hub function, but I always receive this error:
AttributeError: 'str' object has no attribute 'data_ptr'
Here’s the code:
from transformers import T5Tokenizer, T5ForConditionalGeneration
from quanto import quantize, freeze, qint8
model_id = "google/flan-t5-base"
quantized_model = T5ForConditionalGeneration.from_pretrained(model_id, low_cpu_mem_usage=True, use_safetensors=True)
quantize(quantized_model, weights=qint8, activations=None)
freeze(quantized_model)
tokenizer = T5Tokenizer.from_pretrained(model_id)
tokenizer.push_to_hub("flan-t5-base-8bit")
quantized_model.push_to_hub("flan-t5-base-8bit")
Thank you in advance!
nielsr
April 29, 2024, 7:39am
Did you ever solve this? I also get it when calling .save_pretrained after the freeze.
An AttributeError like this has too many possible causes to pinpoint from the message alone, but one reported cause is a version of the transformers library that is either too old or too new for the quantization library in use.
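Since the version mismatch is only a suspicion, a first step could be to print what is actually installed. This is a minimal sketch using only the standard library; the package names in the list are assumptions (for example, quanto may be installed under a different distribution name in your environment), so adjust them as needed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed under this name.
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed package names; edit the list to match your setup.
    names = ["transformers", "safetensors", "quanto", "bitsandbytes", "torch"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Posting this output together with the traceback makes it easier for others to tell whether the versions are a plausible cause.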
opened 02:03PM - 12 Jul 23 UTC
closed 05:57PM - 12 Jul 23 UTC
### System Info
===================================BUG REPORT================… ===================
Welcome to bitsandbytes. For bug reports, please run
python -m bitsandbytes
================================================================================
bin /opt/conda/envs/pytorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
CUDA SETUP: CUDA runtime path found: /opt/conda/envs/pytorch/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /opt/conda/envs/pytorch/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so...
[2023-07-12 13:52:54,626] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
- `transformers` version: 4.30.2
- Platform: Linux-5.15.0-1038-aws-x86_64-with-glibc2.31
- Python version: 3.10.12
- Huggingface_hub version: 0.16.2
- Safetensors version: 0.3.1
- PyTorch version (GPU?): 2.0.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
### Reproduction
Hi,
I'm trying to save a quantized model. My first attempt didn't work (I also opened an issue, https://github.com/huggingface/accelerate/issues/1713, to clarify it).
I opened this issue because I receive an error message when I run the following code. I'm not sure I'm following the instructions at https://huggingface.co/docs/transformers/main_classes/quantization correctly, because the documentation pushes the model to the Hub, whereas I want to save it to the local filesystem. Thanks in advance for your help.
```
### load packages ###
import transformers
import textwrap
from transformers import LlamaTokenizer, LlamaForCausalLM
import os
import sys
from typing import List
import accelerate
from peft import (
    LoraConfig,
    get_peft_model,
    get_peft_model_state_dict,
    prepare_model_for_int8_training,
)
#import fire
import torch
from datasets import load_dataset
import pandas as pd
import deepspeed
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
DEVICE
### load model ###
BASE_MODEL = "decapoda-research/llama-7b-hf"
model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.save_pretrained(save_directory="quantized_decapoda-research_llama-7b-hf_v2")
```
Error Message:
```
/opt/conda/envs/pytorch/lib/python3.10/site-packages/transformers/modeling_utils.py:1709: UserWarning: You are calling `save_pretrained` to a 8-bit converted model you may likely encounter unexepected behaviors. If you want to save 8-bit models, make sure to have `bitsandbytes>0.37.2` installed.
warnings.warn(
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[3], line 1
----> 1 model.save_pretrained(save_directory="quantized_decapoda-research_llama-7b-hf_v2")
File /opt/conda/envs/pytorch/lib/python3.10/site-packages/transformers/modeling_utils.py:1820, in PreTrainedModel.save_pretrained(self, save_directory, is_main_process, state_dict, save_function, push_to_hub, max_shard_size, safe_serialization, variant, **kwargs)
1817 weights_name = SAFE_WEIGHTS_NAME if safe_serialization else WEIGHTS_NAME
1818 weights_name = _add_variant(weights_name, variant)
-> 1820 shards, index = shard_checkpoint(state_dict, max_shard_size=max_shard_size, weights_name=weights_name)
1822 # Clean the folder from a previous save
1823 for filename in os.listdir(save_directory):
File /opt/conda/envs/pytorch/lib/python3.10/site-packages/transformers/modeling_utils.py:318, in shard_checkpoint(state_dict, max_shard_size, weights_name)
315 storage_id_to_block = {}
317 for key, weight in state_dict.items():
--> 318 storage_id = id_tensor_storage(weight)
320 # If a `weight` shares the same underlying storage as another tensor, we put `weight` in the same `block`
321 if storage_id in storage_id_to_block:
File /opt/conda/envs/pytorch/lib/python3.10/site-packages/transformers/pytorch_utils.py:290, in id_tensor_storage(tensor)
283 def id_tensor_storage(tensor: torch.Tensor) -> Tuple[torch.device, int, int]:
284 """
285 Unique identifier to a tensor storage. Multiple different tensors can share the same underlying storage. For
286 example, "meta" tensors all share the same storage, and thus their identifier will all be equal. This identifier is
287 guaranteed to be unique and constant for this tensor's storage during its lifetime. Two tensor storages with
288 non-overlapping lifetimes may have the same id.
289 """
--> 290 return tensor.device, storage_ptr(tensor), storage_size(tensor)
AttributeError: 'str' object has no attribute 'device'
```
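The traceback shows the save path looping over `state_dict.items()` and reading `.device` on each value, so the failure suggests that at least one entry in the state dict is a plain string rather than a tensor. The following is a small diagnostic sketch, not part of transformers: `non_tensor_entries` is a hypothetical helper, and the stand-in class and key names in the demo are made up for illustration.

```python
def non_tensor_entries(state_dict):
    """Return {key: type name} for values that do not look like tensors."""
    suspects = {}
    for key, value in state_dict.items():
        # The failing code reads `.device` (and the forum error mentions
        # `.data_ptr`), so any value lacking these attributes would break it.
        if not (hasattr(value, "device") and hasattr(value, "data_ptr")):
            suspects[key] = type(value).__name__
    return suspects


class FakeTensor:
    """Stand-in for torch.Tensor so the demo runs without PyTorch."""
    device = "cpu"

    def data_ptr(self):
        return 0


if __name__ == "__main__":
    # Made-up keys: one tensor-like entry and one string entry.
    demo = {"layer.weight": FakeTensor(), "layer.weight_format": "row"}
    print(non_tensor_entries(demo))
```

With a real model you would call `non_tensor_entries(model.state_dict())` right before `save_pretrained`; any keys it reports are the entries the sharding code cannot handle.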
### Expected behavior
Save the quantized model to the local filesystem.