I’m having an issue with some models when trying to set id2label/label2id in the model config. I download the models once and reload them locally so I don’t have to re-download them constantly while I debug and learn the libraries.
Here’s my code:
from transformers import AutoTokenizer, AutoConfig
from transformers import AutoModelForSequenceClassification
model_path = "../model/pretrained/"
#model_name = "distilbert/distilbert-base-multilingual-cased"
#model_name = "distilbert-base-uncased"
model_name = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
print("Downloading Tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_name)
id2label = {0: 'action', 1: 'adventure', 2: 'crime', 3: 'family', 4: 'fantasy', 5: 'horror', 6: 'mystery', 7: 'romance', 8: 'scifi', 9: 'thriller'}
label2id = {'action': 0, 'adventure': 1, 'crime': 2, 'family': 3, 'fantasy': 4, 'horror': 5, 'mystery': 6, 'romance': 7, 'scifi': 8, 'thriller': 9}
print("Downloading Model")
config = AutoConfig.from_pretrained(model_name, label2id=label2id, id2label=id2label)
model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)
print("Saving Model")
model.save_pretrained(model_path)
print("Saving Tokenizer")
tokenizer.save_pretrained(model_path)
print("Testing Load from Disk")
print("Loading Tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_path)
print("Loading Config...")
config = AutoConfig.from_pretrained(model_path, local_files_only=True) #, label2id=label2id, id2label=id2label)
print("Loading Model...")
model = AutoModelForSequenceClassification.from_pretrained(
    model_path, config=config, local_files_only=True
)
print("Done loading")
The issue: with the model "distilbert/distilbert-base-uncased-finetuned-sst-2-english", if I set id2label in the config, either when downloading or when reloading from disk, I get the following error:
RuntimeError: Error(s) in loading state_dict for Linear:
size mismatch for bias: copying a param with shape torch.Size([2]) from checkpoint, the shape in current model is torch.Size([10]).
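If it helps anyone reproduce this, my understanding (which may be wrong, hence the question) is that the SST-2 checkpoint ships a 2-unit classifier head, so a 10-label config can't load its weights. The only way I found to avoid the crash is `ignore_mismatched_sizes=True`, which seems to throw away the old head and initialize a fresh 10-unit one; a minimal sketch:

```python
from transformers import AutoModelForSequenceClassification

id2label = {0: 'action', 1: 'adventure', 2: 'crime', 3: 'family', 4: 'fantasy',
            5: 'horror', 6: 'mystery', 7: 'romance', 8: 'scifi', 9: 'thriller'}
label2id = {v: k for k, v in id2label.items()}

# The fine-tuned SST-2 checkpoint stores a classifier of shape [2];
# forcing 10 labels mismatches unless the old head is discarded.
# ignore_mismatched_sizes=True re-initializes the classifier randomly,
# so the new 10-way head would still need fine-tuning before use.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    num_labels=10,
    id2label=id2label,
    label2id=label2id,
    ignore_mismatched_sizes=True,
)
```

With that flag the load completes (with a warning about newly initialized weights) instead of raising the size-mismatch RuntimeError.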
Can some models only do binary classification?
With either of the other models commented out in my code, setting id2label in the config at download time works, and I don’t even need to pass it again when reloading from disk (it appears to be saved into config.json by save_pretrained).
If I don’t set id2label when I download, but do set it when I load locally, the program crashes with the same error as above.
Is either behavior how models are supposed to work, or am I doing something wrong? The examples I found for multiclass classification left a lot to be desired (lots of typos and broken code) and focus almost entirely on binary classification.
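For comparison, this is the pattern that works for me with the base (non-fine-tuned) checkpoints: since the base model has no sequence-classification head in its weights, from_pretrained builds a fresh head sized to whatever label mapping I give it. A minimal sketch of that case:

```python
from transformers import AutoModelForSequenceClassification

# distilbert-base-uncased ships no classification head, so a fresh
# 10-way head is created to match num_labels; transformers only warns
# that the classifier weights are newly initialized (i.e. untrained).
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=10,
)
```

This is the case where saving and reloading locally also works without re-passing the label maps.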