I’m attempting to fine-tune a DeciCoder model on my own data and I’m encountering an error.
My issue is that I can’t find any documentation on the naive_attention_prefill attribute that the error refers to, so I don’t actually understand how to fix it.
import datasets
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorWithPadding,
    TrainingArguments,
    Trainer,
    AutoConfig,
)

checkpoint = "Deci/DeciCoder-1b"
device = "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True).to(device)

# Using dummy data since I'm just trying to get the architecture to work
train_list = [
    {'text': ";lkj;kjlkj;lkj;kjasdfasdfasdfadsf"},
    {'text': "asdfasdfafds;lj;lkj;lkj;salkjfd;akdsjf;asdf"},
]
test_list = [
    {'text': "asdfasdfadsfasdfasdfasdfasdfasdfadsf"},
    {'text': "etrpoiqetrpqetqpoiewr.nn,n.,n.n.asdfasdqewr"},
]

dataset_prepped = datasets.DatasetDict({
    'train': datasets.Dataset.from_list(train_list),
    'test': datasets.Dataset.from_list(test_list),
})

def tokenize_function(example):
    return tokenizer(example['text'], padding="max_length", truncation=True)

tokenized_datasets = dataset_prepped.map(tokenize_function, batched=True)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
training_args = TrainingArguments("test-trainer")
trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()
Error:
ValueError: For support of custom attention masks please set naive_attention_prefill to True in the config
I did attempt to follow the suggestion:
config = AutoConfig.from_pretrained(checkpoint, naive_attention_prefill=True)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True, config=config).to(device)
However, I then get the following error:
ValueError: The model class you are passing has a `config_class` attribute that is not consistent with the config class you passed (model has and you passed . Fix one of those so they match!
It seems the original model wasn’t built with naive_attention_prefill set to True. I’d like to resolve the error, but I’d also just like to know where this attribute is documented - I haven’t found it anywhere.
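One variant I haven’t tried yet, in case it helps narrow things down: my guess is that AutoConfig.from_pretrained without trust_remote_code=True loads a built-in config class rather than the custom config class defined in the repo’s remote code, which would explain the config_class mismatch. An untested sketch of what I’d try next (naive_attention_prefill here is just the attribute named in the ValueError, not something I’ve found documented):

```python
# Untested sketch: load the config with trust_remote_code=True so it is the
# same custom config class the remote model class expects, then set the
# attribute the ValueError asks for before constructing the model.
from transformers import AutoConfig, AutoModelForCausalLM

checkpoint = "Deci/DeciCoder-1b"

config = AutoConfig.from_pretrained(checkpoint, trust_remote_code=True)
config.naive_attention_prefill = True  # attribute named by the ValueError

model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    config=config,
    trust_remote_code=True,
).to("cpu")
```

Is this the right way to set a custom-model config attribute like this, or is there a supported flag I’m missing?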