Softprompt for Llama generating gibberish output

Hello all, I hope this is the right place to ask for help, but I'm not sure where else to go. I'm a complete newbie to training and finetuning models (as in, I have NEVER trained or finetuned a model before), and recently I have been trying to train a softprompt for causal LM with a Llama model (meta-llama/Llama-2-7b-chat-hf, to be specific). I've been running into some significant issues: when I train my softprompt, save it, load it, and use it for inference, it produces absolute gibberish. For example, it produces strings such as

the the: the : :: :t:m:t :m_ :_:_t_m : _t : m[t]:t[m]:m]t
t [t]m[ :] _ : t[ m] : m _[ t]

or

<…> | blues | gospel | soul | funk | p-funk | punk | new | old | young | youn-d | d-a | da | a-t | b-the | be | e | el | em | er | es | eu | uk | us | u- | k | m-th | h-e | n-he | y-u | s-soul | fu | f-rock | pa | ra | ca | co | os | o | w-end | x-mas <…>

I feel like I followed the PEFT documentation pretty closely, but since I don't really know much about training or finetuning models, I wouldn't be surprised if there are fundamental errors in my code. To start, I load the model like this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

model_name_or_path = "meta-llama/Llama-2-7b-chat-hf"
tokenizer_name_or_path = "meta-llama/Llama-2-7b-chat-hf"

peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.RANDOM,
    num_virtual_tokens=8,
    tokenizer_name_or_path=model_name_or_path,
)

model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.float16, device_map="cuda", token=hf_token)  # hf_token defined elsewhere
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path, token=hf_token)
model = get_peft_model(model, peft_config)
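
As a sanity check, the PEFT docs suggest printing the trainable parameters. If I understand prompt tuning correctly, with 8 virtual tokens and Llama-2-7b's hidden size of 4096, this should report roughly 8 * 4096 = 32,768 trainable parameters:

# Only the virtual-token embeddings should be trainable here
model.print_trainable_parameters()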

Then, my training looks like this:

from transformers import Trainer, TrainingArguments, default_data_collator

training_args = TrainingArguments(
    output_dir="./results",
    logging_dir="./logs",
    logging_steps=100,
    save_steps=100,
    eval_steps=100,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=40,
    learning_rate=5e-5,
    warmup_steps=100,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,  # already placed on GPU via device_map="cuda", so no .to(device) needed
    args=training_args,
    train_dataset=train_tokenized_dataset,
    eval_dataset=dev_tokenized_dataset,
    tokenizer=tokenizer,
    data_collator=default_data_collator,
)

trainer.train()

The training arguments are a bit arbitrary because I don't really know what I'm doing; I asked ChatGPT for suggested values and went from there. Some arguments I had to remove because they were causing errors, like load_best_model_at_end=True.
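
From what I could find, load_best_model_at_end apparently requires the evaluation and save strategies to match, so I'm guessing it expects something like this (untested on my end):

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="steps",  # without this, eval_steps is ignored
    save_strategy="steps",        # must match evaluation_strategy
    eval_steps=100,
    save_steps=100,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)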

My dataset, which I can't share in full because it contains somewhat sensitive data, has a column with the prompt and a column with the target response, which I tokenize with this function I found online:

tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

def tokenize_data(batch):
    # Tokenize prompts and targets separately, each padded/truncated to 128 tokens,
    # then use the tokenized targets as labels
    inputs = tokenizer(batch["input"], truncation=True, padding="max_length", max_length=128)
    labels = tokenizer([str(x) for x in batch["target"]], truncation=True, padding="max_length", max_length=128)
    inputs["labels"] = labels["input_ids"]
    return inputs
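
I then apply this to my train and dev splits with the datasets map function (train_dataset and dev_dataset here are stand-ins for my actual splits):

# train_dataset / dev_dataset are placeholders for my real splits
train_tokenized_dataset = train_dataset.map(tokenize_data, batched=True, remove_columns=train_dataset.column_names)
dev_tokenized_dataset = dev_dataset.map(tokenize_data, batched=True, remove_columns=dev_dataset.column_names)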

And then, I save the softprompt with

model.save_pretrained(...)
tokenizer.save_pretrained(...)

Then, for inference, I load the model as specified in the Hugging Face docs, with

from peft import PeftConfig, PeftModel

config = PeftConfig.from_pretrained(model_path)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, torch_dtype=torch.float16, device_map="cuda", token=hf_token)
model = PeftModel.from_pretrained(base_model, model_path)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)

where model_path points to the directory where I stored the softprompt, and base_model is the base Llama model I'm working with. I use the same inference function I've used for ages, and with the base Llama model alone it works fine; it's only when I put the softprompt on top that generation fails. I have also noticed the warning "Position ids are not supported for parameter efficient tuning. Ignoring position ids." during inference.
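
In case it's relevant, my generation code boils down to the standard pattern, roughly like this (simplified; my real function has extra handling, and the prompt here is just a stand-in):

# Simplified stand-in for my actual inference function
inputs = tokenizer("some test prompt", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))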

I apologize that the code looks messy; I'm not a good coder, and I've been changing things around for a while trying to fix the issue myself, but to no avail. I'm pretty frustrated and don't know what the issue is. If anyone has any advice, or knows of any tutorials that train softprompts for causal LM, that would be an incredible help.


I think here, Llama's GitHub, or the Hugging Face Discord would all be fine. The Hugging Face Discord seems most appropriate…

"any tutorials that train softprompts for Causal LM"

Hmm… Like these?

"Position ids are not supported for parameter efficient tuning. Ignoring position ids"

Perhaps it’s not supported yet…?