Softprompt for Llama generating gibberish output

Hello all, I hope this is the right place to ask for help, but I'm not sure where else to go. I'm a complete newbie to training and finetuning models (as in, I have NEVER trained or finetuned a model before), and recently I have been trying to train a softprompt for causal LM with a Llama model (meta-llama/Llama-2-7b-chat-hf, to be specific). I've been running into some significant issues: when I train my softprompt, save it, load it, and use it for inference, it produces absolute gibberish. For example, it produces strings such as

the the: the : :: :t:m:t :m_ :_:_t_m : _t : m[t]:t[m]:m]t
t [t]m[ :] _ : t[ m] : m _[ t]

or

<…> | blues | gospel | soul | funk | p-funk | punk | new | old | young | youn-d | d-a | da | a-t | b-the | be | e | el | em | er | es | eu | uk | us | u- | k | m-th | h-e | n-he | y-u | s-soul | fu | f-rock | pa | ra | ca | co | os | o | w-end | x-mas <…>

I feel like I followed the PEFT documentation pretty closely, but since I don't really know much about training or finetuning models, I wouldn't be surprised if there are fundamental errors in my code. To start, I load the model like this:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

model_name_or_path = "meta-llama/Llama-2-7b-chat-hf"
tokenizer_name_or_path = "meta-llama/Llama-2-7b-chat-hf"

peft_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.RANDOM,
    num_virtual_tokens=8,
    tokenizer_name_or_path=model_name_or_path,
)

model = AutoModelForCausalLM.from_pretrained(model_name_or_path, torch_dtype=torch.float16, device_map="cuda", token=hf_token)  # hf_token defined elsewhere
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name_or_path, token=hf_token)
model = get_peft_model(model, peft_config)
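
As a sanity check, the PEFT docs suggest printing the trainable parameters. If I understand prompt tuning correctly, with 8 virtual tokens and Llama-2-7b's hidden size of 4096, this should report roughly 8 * 4096 = 32,768 trainable parameters:

# Only the virtual-token embeddings should be trainable here
model.print_trainable_parameters()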

Then, my training looks like this:

from transformers import Trainer, TrainingArguments, default_data_collator

training_args = TrainingArguments(
    output_dir="./results",
    logging_dir="./logs",
    logging_steps=100,
    save_steps=100,
    eval_steps=100,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=40,
    learning_rate=5e-5,
    warmup_steps=100,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,  # already placed on GPU via device_map="cuda", so no .to(device) needed
    args=training_args,
    train_dataset=train_tokenized_dataset,
    eval_dataset=dev_tokenized_dataset,
    tokenizer=tokenizer,
    data_collator=default_data_collator,
)

trainer.train()

The training arguments are a bit arbitrary because I don't really know what I'm doing; I asked ChatGPT for suggested values and went from there. Some arguments I had to remove because they were causing errors, like load_best_model_at_end=True.
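
From what I could find, load_best_model_at_end apparently requires the evaluation and save strategies to match, so I'm guessing it expects something like this (untested on my end):

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="steps",  # without this, eval_steps is ignored
    save_strategy="steps",        # must match evaluation_strategy
    eval_steps=100,
    save_steps=100,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)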

My dataset, which I can't share in full because it contains somewhat sensitive data, has a column with the prompt and a column with the target response, which I tokenize with this function I found online:

tokenizer.pad_token = tokenizer.eos_token  # Llama has no pad token by default

def tokenize_data(batch):
    # Tokenize prompts and targets separately, each padded/truncated to 128 tokens,
    # then use the tokenized targets as labels
    inputs = tokenizer(batch["input"], truncation=True, padding="max_length", max_length=128)
    labels = tokenizer([str(x) for x in batch["target"]], truncation=True, padding="max_length", max_length=128)
    inputs["labels"] = labels["input_ids"]
    return inputs
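
I then apply this to my train and dev splits with the datasets map function (train_dataset and dev_dataset here are stand-ins for my actual splits):

# train_dataset / dev_dataset are placeholders for my real splits
train_tokenized_dataset = train_dataset.map(tokenize_data, batched=True, remove_columns=train_dataset.column_names)
dev_tokenized_dataset = dev_dataset.map(tokenize_data, batched=True, remove_columns=dev_dataset.column_names)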

And then, I save the softprompt with

model.save_pretrained(...)
tokenizer.save_pretrained(...)

Then, for inference, I load the model as specified in the Hugging Face docs, with

from peft import PeftConfig, PeftModel

config = PeftConfig.from_pretrained(model_path)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, torch_dtype=torch.float16, device_map="cuda", token=hf_token)
model = PeftModel.from_pretrained(base_model, model_path)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)

where model_path points to the directory where I stored the softprompt, and base_model is the base Llama model I'm working with. I use the same inference function I've used for ages, and with the base Llama model alone it works fine; it's only when I put the softprompt on top that generation fails. I have also noticed the warning "Position ids are not supported for parameter efficient tuning. Ignoring position ids." during inference.
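
In case it's relevant, my generation code boils down to the standard pattern, roughly like this (simplified; my real function has extra handling, and the prompt here is just a stand-in):

# Simplified stand-in for my actual inference function
inputs = tokenizer("some test prompt", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))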

I apologize that the code looks messy; I'm not a good coder, and I've been changing things around for a while trying to fix the issue myself, but to no avail. I'm pretty frustrated and don't know what the issue is. If anyone has any advice, or knows of any tutorials that train softprompts for causal LM, that would be an incredible help.


I think here, Llama's GitHub, or the Hugging Face Discord would all be fine. The Hugging Face Discord seems most appropriate…

"any tutorials that train softprompts for Causal LM"

Hmm… Like these?

"Position ids are not supported for parameter efficient tuning. Ignoring position ids"

Perhaps it’s not supported yet…?