Relation between PEFT model and a regular model with prompt

antonpuz · August 8, 2023, 9:44am

Hello everyone

I’ve followed the Prompt Tuning tutorial from HF and successfully applied it to a model. The implementation is based on a PEFT model.

What I’ve tried to do next is to “get the learned prompt” and apply it as a regular prompt on the same model. Unfortunately this didn’t work for me. Any advice on what could be done here (other than using the PEFT model itself) would be great.
At a high level what I want to do is:
for input text: TEXT

Train a PEFT model on bloomz-7b, run it to get OUTPUT: model(TEXT) -> OUTPUT
Get the learned prompt from the model: PROMPT
Run the base model on the prompt plus the text to get the same output: bloomz-7b(PROMPT + TEXT) -> OUTPUT

Next I’ll take you through the steps I’ve taken and the output I get
0. The setup:

model_name = “bigscience/bloomz-7b1-mt”
initial prompt = “I don’t know what I’m doing”
num_virtual_tokens = 16
TEXT = “Tweet text : @HMRCcustomers No this is my first job Label :”
peft_config = PromptTuningConfig( task_type=TaskType.CAUSAL_LM, prompt_tuning_init=PromptTuningInit.TEXT...)

1. I’ve reused the code from the tutorial and trained the PEFT model for 30 iterations. Applying the PEFT model to TEXT yielded:
Tweet text : @nationalgridus I have no water and the bill is current and paid. Can you do something about this? Label : complaint
Nice!
2. I’ve used model.prepare_inputs_for_generation to get the embeddings of the whole tokenizer vocabulary and of the learned prompt. Applying it to a single token results in 1+num_virtual_tokens tokens. I’ve used this to embed the entire vocab (for each single token, run prepare_inputs_for_generation and take the 17th vector) and to get the embeddings of the 16 virtual tokens (simply take the prepended embedding vectors).
3. I’ve found the most similar token in the vocab to each of the learned prompt tokens using cosine similarity. This didn’t yield a 1.0 match for every token, but all identified tokens had at least 0.99 similarity.
4. Generation of the PROMPT: I’ve taken the token ID of each matched prompt token, created a list of 16 token IDs and ran tokenizer.decode on it, resulting in a string, not all of it English, which is the PROMPT. I’ve tested that encoding the PROMPT with the tokenizer results in exactly 16 tokens.
5. Running PROMPT + TEXT through the base model: I’ve concatenated PROMPT + TEXT, tokenized it and ran it through the base model, in exactly the same manner as before but with the HF bloomz-7b1-mt. The output wasn’t the same as in step 1:
<PROMPT>Tweet text : "@nationalgridus I have no water and the bill is current and paid. Can you do something about this?" Label : 移民The present invention relates to a method of'
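
For reference, the nearest-token lookup described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the code actually used: the helper names `cosine` and `nearest_token` and the toy 3-dimensional vectors are hypothetical stand-ins for the real vocabulary embeddings and the 16 learned prompt vectors. It also measures how far each learned vector is from the vocab row it gets replaced by.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_token(vec, vocab_embeddings):
    """Return (token_id, similarity) for the vocab row most similar to vec."""
    return max(
        ((tid, cosine(vec, row)) for tid, row in enumerate(vocab_embeddings)),
        key=lambda pair: pair[1],
    )

# Toy stand-ins (hypothetical values): 4 vocab rows and 2 learned prompt vectors.
vocab_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.6, 0.8, 0.0],
]
learned_prompt = [
    [2.0, 0.1, 0.0],   # points almost along row 0, but is about twice as long
    [0.6, 0.8, 0.05],  # close to row 3, with a small offset
]

# Snap each learned vector to its closest vocab token.
matches = [nearest_token(v, vocab_embeddings) for v in learned_prompt]
prompt_ids = [tid for tid, _ in matches]

# Euclidean distance between each learned vector and the row that replaces it.
snapped = [vocab_embeddings[tid] for tid in prompt_ids]
gaps = [math.dist(v, s) for v, s in zip(learned_prompt, snapped)]

for (tid, sim), gap in zip(matches, gaps):
    print(f"token id {tid}: cosine similarity {sim:.4f}, distance {gap:.4f}")
```

In this toy case both matches score above 0.99, yet neither learned vector equals the vocab row it is snapped to. Cosine similarity ignores vector length, so a high score does not by itself mean the base model receives the same input embeddings as the PEFT model. Whether that accounts for the mismatch reported here is not established by the sketch.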

Any ideas what I’m doing wrong here?
Thanks!
