Regenerate prompt tuning result with prepended prompt on base model

Hello everyone

I’ve followed the Prompt Tuning tutorial from HF and successfully applied it to a model - the implementation is based on a PEFT model.

What I’ve tried to do next is to “get the learned prompt” and apply it as a regular text prompt on the same base model. Unfortunately this didn’t work for me, so any advice on what could be done here would be great (other than just using the PEFT model as-is).
At a high level what I want to do is:
for input text: TEXT

  1. Train a PEFT model on bloomz-7b1-mt and run it to get OUTPUT: model(TEXT) -> OUTPUT
  2. Get the learned prompt from the model: PROMPT
  3. Run the base model to get the same output: bloomz-7b1-mt(PROMPT + TEXT) -> OUTPUT
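
In (pseudo)code, the equivalence I’m after looks roughly like this - peft_model, base_model and PROMPT are placeholder names for the prompt-tuned PEFT model, the plain bloomz-7b1-mt, and the recovered textual prompt; tokenize/decode stand for the usual tokenizer calls:

    # step 1: the prompt-tuned PEFT model
    peft_output = peft_model.generate(tokenize(TEXT))
    # step 3: the plain base model with the recovered prompt prepended as text
    base_output = base_model.generate(tokenize(PROMPT + TEXT))
    # what I'd like to hold:
    decode(peft_output) == decode(base_output)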

Next I’ll take you through the steps I took and the output I got; a rough code sketch of the whole procedure follows after the list of steps.
0. The setup:

  • model_name = "bigscience/bloomz-7b1-mt"
  • initial prompt = “I don’t know what I’m doing”
  • num_virtual_tokens = 16
  • TEXT = “Tweet text : @HMRCcustomers No this is my first job Label :”
  • peft_config = PromptTuningConfig( task_type=TaskType.CAUSAL_LM, prompt_tuning_init=PromptTuningInit.TEXT...)
  1. I reused the code from the tutorial and trained the PEFT model for 30 iterations. Applying the PEFT model to TEXT yielded:
    Tweet text : @nationalgridus I have no water and the bill is current and paid. Can you do something about this? Label : complaint
    Nice!

  2. I used model.prepare_inputs_for_generation to get the embeddings of the whole tokenizer vocabulary as well as the learned prompt. Applying it to a single input token returns 1 + num_virtual_tokens embedding vectors, so I used it both to embed the entire vocab (for each single token, run prepare_inputs_for_generation and take the 17th vector) and to get the embeddings of the 16 virtual tokens (simply take the prepended embedding vectors).

  3. I found the most similar vocab token to each of the learned prompt tokens using cosine similarity. This didn’t yield a 1.0 match for every token, but all identified tokens had at least 0.99 similarity.

  4. Generating the PROMPT - I took the token ID of each matched token, built a list of 16 token IDs and ran tokenizer.decode on it, resulting in a string (not all of it English) which is the PROMPT. I verified that encoding the PROMPT with the tokenizer yields exactly 16 tokens.

  5. Running PROMPT + TEXT through the base model - I concatenated PROMPT + TEXT, tokenized it and ran it through the base model, in exactly the same manner as before but with the plain HF bloomz-7b1-mt. The output wasn’t the same as in step 1:
    <PROMPT>Tweet text : "@nationalgridus I have no water and the bill is current and paid. Can you do something about this?" Label : 移民The present invention relates to a method of'
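
For completeness, here is roughly what all of this looks like in code. It’s a sketch rather than my exact script: the config arguments follow the HF tutorial, the training loop is omitted, and the variable names are illustrative.

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model

    # Steps 0-1: prompt tuning setup and generation with the PEFT model.
    model_name = "bigscience/bloomz-7b1-mt"
    num_virtual_tokens = 16
    TEXT = "Tweet text : @HMRCcustomers No this is my first job Label :"

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    peft_config = PromptTuningConfig(
        task_type=TaskType.CAUSAL_LM,
        prompt_tuning_init=PromptTuningInit.TEXT,
        prompt_tuning_init_text="I don't know what I'm doing",
        num_virtual_tokens=num_virtual_tokens,
        tokenizer_name_or_path=model_name,
    )
    peft_model = get_peft_model(AutoModelForCausalLM.from_pretrained(model_name), peft_config)
    # ... training loop from the tutorial (30 iterations) ...

    inputs = tokenizer(TEXT, return_tensors="pt")
    peft_out = peft_model.generate(**inputs, max_new_tokens=10)
    print(tokenizer.decode(peft_out[0], skip_special_tokens=True))  # step 1 output

Steps 2-5 then look roughly like this. One shortcut compared with the description above: I read the vocab embeddings directly off the base model’s input embedding table, which should be equivalent to collecting the 17th vector one token at a time:

    import torch.nn.functional as F

    # Step 2: prepare_inputs_for_generation prepends the learned virtual tokens,
    # so a single input token yields 1 + num_virtual_tokens embedding vectors.
    single_token = tokenizer("Tweet", return_tensors="pt").input_ids[:, :1]
    model_inputs = peft_model.prepare_inputs_for_generation(input_ids=single_token)
    embeds = model_inputs["inputs_embeds"][0].detach()      # (1 + num_virtual_tokens, hidden)
    prompt_embeds = embeds[:num_virtual_tokens]             # the 16 learned prompt vectors

    # The vocab embeddings: one row per token in the base model's input embedding table.
    vocab_embeds = peft_model.base_model.get_input_embeddings().weight.detach()

    # Step 3: most similar vocab token per virtual token, by cosine similarity.
    sims = F.normalize(prompt_embeds.float(), dim=-1) @ F.normalize(vocab_embeds.float(), dim=-1).T
    best_sims, best_ids = sims.max(dim=-1)                  # best_sims were all >= 0.99 for me

    # Step 4: decode the matched IDs into the textual PROMPT (re-encodes to 16 tokens).
    PROMPT = tokenizer.decode(best_ids.tolist())
    assert len(tokenizer(PROMPT).input_ids) == num_virtual_tokens

    # Step 5: run the plain base model on PROMPT + TEXT.
    base_model = AutoModelForCausalLM.from_pretrained(model_name)  # or reuse peft_model.base_model
    base_inputs = tokenizer(PROMPT + TEXT, return_tensors="pt")
    base_out = base_model.generate(**base_inputs, max_new_tokens=10)
    print(tokenizer.decode(base_out[0], skip_special_tokens=True))  # not the same as step 1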

Any ideas what I’m doing wrong here?
Thanks!