Prompt Tuning For Sequence Classification

I am trying the prompt tuning for Hate Speech Classification. I have gone through several papers and found that there are two types of prompts- Hard prompts, and Soft prompts.

I am trying hard prompts for my task.

How can we use PEFT model for our task?

If not, then what are other possible ways for Prompt tuning for BERT Model?

No one could possibly answer your question as you have simply not provided any useful details.

Sorry for not providing the code as I was not implemented the code.

My implementation is:

from transformers import AutoModelForSequenceClassification,  AutoConfig
from peft import PeftModelForSequenceClassification, get_peft_config

config = AutoConfig.from_pretrained("google/muril-base-cased")

config = {
    "peft_type": "PREFIX_TUNING",
    "task_type": "SEQ_CLS",
    "inference_mode": False,
    "num_virtual_tokens": 20,
    "token_dim": 768,
    "num_transformer_submodules": 1,
    "num_attention_heads": 12,
    "num_layers": 12,
    "encoder_hidden_size": 768,
    "prefix_projection": False,

peft_config = get_peft_config(config)

model = AutoModelForSequenceClassification.from_pretrained(PRE_TRAINED_MODEL_NAME, num_labels=len(class_names))
peft_model = PeftModelForSequenceClassification(model, peft_config)
peft_model =

My training function is:

def train_epoch(model, data_loader, loss_fn, optimizer, device, scheduler, n_examples):
    model = model.train()

    losses = []
    correct_predictions = 0
    progress_bar = tqdm(range(num_training_steps))

    for d in data_loader:
        input_ids = d["input_ids"].to(device)
        attention_mask = d["attention_mask"].to(device)
        targets = d["targets"].to(device)

        outputs = model(

        outputs = F.softmax(outputs.logits, dim=-1)

        _, preds = torch.max(outputs, dim=1)
        loss = loss_fn(outputs, targets) #.unsqueeze(1))        
        correct_predictions += torch.sum(preds == targets)

        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

    return correct_predictions.double() / n_examples, np.mean(losses)

Hyperparameters setup

patience = 10

optimizer = AdamW(model.parameters(), lr=2e-5, correct_bias=False)

num_training_steps = EPOCHS * len(train_data_loader)

scheduler = get_linear_schedule_with_warmup(

loss_fn = torch.nn.CrossEntropyLoss(weight=class_weights).to(device)

Why the training is stuck to 0.632 validation accuracy?
Am I doing something wrong?

Epoch 1/100
100%|██████████| 36600/36600 [01:28<00:00, 413.98it/s]Train loss 0.6931820511817932 accuracy 0.5681176772427948

Val   loss 0.6930877877318341 accuracy 0.6320109439124487

Best model found in Epoch 1 with val_loss 0.6930877877318341


Epoch 2/100
100%|██████████| 36600/36600 [01:29<00:00, 407.90it/s]Train loss 0.6931309700012207 accuracy 0.6001881467544685

Val   loss 0.6930594962576161 accuracy 0.6320109439124487

Best model found in Epoch 2 with val_loss 0.6930594962576161


Epoch 3/100
100%|██████████| 36600/36600 [01:29<00:00, 408.18it/s]Train loss 0.6931337714195251 accuracy 0.6069443256649277

Val   loss 0.6930221902287524 accuracy 0.6320109439124487

Best model found in Epoch 3 with val_loss 0.6930221902287524


Epoch 4/100
 72%|███████▏  | 26400/36600 [01:04<00:24, 408.29it/s]Train loss 0.6931185126304626 accuracy 0.6079705806893013

Val   loss 0.6930161818214085 accuracy 0.6320109439124487

Best model found in Epoch 4 with val_loss 0.6930161818214085


Epoch 5/100
100%|██████████| 36600/36600 [01:29<00:00, 408.66it/s]Train loss 0.6931254863739014 accuracy 0.6085692294535192

Val   loss 0.6930019298325414 accuracy 0.6320109439124487

Best model found in Epoch 5 with val_loss 0.6930019298325414


Accuracy is a discrete metric (unlike pretty much all loss metrics used with neural networks) and thus it’s entirely possible that your loss could be decreasing while your accuracy is flatlining. Obviously, for any model, neither validation loss nor accuracy will ever reach 0/100%, respectively, unless your validation data is your training data. The point of the validation data is to help us decide when to stop training- if your training loss is decreasing but your validation loss has flatlined or is even increasing for multiple epochs, you are likely only going to overfit if you keep training.

Have you tried out different learning rates? Prompt Tuning sometimes requires surprisingly high learning rates of for instance 0.3. This post is quite good in my opinion: Guiding Frozen Language Models with Learned Soft Prompts – Google AI Blog

Which dataset are you using?

I have a question: If I use PEFT, Do I have to change the Bert tokenizer to include the prompt tokens?