Hello,
I am currently working on a classification problem using ProtBERT, following the Fine-Tuning Tutorial. I loaded the tokenizer with

tokenizer = AutoTokenizer.from_pretrained("Rostlab/prot_bert")
and then tokenised the sequences as the tutorial shows:

train_encodings = tokenizer(seq_train, truncation=True, padding=True,
                            max_length=1024, return_tensors="pt")
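In case it helps, here is a minimal, self-contained version of my setup, with a toy sequence standing in for my data (I am assuming the standard Rostlab/prot_bert checkpoint, which is what I use above):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Rostlab/prot_bert")
seq_train = ["M K T A Y I A K"]  # toy stand-in; ProtBERT expects space-separated residues
train_encodings = tokenizer(seq_train, truncation=True, padding=True,
                            max_length=1024, return_tensors="pt")
# Decoding the ids back shows exactly which tokens the model will see:
print(tokenizer.decode(train_encodings["input_ids"][0]))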
Unfortunately, the model doesn't seem to be learning (I froze the BERT layers). From reading around, I saw that I may need to add the [CLS] token, and found an option for this:
tokenizer.encode(seq_train[0], add_special_tokens=True)
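Concretely, this is the check I had in mind (again with a toy sequence, using the tokenizer from above):

# Compare ids with and without special tokens:
ids_with = tokenizer.encode("M K T A Y", add_special_tokens=True)
ids_without = tokenizer.encode("M K T A Y", add_special_tokens=False)
print(len(ids_with) - len(ids_without))  # the difference is the number of special tokens added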
Yet the tutorial I am following doesn't seem to require this, and I was wondering why there is a discrepancy, and whether it might be why my model isn't learning.
Thank you