First of all, I really appreciate Hugging Face's great work.
I'm currently using the Trainer, and it's great and efficient.
So, I'm trying to use BERT and some of its variants for two-sequence similarity classification.
I have datasets of equal-meaning sentences and built a dataset with a 'similarity' label for each pair of sentences: 1 for sentences with equal meaning, 0 for sentences with different meanings.
I used AutoModelForSequenceClassification and it worked well.
I passed the two sequences to the tokenizer like this, and it worked well:
outputs = tokenizer(examples['code1'], examples['code2'], padding=False, max_length=MAX_LEN, truncation=True)
But some pretrained models don't accept this type of model definition, so I'm trying to build it myself.
The encoder will be the pretrained model, and the "decoder" will be my own neural net that predicts 0 or 1.
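Here's a minimal sketch of what I mean (the class name `PairSimilarityClassifier` is mine, and the head design, the [CLS] hidden state fed into a linear layer, is just my assumption):

```python
import torch
import torch.nn as nn

class PairSimilarityClassifier(nn.Module):
    # Sketch only: wraps any Hugging Face encoder (e.g. from
    # AutoModel.from_pretrained) with a small head that outputs
    # logits for 0 (different meaning) / 1 (equal meaning).
    def __init__(self, encoder, num_labels=2, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        self.head = nn.Sequential(nn.Dropout(dropout),
                                  nn.Linear(hidden, num_labels))

    def forward(self, input_ids, attention_mask=None, **encoder_kwargs):
        out = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask,
                           **encoder_kwargs)
        cls_vec = out.last_hidden_state[:, 0]  # hidden state at the [CLS] position
        return self.head(cls_vec)              # logits, shape (batch, num_labels)
```

For models that use segment embeddings (like BERT) I'd pass `token_type_ids` through `encoder_kwargs`, and omit it for models that don't accept it.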
My question is: how does AutoModelForSequenceClassification work?
There's the special [CLS] token at the front, which is used for classification,
but in my case I have two sequences of tokens, so is the input [CLS] + [first tokens] + [SEP] + [second tokens]?
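The input format at least can be checked directly from the tokenizer; a quick sketch (assuming bert-base-uncased):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tokenizer("the first sentence", "the second sentence")
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"])
# tokens: [CLS] + first tokens + [SEP] + second tokens + [SEP]
# token_type_ids: 0 for the first segment (incl. [CLS]), 1 for the second
print(tokens)
print(enc["token_type_ids"])
```

So the pair is packed into one sequence, but what the model does with it afterwards is what I'm unsure about.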
Does it work by taking the cosine similarity of the [CLS] and [SEP] tokens?
Or are the first and second sequences average-pooled and then compared for similarity?
I'd really like to know in detail how AutoModelForSequenceClassification handles two-sequence similarity classification, so I can build my own version.
Second question: I'm using callbacks = [EarlyStoppingCallback(early_stopping_patience=10)]
and I'm curious whether there's any option to make the patience reset whenever I get a better result.
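In case there's no built-in option, here's a sketch of the behavior I want as a custom callback (the class name is mine, and I've assumed eval_loss as the tracked metric):

```python
from transformers import TrainerCallback

class ResetPatienceCallback(TrainerCallback):
    # Sketch: stop training after `patience` evaluations without
    # improvement, resetting the counter whenever eval_loss improves.
    def __init__(self, patience=10):
        self.patience = patience
        self.best = None
        self.bad_evals = 0

    def on_evaluate(self, args, state, control, metrics=None, **kwargs):
        loss = (metrics or {}).get("eval_loss")
        if loss is None:
            return
        if self.best is None or loss < self.best:
            self.best = loss
            self.bad_evals = 0            # better result: patience resets
        else:
            self.bad_evals += 1
            if self.bad_evals >= self.patience:
                control.should_training_stop = True
```

Would something like this be the right approach, or does EarlyStoppingCallback already behave this way?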