Using transformers (BERT, RoBERTa) without embedding layer

Hi @tueboesen,

Yes, it will work. It can give you results very close to MSA-based methods, sometimes even better. If you combine it with MSA, it will give you even better results than MSA methods alone.

We have trained Transformer-XL, XLNet, BERT, ALBERT, ELECTRA, and T5 on the Uniref100 and BFD datasets. I would recommend simply using one of these models, because reaching good results requires a tremendous amount of computing power.
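For example, one of these checkpoints can be loaded straight from the Hugging Face Hub with the `transformers` library. This is just a minimal sketch; it assumes the ProtBert checkpoint name `Rostlab/prot_bert` and the usual space-separated amino-acid input format, so adjust it to whichever of the models you pick:

```python
import re
import torch
from transformers import BertModel, BertTokenizer

# Assumed checkpoint name for the ProtTrans BERT model on the Hugging Face Hub
model_name = "Rostlab/prot_bert"

tokenizer = BertTokenizer.from_pretrained(model_name, do_lower_case=False)
model = BertModel.from_pretrained(model_name)
model.eval()

# ProtBert-style input: residues separated by spaces, rare amino acids mapped to X
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
sequence = " ".join(re.sub(r"[UZOB]", "X", sequence))

inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-residue embeddings: (batch, sequence_length, hidden_size)
residue_embeddings = outputs.last_hidden_state
# A simple per-protein embedding: mean over the residue positions
protein_embedding = residue_embeddings.mean(dim=1)
print(residue_embeddings.shape, protein_embedding.shape)
```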

You can find them here:

You can find more details on our paper:

Facebook also trained RoBERTa on the Uniref50 dataset:

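If you want to try that model instead, it can be loaded through Facebook's fair-esm package. The checkpoint name and API in this sketch follow that repository's documented usage and are assumptions on my side, not part of our pipeline:

```python
import torch
import esm  # pip install fair-esm (assumed package name)

# ESM-1b checkpoint trained on Uniref50 (assumed name from the fair-esm repo)
model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()
batch_converter = alphabet.get_batch_converter()
model.eval()

data = [("protein1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")]
labels, strs, tokens = batch_converter(data)

with torch.no_grad():
    results = model(tokens, repr_layers=[33])

# Per-residue representations from the final (33rd) layer
token_representations = results["representations"][33]
print(token_representations.shape)
```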
Unfortunately, we don’t have a notebook for training from scratch, but you can find more details on how to replicate our results here:

@patrickvonplaten:
You meant:

Not:

:slight_smile:

ProtTrans: Provides SOTA pre-trained models for protein sequences.
CodeTrans: Provides SOTA pre-trained models for computer source code.
