How are pretrained models trained?


How are pre-trained models obtained?
The library always downloads a pre-trained model from the server, but how is that model originally trained? Is it also trained using the transformers library?

In most, if not all, built-in cases (e.g. bert-base-cased), the original paper implementations are ported to the transformers architecture. (You can have a look at the conversion scripts, e.g.,

User models (e.g. username/mymodel-uncased) may have been trained in other ways and ported to the transformers architecture manually with custom scripts, or they may have been trained using the transformers library directly.

Thanks for the reply.
How can I train my own model from scratch using transformers?

You can have a look here: