Modifying architecture of the models provided by the library

JohnnySalami · March 5, 2022, 9:06am

Are the models provided here easy to modify if I want to make custom changes to the architecture? My plan is: pretrain a GPT model on my native language → add/modify layers while keeping the trained parameters → finetune the model.

I am adding/modifying intermediate layers, not adding layers before or after the model.

Is it fine to modify the model code directly, or will that break anything?
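
For reference, one common way to do this without editing the library source is to load the model first and then swap submodules in place, so every already-trained parameter stays where it is. The following is only a minimal sketch under some assumptions: `MLPWithAdapter` and `add_adapters` are hypothetical names made up for illustration, the bottleneck adapter is just one possible kind of intermediate layer, and a tiny randomly initialized `GPT2Config` stands in for the pretrained checkpoint (in practice you would load your own with `GPT2LMHeadModel.from_pretrained(...)`).

```python
import torch
from torch import nn
from transformers import GPT2Config, GPT2LMHeadModel


class MLPWithAdapter(nn.Module):
    """Hypothetical wrapper: keeps the trained MLP and adds a small adapter after it."""

    def __init__(self, mlp, hidden_size, bottleneck=16):
        super().__init__()
        self.mlp = mlp  # the original (already trained) MLP, reused as-is
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        # Zero-init the last projection so the adapter starts as a no-op
        # and the modified model initially behaves like the original one.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, hidden_states):
        out = self.mlp(hidden_states)
        return out + self.up(torch.relu(self.down(out)))


def add_adapters(model, bottleneck=16):
    """Wrap the MLP of every transformer block in place."""
    for block in model.transformer.h:
        block.mlp = MLPWithAdapter(block.mlp, model.config.n_embd, bottleneck)
    return model


# Tiny random model as a stand-in for a pretrained checkpoint (assumption).
config = GPT2Config(vocab_size=100, n_positions=32, n_embd=32, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))
with torch.no_grad():
    logits_before = model(input_ids).logits

add_adapters(model)
model.eval()
with torch.no_grad():
    logits_after = model(input_ids).logits

# The trained weights are untouched and the new layers start as identity.
outputs_match = torch.allclose(logits_before, logits_after)
```

Note that wrapping a submodule changes the parameter names in the `state_dict` (for example, the MLP weights move under `...mlp.mlp...`), so in this sketch the modification has to be applied after the pretrained weights are loaded, and again before reloading a checkpoint saved from the modified model.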

sheoran95 · March 13, 2023, 11:06am

Hey man, did you find a solution to this? I want to load the pretrained model weights and then add a layer after the encoder of the T5 model for a translation task. I’m not sure how to edit the architecture of the given models.
Please share if you did!
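
For the T5 case, one possible approach (a sketch, not the library's official recipe) is to attach the new layer to the loaded model and apply it to the encoder output through a PyTorch forward hook, so the pretrained encoder and decoder weights are left alone. Assumptions here: `add_post_encoder_layer` and the attribute name `post_encoder` are made up for illustration, the extra layer is a simple residual linear layer, and a tiny random `T5Config` stands in for a real checkpoint such as one loaded with `T5ForConditionalGeneration.from_pretrained(...)`.

```python
import torch
from torch import nn
from transformers import T5Config, T5ForConditionalGeneration


def add_post_encoder_layer(model):
    """Hypothetical helper: add a residual linear layer after the T5 encoder."""
    d_model = model.config.d_model
    layer = nn.Linear(d_model, d_model)
    # Zero-init so the new layer is a no-op until it is finetuned.
    nn.init.zeros_(layer.weight)
    nn.init.zeros_(layer.bias)
    # Register it as a submodule so it is trained and saved with the model.
    model.post_encoder = layer

    def hook(module, args, output):
        hidden = output[0]  # encoder last hidden state
        new_hidden = hidden + layer(hidden)
        if isinstance(output, tuple):  # encoder called with return_dict=False
            return (new_hidden,) + tuple(output[1:])
        output["last_hidden_state"] = new_hidden
        return output

    # The hook runs after every encoder forward pass and replaces its output.
    model.encoder.register_forward_hook(hook)
    return model


# Tiny random model as a stand-in for a pretrained checkpoint (assumption).
config = T5Config(vocab_size=100, d_model=32, d_kv=8, d_ff=64, num_layers=2, num_heads=4)
model = T5ForConditionalGeneration(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))
decoder_input_ids = torch.randint(0, config.vocab_size, (1, 4))
with torch.no_grad():
    logits_before = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids).logits

add_post_encoder_layer(model)
model.eval()
with torch.no_grad():
    logits_after = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids).logits

# With the zero-initialized layer the outputs are unchanged at the start.
outputs_match = torch.allclose(logits_before, logits_after)
```

The hook itself is not stored in a checkpoint, so in this sketch `add_post_encoder_layer` would need to be called again after reloading before the saved `post_encoder` weights are loaded. Subclassing the model and overriding `forward` is the more permanent alternative.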

Related topics

Topic	Category	Replies	Views	Activity
Customizing model architecture from predefined models	🤗Transformers	0	354	March 13, 2024
Resources for model design (number of layers, attention heads, etc)	Beginners	2	606	January 4, 2021
Which weights change when fine-tunning a pre-trained model?	Intermediate	3	765	June 11, 2024
Adding custom layer to GPT-2	Models	0	458	September 27, 2022
Fine Tuning GPT2 for machine translation	🤗Transformers	1	4771	May 2, 2021