The more papers I read, the more often I see authors, say, implement a custom attention mechanism and measure its performance inside a pretrained original model (T5, for instance). It is not clear to me how I can load weights into a model whose structure differs slightly from the original. Is there a way to load weights only into those layers that are identical between the original model and a custom one?
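A common approach (not an official API, just the usual PyTorch pattern) is to take the pretrained `state_dict`, keep only the entries whose name and shape match the custom model, and load with `strict=False` so the mismatched layers keep their fresh initialization. The toy `Original`/`Custom` modules below are stand-ins for, e.g., T5 and your modified variant; with `transformers` you would get the source dict via `T5ForConditionalGeneration.from_pretrained(...).state_dict()` instead.

```python
import torch
import torch.nn as nn

class Original(nn.Module):
    # Stand-in for the pretrained model (e.g. T5)
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)
        self.attn = nn.Linear(8, 8)      # standard attention projection

class Custom(nn.Module):
    # Same architecture except for a custom attention layer
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 8)   # identical to the original
        self.attn = nn.Linear(8, 4)      # different shape: cannot reuse weights

original, custom = Original(), Custom()
pretrained_sd = original.state_dict()
custom_sd = custom.state_dict()

# Keep only parameters whose name AND shape match the custom model
filtered = {k: v for k, v in pretrained_sd.items()
            if k in custom_sd and v.shape == custom_sd[k].shape}
custom_sd.update(filtered)

# strict=False tolerates keys that were not transferred
custom.load_state_dict(custom_sd, strict=False)

# encoder weights were copied; attn kept its random init
print(torch.equal(custom.encoder.weight, original.encoder.weight))  # True
```

If the custom layers merely changed shape (rather than name), `from_pretrained(..., ignore_mismatched_sizes=True)` in `transformers` achieves a similar effect for Hub checkpoints.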