The more papers I read, the more often I see authors implement, say, a custom attention mechanism and measure its performance inside a pretrained original model (T5, for instance). It is not clear to me how I can load weights into a model whose structure differs slightly from the original. Is there a way to load weights only into the layers that are identical between the original model and the custom one?
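To make the question concrete, here is a minimal PyTorch sketch of one common approach: keep only the checkpoint entries whose name and shape both match the custom model, then load them with `strict=False`. The `Original` and `Custom` classes and the `load_matching_weights` helper are hypothetical stand-ins made up for illustration, not T5 or any library API. The shape check is done by hand because `strict=False` on its own tolerates missing or unexpected keys but still raises on a size mismatch.

```python
import torch
from torch import nn


class Original(nn.Module):
    """Hypothetical stand-in for the pretrained model."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(8, 16)
        self.attn = nn.Linear(16, 16)
        self.out = nn.Linear(16, 4)


class Custom(nn.Module):
    """Same layout, but with a reshaped attention layer and one new layer."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(8, 16)
        self.attn = nn.Linear(16, 32)  # same name, different shape
        self.gate = nn.Linear(16, 16)  # does not exist in Original
        self.out = nn.Linear(16, 4)


def load_matching_weights(target, source_state):
    """Copy only the tensors whose name and shape match the target model."""
    target_state = target.state_dict()
    matched = {
        name: tensor
        for name, tensor in source_state.items()
        if name in target_state and tensor.shape == target_state[name].shape
    }
    # Checkpoint entries that could not be used in the target model.
    skipped = sorted(set(source_state) - set(matched))
    # strict=False lets the target keep its own init for everything else.
    result = target.load_state_dict(matched, strict=False)
    return sorted(matched), skipped, list(result.missing_keys)


original = Original()
custom = Custom()
loaded, skipped, missing = load_matching_weights(custom, original.state_dict())
print("loaded: ", loaded)
print("skipped:", skipped)
print("missing:", missing)
```

Anything listed under `missing` keeps its fresh initialization and would normally need fine-tuning. For Hugging Face models, the source state dict could presumably come from calling `state_dict()` on the pretrained model, and `from_pretrained` also accepts `ignore_mismatched_sizes=True` for the shape-mismatch case, though whether that fits depends on how the custom class is defined.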