Pruning a model embedding matrix for memory efficiency

IamAdiSri · April 17, 2021, 10:49am

I mentioned before that I got it to work, but it seems that while inference works perfectly, training doesn’t. When I try to train the model, the results are complete garbage and I can’t tell why. The training and validation losses are extremely small (around 2e-3), yet the ROUGE scores I’m calculating are abysmal (approx. 2e-4). On top of that, training the model for one epoch makes it completely forget how to translate between the two languages I pruned it for.
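
For readers following along, here is a minimal sketch of what pruning an embedding matrix involves, in plain Python on a toy matrix. It is not the code used in this thread, and the names `prune_embeddings`, `remap_ids`, `keep_ids` and `unk_id` are hypothetical. It only illustrates that once rows are dropped, every token id fed to the model (inputs and labels alike) has to be translated into the pruned index space, which may be one thing worth checking when inference works but training does not.

```python
def prune_embeddings(embeddings, keep_ids):
    """Keep only the rows listed in keep_ids.

    Returns the pruned matrix and a dict mapping old token ids to new ones.
    """
    kept = sorted(set(keep_ids))
    old_to_new = {old: new for new, old in enumerate(kept)}
    pruned = [embeddings[old] for old in kept]
    return pruned, old_to_new


def remap_ids(token_ids, old_to_new, unk_id):
    """Translate original token ids into the pruned index space.

    Ids that were pruned away fall back to the (remapped) unknown token.
    """
    fallback = old_to_new[unk_id]
    return [old_to_new.get(t, fallback) for t in token_ids]


# Toy vocabulary of 10 tokens with 2-dimensional embeddings (made-up values).
embeddings = [[float(i), float(i) + 0.5] for i in range(10)]

# Assumption: id 0 plays the role of the unknown token and is always kept.
keep_ids = [0, 3, 4, 7, 9]
pruned, old_to_new = prune_embeddings(embeddings, keep_ids)

# Both the encoder inputs and the training labels go through the same mapping.
inputs = remap_ids([3, 7, 5], old_to_new, unk_id=0)
labels = remap_ids([9, 4], old_to_new, unk_id=0)
```

In a real model the same old-to-new mapping would also have to be applied to any other weights indexed by vocabulary id, such as an output projection tied to the embeddings.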
