So, I had success in getting this to work! I was able to prune the embedding matrix and LM head to less than a tenth of their original sizes. Testing a couple of Hindi-to-English translation samples, I saw no difference between the stock model's output and the pruned model's.
Btw, I'm stuck at step 4, where I need to build a new tokenizer for the vocabulary (subword-token-to-index mapping) I've generated. I'm trying to use MBart50TokenizerFast for this, and am currently using dictionaries to map old indices to new ones. I'd really appreciate it if you could point me in the right direction.
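For context on the index-remapping part, here is a minimal, framework-free sketch of the core step: build the old→new mapping from the list of kept subwords, then use it to slice the embedding rows into the new order. The `old_vocab`, `embeddings`, and `kept_tokens` values below are toy stand-ins I made up for illustration, not the real mBART vocab.

```python
# Toy stand-ins for the real tokenizer vocab and embedding weight matrix.
old_vocab = {"<s>": 0, "</s>": 1, "▁hello": 2, "▁नमस्ते": 3, "▁world": 4}
embeddings = [[float(i)] * 4 for i in range(len(old_vocab))]  # 5 x 4 toy matrix

# Subwords kept after pruning (hypothetical selection).
kept_tokens = ["<s>", "</s>", "▁hello", "▁world"]

# New vocab: same subwords, contiguous new indices starting from 0.
new_vocab = {tok: i for i, tok in enumerate(kept_tokens)}

# Old index -> new index, usable to slice both the embedding matrix and lm_head.
old_to_new = {old_vocab[tok]: new_vocab[tok] for tok in kept_tokens}

# Pruned embedding matrix: rows reordered to follow the *new* indices.
pruned = [embeddings[old] for old, _ in sorted(old_to_new.items(), key=lambda kv: kv[1])]

print(new_vocab)    # {'<s>': 0, '</s>': 1, '▁hello': 2, '▁world': 3}
print(len(pruned))  # 4
```

For the tokenizer itself, one route that may be worth exploring (I haven't verified it end to end for MBart50TokenizerFast specifically) is to edit the vocab inside the fast tokenizer's serialized `tokenizer.json` (accessible via `tokenizer.backend_tokenizer`) so it emits the new indices directly, rather than remapping through dictionaries at inference time.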