Making a model slightly bigger

kovkev · April 27, 2024, 11:41pm

Hi all!

Let’s say I am working on a transformer model with weight matrices Q, K, and V (and Woutput). Let’s say the embedding_dimension is 100 and the number of features is 100, so each of Q, K, and V would be 100 x 100.

Now, let’s say that this model comes from huggingface and the weights are already trained by someone.

Let’s say I want to modify the model slightly so that the embeddings are a bit larger (110 dimensions) and the features are a bit larger (110 features). Now each of Q, K, and V has size 110 x 110.

How do I initialize the new weights? Should they be random with a normal distribution?
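To make the question concrete, here is a rough sketch of the naive approach I have in mind: keep the trained 100 x 100 block and fill only the new rows and columns with small normal noise. The helper name `grow_square_weight`, the standard deviation of 0.02, and the random stand-in for the trained Q are all my own assumptions, not anything from a library, and I don't know whether this is a good initialization.

```python
import numpy as np

def grow_square_weight(w_old, new_dim, std=0.02, seed=0):
    """Hypothetical helper: embed w_old in the top-left corner of a
    new_dim x new_dim matrix; the extra entries are drawn from N(0, std^2)."""
    old_dim = w_old.shape[0]
    rng = np.random.default_rng(seed)
    # Start from random noise everywhere, then overwrite the trained block.
    w_new = rng.normal(0.0, std, size=(new_dim, new_dim)).astype(w_old.dtype)
    w_new[:old_dim, :old_dim] = w_old
    return w_new

# Stand-in for a trained 100 x 100 Q matrix (random here, for illustration only).
w_q = np.random.default_rng(1).normal(size=(100, 100)).astype(np.float32)
w_q_big = grow_square_weight(w_q, 110)
print(w_q_big.shape)
```

The same call would apply to K, V, and Woutput, and the embedding table would need the matching 10 extra columns.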

What should I do if instead, I wanted to reduce the number of dimensions, making Q, K, V have size 90 x 90?
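For the shrinking case, the simplest thing I can think of is to slice out the top-left 90 x 90 block, as in the sketch below. Keeping the first 90 dimensions is an arbitrary choice on my part, and `shrink_square_weight` is a made-up helper name; I assume there are smarter ways to pick what to drop.

```python
import numpy as np

def shrink_square_weight(w_old, new_dim):
    """Hypothetical helper: keep only the first new_dim rows and columns."""
    # .copy() so the result does not share memory with the original weights.
    return w_old[:new_dim, :new_dim].copy()

# Stand-in for a trained 100 x 100 Q matrix (random here, for illustration only).
w_q = np.random.default_rng(1).normal(size=(100, 100)).astype(np.float32)
w_q_small = shrink_square_weight(w_q, 90)
print(w_q_small.shape)
```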

Is there a name for the study of modifying a trained model’s shapes like this?

Thank you,

kovkev
