Hi all!
Let’s say I am working on a transformer model with the usual attention matrices Q, K, and V (plus Woutput). Suppose the embedding_dimension is 100 and the number of features is 100, so each of Q, K, and V is 100 x 100.
Now suppose this model comes from Hugging Face and the weights have already been trained by someone else.
I want to modify the model slightly so that the embeddings are a bit larger (110 dimensions) and the number of features is a bit larger (110 features). Each of Q, K, and V would then be 110 x 110.
How do I initialize the new weights? Should they be drawn randomly from a normal distribution?
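To make the question concrete, here is a minimal sketch of the naive approach I have in mind: keep the trained 100 x 100 block as-is and fill only the new rows and columns with small normal noise. It uses NumPy rather than the real model, and the function name `grow_square_weight`, the `std=0.02` value, and the random stand-in for the trained matrix are all my own assumptions, not anything taken from a Hugging Face API. I don't know whether this is actually a good initialization.

```python
import numpy as np

def grow_square_weight(old, new_dim, std=0.02, seed=0):
    """Place a trained (d x d) matrix in the top-left corner of a larger
    (new_dim x new_dim) matrix; new entries are drawn from N(0, std^2).

    Hypothetical helper for illustration only; std=0.02 is an assumed value.
    """
    old_dim = old.shape[0]
    assert old.shape == (old_dim, old_dim), "expects a square matrix"
    assert new_dim >= old_dim, "this helper only grows the matrix"
    rng = np.random.default_rng(seed)
    # Start from noise everywhere, then overwrite the corner with trained values.
    new = rng.normal(loc=0.0, scale=std, size=(new_dim, new_dim))
    new[:old_dim, :old_dim] = old
    return new

# Stand-in for a trained 100 x 100 Q matrix (random here, just for illustration).
Q_old = np.random.default_rng(1).normal(size=(100, 100))
Q_new = grow_square_weight(Q_old, 110)
print(Q_new.shape)  # (110, 110)
```

The same would presumably have to be done consistently for K, V, Woutput, and every other layer that touches the embedding dimension.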
What should I do if I instead wanted to reduce the number of dimensions, so that Q, K, and V are each 90 x 90?
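For the shrinking case, the only thing I can think of is plain truncation, i.e. keeping the top-left 90 x 90 block and throwing the rest away. A minimal NumPy sketch of that idea is below; `shrink_square_weight` is a hypothetical name of mine, and I am not claiming that dropping the last rows and columns preserves the model's behavior.

```python
import numpy as np

def shrink_square_weight(old, new_dim):
    """Keep only the top-left (new_dim x new_dim) block of a square matrix.

    Hypothetical helper for illustration only; the dropped rows and columns
    are simply discarded.
    """
    old_dim = old.shape[0]
    assert old.shape == (old_dim, old_dim), "expects a square matrix"
    assert new_dim <= old_dim, "this helper only shrinks the matrix"
    # Copy so the result does not share memory with the original weights.
    return old[:new_dim, :new_dim].copy()

# Stand-in for a trained 100 x 100 K matrix (random here, just for illustration).
K_old = np.random.default_rng(2).normal(size=(100, 100))
K_small = shrink_square_weight(K_old, 90)
print(K_small.shape)  # (90, 90)
```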
Is there a name for the study of modifying a trained model's shapes like this?
Thank you,
kovkev