The question is just as stated: I want to train a small language model, but I can't work out mathematically whether using a random embedding model would decrease performance.
I also wonder whether the embedding sizes are set in the code before being fed to the model; otherwise, I'd expect a matrix size error.
No, you cannot use any random embedding model without matching the embedding size to what your language model expects. The embedding dimension must match exactly, or you’ll get a shape error.
If the dimensions match, training with a random embedding is possible, but performance will be much worse; embeddings must be learned or pretrained for good results.
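To make this concrete, here is a minimal sketch using the GPT-2 classes from `transformers` (the sizes and the class choices are illustrative assumptions on my part, not something from your setup). It shows that the embedding dimension comes from the model config, that a random matrix of the right shape is accepted and simply trained from scratch, and that a wrong dimension fails immediately:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# The embedding dimension is fixed by the model config (here n_embd=256).
config = GPT2Config(vocab_size=8000, n_embd=256, n_layer=4, n_head=4)
model = GPT2LMHeadModel(config)

# A randomly initialized embedding matrix is fine as long as its shape
# matches (vocab_size, n_embd); it is simply trained from scratch.
random_emb = torch.randn(config.vocab_size, config.n_embd) * 0.02
with torch.no_grad():
    model.get_input_embeddings().weight.copy_(random_emb)

# A matrix with the wrong embedding dimension raises a shape error.
wrong_emb = torch.randn(config.vocab_size, 512)
try:
    with torch.no_grad():
        model.get_input_embeddings().weight.copy_(wrong_emb)
except RuntimeError as err:
    print("shape mismatch:", err)
```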
Summary:
- Embedding size must match the model config.
- Random embeddings will decrease performance.
- Always use learned or pretrained embeddings for best results (see the sketch below).
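As a sketch of the "pretrained" route, you can copy an existing model's embedding matrix into your small model. This assumes the small model shares the pretrained model's tokenizer and hidden size (here the `gpt2` checkpoint, which is just an example I'm assuming), since that is what makes the shapes line up:

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Assumed setup: a small model that reuses GPT-2's tokenizer and
# hidden size (n_embd=768), so the pretrained embeddings fit directly.
pretrained = GPT2LMHeadModel.from_pretrained("gpt2")
config = GPT2Config(
    vocab_size=pretrained.config.vocab_size,
    n_embd=pretrained.config.n_embd,
    n_layer=4,
    n_head=4,
)
small = GPT2LMHeadModel(config)

# Reuse the pretrained token embeddings instead of random initialization.
with torch.no_grad():
    small.get_input_embeddings().weight.copy_(
        pretrained.get_input_embeddings().weight
    )
```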
Solution provided by Triskel Data Deterministic AI.