Reduce the number of features of BERT embeddings

Hi everyone,
I am using an XXL BERT for my project.
I would like to test the network with an embedding dimension smaller than 768, for example 300.
I think I could try to perform a PCA on the embeddings.
Is there an implemented solution which does this?

Many thanks in advance

Hi, actually you could use a Dense layer (from sentence-transformers, here ) and go from 768 to 300 with a bit of fine-tuning.
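The Dense module in sentence-transformers is essentially a linear projection plus an activation. Here is a minimal plain-PyTorch sketch of the same idea; `ProjectionHead` and the dimensions are made up for illustration, not part of any library API:

```python
import torch
import torch.nn as nn

# Sketch: project 768-d BERT embeddings down to 300-d with a
# learnable linear layer (what a Dense layer does conceptually).
class ProjectionHead(nn.Module):  # hypothetical name, for illustration
    def __init__(self, in_dim: int = 768, out_dim: int = 300):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.activation = nn.Tanh()

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        return self.activation(self.linear(embeddings))

head = ProjectionHead()
batch = torch.randn(4, 768)  # stand-in for a batch of BERT embeddings
reduced = head(batch)
print(reduced.shape)         # torch.Size([4, 300])
```

Because the layer is trainable, you would fine-tune it on your downstream task so the 300 dimensions keep the information you care about.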

If you still want to use PCA, huggingface (as far as I know) doesn’t have its own implementation, so I advise you to pick the best Python library you know and use that implementation.

For example, the scikit-learn library has PCA as well as other useful tools. ( this is the first example I happened to see of scikit-learn PCA)
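A minimal sketch of that approach with scikit-learn, using random data as a stand-in for your actual BERT embeddings:

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for a matrix of BERT embeddings: 1000 sentences x 768 dims.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 768))

# Fit PCA on the embeddings and project them down to 300 components.
pca = PCA(n_components=300)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)  # (1000, 300)
# How much of the original variance the 300 components retain:
print(pca.explained_variance_ratio_.sum())
```

Note that PCA needs at least as many samples as components, and you should fit it once on a representative set of embeddings and then reuse `pca.transform()` on new ones.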


Is using a Dense layer with a lower feature space a legitimate way of doing dimensionality reduction?
Also, are PCA and t-SNE good options for dimensionality reduction of embeddings computed from transformer-based models? I see them used a lot with Word2Vec/TF-IDF embeddings.
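For context on the t-SNE part of the question: in scikit-learn, t-SNE is usually applied to map embeddings to 2-D or 3-D for visualisation rather than as a general feature reducer (it has no `transform()` for unseen data). A minimal sketch, again with random stand-in embeddings:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for 50 BERT embeddings of dimension 768.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((50, 768))

# Map to 2-D for plotting; perplexity must be smaller than the
# number of samples.
projected = TSNE(n_components=2, perplexity=10, init="pca",
                 random_state=0).fit_transform(embeddings)
print(projected.shape)  # (50, 2)
```

So for actually feeding a smaller embedding into a downstream model, PCA or a fine-tuned Dense layer is the more common choice; t-SNE is mostly for inspecting the embedding space visually.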