Handling Extreme Class Imbalance for Multi-Class Classification

An embedding model on its own may not be enough here. You will probably have to split the classification into several stages or use a larger model…
https://datascience.stackexchange.com/questions/71558/text-classification-into-thousands-of-classes
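
To make the "several stages" idea concrete, here is a minimal sketch of a two-stage classifier: stage 1 picks a coarse group, stage 2 picks the final class only among the classes in that group. Everything in it is an assumption for illustration: the `TwoStageClassifier` name, the `class_to_group` mapping, the nearest-centroid rule, and the toy 2-D vectors standing in for real embeddings. It is not taken from either linked source. In practice each stage could be any classifier trained on your embeddings.

```python
from collections import defaultdict
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit_centroids(vectors, labels):
    """Return {label: centroid} for the labelled vectors."""
    buckets = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        buckets[lab].append(vec)
    return {lab: centroid(vecs) for lab, vecs in buckets.items()}


def nearest(centroids, vec):
    """Label whose centroid is closest to vec (Euclidean distance)."""
    return min(centroids, key=lambda lab: dist(centroids[lab], vec))


class TwoStageClassifier:
    """Hypothetical two-stage model: coarse group first, then class within it."""

    def __init__(self, class_to_group):
        # Assumed to be given, e.g. from a taxonomy or from clustering classes.
        self.class_to_group = class_to_group

    def fit(self, vectors, labels):
        groups = [self.class_to_group[lab] for lab in labels]
        # Stage 1: one centroid per coarse group.
        self.group_centroids = fit_centroids(vectors, groups)
        # Stage 2: per group, one centroid per class belonging to that group.
        per_group = defaultdict(lambda: ([], []))
        for vec, lab, grp in zip(vectors, labels, groups):
            per_group[grp][0].append(vec)
            per_group[grp][1].append(lab)
        self.class_centroids = {
            grp: fit_centroids(vecs, labs) for grp, (vecs, labs) in per_group.items()
        }
        return self

    def predict(self, vec):
        grp = nearest(self.group_centroids, vec)        # coarse decision
        return nearest(self.class_centroids[grp], vec)  # fine decision


# Toy data: 2-D points standing in for embedding vectors (made up).
class_to_group = {"cat": "animal", "dog": "animal", "car": "vehicle", "bus": "vehicle"}
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 0.0], [1.1, 0.2],
     [10.0, 10.0], [10.2, 9.9], [11.0, 10.0], [11.1, 10.1]]
y = ["cat", "cat", "dog", "dog", "car", "car", "bus", "bus"]

clf = TwoStageClassifier(class_to_group).fit(X, y)
print(clf.predict([0.9, 0.1]))
print(clf.predict([10.1, 10.0]))
```

With thousands of classes, the second stage only has to separate the classes inside one group, which is a much smaller problem. The per-class centroid used here weights every class equally regardless of how many examples it has, which is one reason it is a reasonable baseline under heavy imbalance. A mistake in stage 1 cannot be corrected in stage 2, so the coarse grouping needs to be easy to predict.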

https://www.mdpi.com/2079-9292/13/7/1199