Does a high number of output labels affect the performance of BERT, and how do you handle the class imbalance issue while doing multi text classification?

He seems to have found the answer himself. It doesn’t seem easy to improve performance…
https://datascience.stackexchange.com/questions/120215/does-high-number-of-output-labels-affect-the-performance-of-bert-and-how-to-hand
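
For the class imbalance part, one common starting point (not necessarily what the linked answer recommends) is to weight the loss by inverse class frequency. Here is a minimal sketch of the usual "balanced" heuristic, where each weight is n_samples / (n_classes * class_count). The function name `balanced_class_weights` and the toy labels are made up for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight}, with weight = n_samples / (n_classes * count).

    Rare classes get larger weights, frequent classes get smaller ones.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical toy labels: class 0 is common, class 2 is rare.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
weights = balanced_class_weights(labels)

for label in sorted(weights):
    print(label, round(weights[label], 3))
```

If you train with PyTorch, weights like these can be turned into a tensor ordered by class index and passed as the `weight` argument of `torch.nn.CrossEntropyLoss`. Whether that actually helps with a large label set is something you would have to check on your own data.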