Hosted inference API - Limit output, text classification

Hey there,
I am currently building a model for text classification with around 30,000 classes.

Currently, the hosted Inference API breaks my model card because it tries to load all 30,000 labels (even though most of the scores are essentially 0). Is there a way to limit the API output to the 5 most relevant classes, like the `top_k` argument in the transformers text-classification pipeline?

I already checked the API documentation, but there seems to be no option to control this.
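To illustrate the behavior I'm after, here is a small sketch of the top-k truncation I'd like the API to do (label names and scores are made up; this is plain Python, not an actual API call):

```python
import heapq

def top_k_labels(scores, k=5):
    """Keep only the k highest-scoring labels from a full
    {label: score} prediction, similar to the pipeline's top_k."""
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

# Hypothetical prediction over 30,000 classes, mostly near-zero scores
scores = {f"LABEL_{i}": 1.0 / (i + 1) for i in range(30_000)}
top5 = top_k_labels(scores, k=5)
print(top5)  # only 5 entries instead of 30,000
```

Something like this on the API side would keep the model card widget responsive instead of rendering all 30,000 labels.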

Thanks for your help!

Greetings, Philipp

P.S. I am talking about this model