Running inference via the API with an integrated language model (LM)

I followed @patrickvonplaten's article on building an n-gram language model from a dataset, and I successfully ran inference with it locally. However, inference on my local CPU device was too slow, so I pushed my model to the Hub as my-model and decided to run inference via the API, like this:

```python
import json

import requests

API_TOKEN = "hf_xxx"  # placeholder: your Hugging Face access token
API_URL = "https://api-inference.huggingface.co/models/ridhoalattqas/xlrs-best-lm"
headers = {"Authorization": f"Bearer {API_TOKEN}"}

def query(audio_bytes):
    response = requests.post(API_URL, headers=headers, data=audio_bytes)
    return json.loads(response.content.decode("utf-8"))
```
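For completeness, here is roughly how I call it end-to-end (the file path and retry count are placeholders). As far as I understand, while the model is still loading the API answers with an `"error"` JSON that includes an `"estimated_time"`, so I wait and retry instead of failing immediately:

```python
import json
import time

import requests

API_TOKEN = "hf_xxx"  # placeholder: your Hugging Face access token
API_URL = "https://api-inference.huggingface.co/models/ridhoalattqas/xlrs-best-lm"
headers = {"Authorization": f"Bearer {API_TOKEN}"}

def parse_response(content: bytes) -> dict:
    # The API answers with JSON: {"text": "..."} on success, {"error": "..."} otherwise
    return json.loads(content.decode("utf-8"))

def transcribe(path: str, retries: int = 3) -> dict:
    with open(path, "rb") as f:
        audio_bytes = f.read()
    result = {}
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, data=audio_bytes)
        result = parse_response(response.content)
        # While the model is cold, the response carries "estimated_time";
        # sleep that long and try again
        if "error" in result and "estimated_time" in result:
            time.sleep(result["estimated_time"])
            continue
        return result
    return result
```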

But I am facing another problem: the language model that I pushed to my-model does not seem to be linked (used) when I run inference through the API.
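One thing I am checking (a sketch, not a definitive fix): for the LM-boosted decoder to be picked up, the repo needs the files from the n-gram article — `alphabet.json` at the root plus the n-gram files under a `language_model/` folder. The Hub's model-info endpoint lists a repo's files, so I can verify they were actually pushed (`has_lm_files` is just a helper name I made up):

```python
import requests

def has_lm_files(filenames):
    # Wav2Vec2ProcessorWithLM expects alphabet.json at the repo root
    # and the n-gram files under language_model/ (layout from the n-gram article)
    has_lm_dir = any(name.startswith("language_model/") for name in filenames)
    return has_lm_dir and "alphabet.json" in filenames

def repo_files(repo_id):
    # The Hub lists a repo's files under "siblings" in its model-info endpoint
    info = requests.get(f"https://huggingface.co/api/models/{repo_id}").json()
    return [sibling["rfilename"] for sibling in info["siblings"]]
```

Usage: `has_lm_files(repo_files("ridhoalattqas/xlrs-best-lm"))` should return `True` if the LM files made it to the Hub.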

Does anyone have suggestions for making this API inference work with the LM? I plan to subscribe to a paid tier for higher rate limits if it works.

Thank you