A very basic Hugging Face LLM API access

Hi! :wave:

I am trying to understand how LLMs work.

I created a very basic Hugging Face LLM API access page with the simple HTML code below.

It seems to be working, but the model is returning an error JSON like this:

{
  "type": "status",
  "endpoint": "/model_chat",
  "fn_index": 0,
  "time": "2024-07-14T11:03:31.191Z",
  "queue": true,
  "message": null,
  "stage": "error",
  "success": false
}

The simple HTML code:

<script type="module">
    import { Client } from "https://cdn.jsdelivr.net/npm/@gradio/client@latest/dist/index.js";

    async function runGradioClient() {
        try {
            // Connect to the public Qwen2-72B-Instruct Space
            const client = await Client.connect("Qwen/Qwen2-72B-Instruct");
            // Call the /model_chat endpoint exposed by the Space
            const result = await client.predict("/model_chat", {
                query: "Hello!!",
                history: [["Hello!", null]],
                system: "Hi there!",
            });
            console.log(result.data);
        } catch (err) {
            // Log the full error so the failure reason from the Space is visible
            console.error(err);
        }
    }

    runGradioClient();
</script>
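In case the Space restricts anonymous requests, I also tried passing a Hugging Face access token, since `Client.connect` accepts an options object with an `hf_token` field. The token string here is only a placeholder, and I am not sure whether the Space actually requires one:

<script type="module">
    import { Client } from "https://cdn.jsdelivr.net/npm/@gradio/client@latest/dist/index.js";

    // "hf_..." is a placeholder: replace with a real Hugging Face access token
    const client = await Client.connect("Qwen/Qwen2-72B-Instruct", {
        hf_token: "hf_...",
    });
    const result = await client.predict("/model_chat", {
        query: "Hello!!",
        history: [["Hello!", null]],
        system: "Hi there!",
    });
    console.log(result.data);
</script>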

I would be very grateful for any clue. Thanks!