For people finding this thread at a later time. This is an example of how a request (for Llama 3 8B) might look like:
{
"inputs": "<|begin_of_text|><|start_header_id|>user<|end_header_id|>How are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>",
"parameters": {
"max_new_tokens": 1000
}
}