Accelerated Inference API not taking parameters?

Hello! I’m trying to generate text using a fine-tuned T5, and I’m running into some truncation issues.

From the docs, I can see that sending the max_new_tokens parameter should let me get longer answers, but the API always responds with the same length no matter what I do. I also tried varying the parameter names: the API does reject unknown parameters with a 400, so my request seems to be parsed, yet the response is still truncated when I send what looks like a correct request.

If I run the model locally in transformers, I get longer responses.
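
For comparison, this is roughly what I run locally (a minimal sketch; query is a placeholder for my real input, and the model id is the one from the endpoint URL below):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_id = "squidgy/t5-ml-finetuned-gec"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    query = "some input sentence"  # placeholder for my actual input
    inputs = tokenizer(query, return_tensors="pt")

    # Locally, max_new_tokens is respected and I get longer outputs
    output_ids = model.generate(**inputs, max_new_tokens=196)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))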

Here is what I'm sending (axios, from Node):

    import axios from "axios";

    const inference_endpoint = "https://api-inference.huggingface.co/models/squidgy/t5-ml-finetuned-gec";

    // query and await_for_model are defined earlier in my code
    const response = await axios({
      url: inference_endpoint,
      method: "post",
      headers: {
        "Authorization": "Bearer " + process.env.HF_TOKEN,
        "Content-Type": "application/json"
      },
      data: {
        inputs: query,
        parameters: {
          max_new_tokens: 196
        },
        options: {
          wait_for_model: await_for_model
        }
      }
    });

    console.log(response.data);

Am I doing anything wrong? Thanks in advance for your help!!