Model output is cut off

smileeok · November 8, 2022, 5:16pm

I am using the Hosted Inference API to test my text-to-text model, but the model output is cut off.

My local model, using the translation pipeline, returns a full sentence with 213 characters, but the model on HuggingFace Hub returns only the first 47 characters.

I also tried the Inference Endpoint, same thing.

Any idea why?

smileeok · November 8, 2022, 10:09pm

I figured out why. Locally I was using ‘translation’ as the task, but the endpoint uses ‘text-to-text generation’.

philschmid · November 9, 2022, 9:12am

Are you using the widgets or sending requests? The widgets run the model with the default generate arguments, which set a „small“ maximum output length (max_length / max_new_tokens).

You can customize those arguments during inference by adding parameters to your request. For inference endpoints you can find the documentation here: Supported Transformers Tasks

Philipp
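
As a rough sketch of the suggestion above, a request can carry a "parameters" object next to "inputs" to raise the generation limit. The URL pattern, the token, and the `build_payload` and `query` helper names are placeholders chosen for illustration, not taken from this thread. Whether `max_new_tokens` or `max_length` is the right key may depend on the task and the deployment, so check the linked documentation for your endpoint.

```python
import json
import urllib.request

# Placeholder values (assumptions, not from the thread): substitute your own
# model or endpoint URL and access token.
API_URL = "https://api-inference.huggingface.co/models/<your-model-id>"
API_TOKEN = "hf_xxx"


def build_payload(text, max_new_tokens=256):
    """Build a request body that overrides the default generation length."""
    return {
        "inputs": text,
        # Extra generation arguments go under "parameters";
        # max_new_tokens caps how many tokens the model may generate.
        "parameters": {"max_new_tokens": max_new_tokens},
    }


def query(text, max_new_tokens=256):
    """POST the payload to API_URL and return the decoded JSON response."""
    data = json.dumps(build_payload(text, max_new_tokens)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=data,
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query("some input text", max_new_tokens=300)` would send the request once the placeholders are filled in. The widget does not let you set these arguments, which is why the same model can look truncated there but complete when queried this way.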

smileeok · November 9, 2022, 12:48pm

Yes @philschmid. That fixed it! Thanks!

I was reading this doc earlier: Detailed parameters, and couldn’t find the parameter there. It would be great if the parameters could be consolidated into a single doc.

Bobbybob24 · September 25, 2023, 4:49pm

I am having the same problem with my outputs being cut off, even if I specify a long minimum output length. How exactly did you change the task locally?
