HF Inference Endpoints: Difference between Max Input Length per Query and Max Token Length per Query

Can anyone explain the difference between these two settings?

Found it – these are explained in more detail in the TGI documentation here
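For anyone landing here later: in TGI terms these appear to map to max_input_length (the maximum number of tokens allowed in the prompt of a single request) and max_total_tokens (the maximum of prompt tokens plus generated tokens combined). Below is a minimal sketch assuming those semantics; the limit values and the gpt2 tokenizer are placeholders, not the actual defaults of any endpoint.

```python
# Sketch of how the two limits interact (assumed TGI-style semantics):
# max_input_length caps the prompt, max_total_tokens caps prompt + generation.
from transformers import AutoTokenizer

MAX_INPUT_LENGTH = 1024   # "Max Input Length per Query" (assumed mapping)
MAX_TOTAL_TOKENS = 1512   # "Max Token Length per Query" (assumed mapping)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

prompt = "Explain the difference between the two token limits."
prompt_tokens = len(tokenizer(prompt)["input_ids"])

if prompt_tokens > MAX_INPUT_LENGTH:
    raise ValueError(
        f"Prompt has {prompt_tokens} tokens, over the input limit of {MAX_INPUT_LENGTH}."
    )

# Whatever remains under the total budget is available for generation.
max_new_tokens = MAX_TOTAL_TOKENS - prompt_tokens
print(f"Prompt uses {prompt_tokens} tokens; up to {max_new_tokens} can be generated.")
```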
