How does the GPT-J inference API work?

Try the `max_length` parameter instead of `max_new_tokens`. Both control the length of the generated text (`max_length` counts the prompt plus the generated tokens, while `max_new_tokens` counts only the newly generated tokens), and the documentation advises against setting both at once. That worked for me.
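A minimal sketch of how such a request payload could be assembled. The endpoint URL and the `inputs`/`parameters` payload shape follow the Hugging Face Inference API; `build_payload` is a hypothetical helper added here for illustration:

```python
import json

# Hosted Inference API endpoint for GPT-J (illustrative).
API_URL = "https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B"

def build_payload(prompt, max_length=None, max_new_tokens=None):
    """Build the JSON payload for a text-generation request.

    Set either max_length (prompt + generated tokens) or
    max_new_tokens (generated tokens only) -- not both at once.
    """
    if max_length is not None and max_new_tokens is not None:
        raise ValueError("Set max_length or max_new_tokens, not both")
    parameters = {}
    if max_length is not None:
        parameters["max_length"] = max_length
    if max_new_tokens is not None:
        parameters["max_new_tokens"] = max_new_tokens
    return {"inputs": prompt, "parameters": parameters}

payload = build_payload("Once upon a time", max_length=50)
print(json.dumps(payload))
```

The payload can then be sent with something like `requests.post(API_URL, headers={"Authorization": f"Bearer {token}"}, json=payload)`, assuming a valid API token.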
