Accelerated Inference for GPT-J using JavaScript

Hi. This is my first post on the forum. I am interested in transformers, and specifically in EleutherAI's GPT-J.

I am trying to use Accelerated Inference. I have created an API key, but I suspect I need to somehow associate it with the EleutherAI GPT-J model. Either that, or something else in my code doesn't work with this particular model, or possibly the model isn't working correctly on the server side.

I created a project, and doing so got an API key via the website, but I think it's not associated with the right model. I am only signed up for the free tier right now. If the model works out, I'd be interested in a higher tier.

Right now, using JavaScript, I am able to query the model, but I am only returned what seems to be a single token or two. The response is fast, which is encouraging, but I cannot get more than those one or two tokens. I am using the API key as the Bearer token in the POST request. My code is on the messy side, but there is a link below to the file where I use fetch. My fetch calls are similar to the JavaScript example on the Hugging Face model page; a simplified sketch of the call is included below.
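For reference, here is roughly what my fetch call looks like, simplified from the linked file. The endpoint and payload shape are modeled on the JavaScript example from the model page; the model id `EleutherAI/gpt-j-6B` and the commented-out `max_new_tokens` parameter are just my guesses about what might control the output length, not something I've confirmed.

```js
// Roughly what my fetch call looks like (simplified from the linked file).
// EleutherAI/gpt-j-6B and max_new_tokens are my assumptions, not confirmed.
const API_URL = "https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B";
const API_KEY = "hf_xxx"; // my Accelerated Inference API key, sent as the Bearer token

async function query(prompt) {
  const response = await fetch(API_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: prompt,
      // Is something like this needed to get more than a couple of tokens back?
      // parameters: { max_new_tokens: 50 },
    }),
  });
  return response.json();
}

query("Once upon a time").then((result) => console.log(JSON.stringify(result)));
```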

Thank you for your time.

TL;DR: I'm only getting a token or two with each request. Does anyone have JavaScript that works in this situation?