I am writing an app that uses GPT-J-sized models. I have an API key from Hugging Face, but when I call the API I get only a single word back as a response.
I believe the model hosted on Hugging Face only allows a limited total number of output tokens. My app already uses many tokens in the query itself. How can I get more tokens in the response? Is there an environment variable I can set to allow a longer query?
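For reference, here is a minimal sketch of the kind of request involved. It is not my exact code. It assumes the hosted Inference API takes a JSON body with an `inputs` string and an optional `parameters` object, and that the output length is set with a `max_new_tokens` request parameter rather than an environment variable. The endpoint URL, the model id `EleutherAI/gpt-j-6B`, and the helper names `build_payload` and `query` are illustrative assumptions.

```python
import json
import urllib.request

# Assumed endpoint and model id; substitute the model actually in use.
API_URL = "https://api-inference.huggingface.co/models/EleutherAI/gpt-j-6B"


def build_payload(prompt, max_new_tokens=100):
    """Build the JSON body for a text-generation request.

    max_new_tokens is assumed to cap the number of generated tokens;
    the hosted service may enforce its own upper limit on this value.
    """
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }


def query(prompt, api_key, max_new_tokens=100):
    """Send the request and return the decoded JSON response."""
    body = json.dumps(build_payload(prompt, max_new_tokens)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

If this assumption holds, calling `query("Once upon a time", api_key, max_new_tokens=200)` should request a longer completion, subject to whatever limit the hosted model imposes.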