Inference API works for flan-t5-xxl, but not for many other models I have tried with Jupyter/VSCode

Hi, I am trying to access the Inference API from VSCode/Jupyter. It works with flan-t5-xxl, but not with other models such as blenderbot or flan-ul2.

Not sure what is happening. I do have a Pro subscription, which didn't make a difference.

Any help would be awesome. I am using LangChain.

************ Works ******************
flan_t5 = HuggingFaceHub(
    repo_id="google/flan-t5-xxl",
    model_kwargs={"temperature": 0.2},
)

********* Does not work ***********
flan_ul2 = HuggingFaceHub(
    repo_id="google/flan-ul2",
    model_kwargs={"temperature": 0.1,
                  "max_new_tokens": 256},
)

conversation = ConversationChain(
    llm=flan_ul2,
    verbose=True,
    memory=memory,
)

---- this just hangs in what looks like an endless loop and eventually times out
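To narrow down whether the problem is in LangChain or in the Inference API itself, one option is to call the API directly and inspect the raw response. The sketch below builds such a request with only the standard library; the token value `hf_xxx` is a placeholder, and the endpoint and payload shape are my assumptions based on how the hosted Inference API is commonly called, not anything confirmed above.

```python
import json
import urllib.request

API_BASE = "https://api-inference.huggingface.co/models/"

def build_request(repo_id: str, token: str, prompt: str,
                  temperature: float = 0.1, max_new_tokens: int = 256):
    """Build a raw Inference API request for a given model, bypassing LangChain."""
    payload = {
        "inputs": prompt,
        "parameters": {"temperature": temperature,
                       "max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        API_BASE + repo_id,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )

req = build_request("google/flan-ul2", "hf_xxx", "Translate to German: hello")
print(req.full_url)  # endpoint the call would hit
# Uncomment with a real token; a non-200 status or an "estimated_time"
# body would point at the model/API rather than at LangChain:
# with urllib.request.urlopen(req, timeout=60) as resp:
#     print(resp.status, resp.read().decode())
```

If the direct call also hangs or returns an error for flan-ul2 while succeeding for flan-t5-xxl, the issue is on the API/model side rather than in the ConversationChain setup.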