Inference API works for flan-t5-xxl, but not for many other models I have tried with Jupyter/VSCode

Hi, I am trying to access the Inference API from VSCode/Jupyter. It works with flan-t5-xxl, but not with other models such as blenderbot or flan-ul2.

Not sure what is happening. I do have a Pro subscription, which didn't make a difference.

Any help would be awesome. I am using LangChain.

************ Works ******************
flan_t5 = HuggingFaceHub(
    repo_id="google/flan-t5-xxl",
    model_kwargs={"temperature": 0.2},
)

********* Does not work ***********
flan_ul2 = HuggingFaceHub(
    repo_id="google/flan-ul2",
    model_kwargs={"temperature": 0.1,
                  "max_new_tokens": 256},
)

conversation = ConversationChain(
    llm=flan_ul2,
    verbose=True,
    memory=memory,
)

---- this just hangs in what looks like an endless loop and eventually times out
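To narrow down whether the problem is in LangChain or in the Inference API itself, one option is to call the API directly and inspect the raw response. The sketch below builds such a request with only the standard library; the token value `hf_xxx` is a placeholder, and the endpoint and payload shape are my assumptions based on how the hosted Inference API is commonly called, not anything confirmed above.

```python
import json
import urllib.request

API_BASE = "https://api-inference.huggingface.co/models/"

def build_request(repo_id: str, token: str, prompt: str,
                  temperature: float = 0.1, max_new_tokens: int = 256):
    """Build a raw Inference API request for a given model, bypassing LangChain."""
    payload = {
        "inputs": prompt,
        "parameters": {"temperature": temperature,
                       "max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        API_BASE + repo_id,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )

req = build_request("google/flan-ul2", "hf_xxx", "Translate to German: hello")
print(req.full_url)  # endpoint the call would hit
# Uncomment with a real token; a non-200 status or an "estimated_time"
# body would point at the model/API rather than at LangChain:
# with urllib.request.urlopen(req, timeout=60) as resp:
#     print(resp.status, resp.read().decode())
```

If the direct call also hangs or returns an error for flan-ul2 while succeeding for flan-t5-xxl, the issue is on the API/model side rather than in the ConversationChain setup.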