504 Timeout when using flan-ul2

I consistently get this error when I try to use flan-ul2:
"An error occurred: 504 Server Error: Gateway Timeout for url: https://api-inference.huggingface.co/models/google/flan-ul2 (Request ID: ZpqtQA6VFVaMqfN6hJmKe)

Model google/flan-ul2 time out"

Can anyone help with this?
My code is:

import os
from langchain.llms import HuggingFaceHub
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationBufferMemory

# Hugging Face API token
os.environ["HUGGINGFACEHUB_API_TOKEN"] = ""

flan_ul2 = HuggingFaceHub(
    repo_id="google/flan-ul2",
    model_kwargs={"temperature": 0.5, "max_new_tokens": 128}
)

# Create the memory manager

memory = ConversationBufferMemory()

# Create the conversation chain

conversation = ConversationChain(
    llm=flan_ul2,
    verbose=True,
    memory=memory
)

try:
    response = conversation.predict(input="Hi there! I am Sam")
    print(response)

    response = conversation.predict(input="How are you today?")
    print(response)

    response = conversation.predict(input="Can you help me with some customer support?")
    print(response)

    response = conversation.predict(input="My TV is broken. Can you help fix it?")
    print(response)

except Exception as e:
    print(f"An error occurred: {e}")
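
One workaround I am considering, in case the 504 is transient rather than permanent, is to retry each call with an increasing delay. Below is a minimal sketch of that idea. `call_with_retries` is a hypothetical helper I wrote for illustration, not part of langchain or the Hugging Face client, and the attempt count and delays are arbitrary guesses. I do not know whether retrying actually resolves the timeout for this model.

```python
import time


def call_with_retries(fn, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(); if it raises, wait and try again with exponential backoff.

    Hypothetical helper for illustration only. max_attempts and base_delay
    are assumed values, not recommendations from any library.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as e:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error unchanged
            delay = base_delay * 2 ** (attempt - 1)  # 2s, 4s, 8s, ...
            print(f"Attempt {attempt} failed ({e}); retrying in {delay:.0f}s")
            sleep(delay)


# Intended usage with the chain from my code above (not run here):
# response = call_with_retries(
#     lambda: conversation.predict(input="Hi there! I am Sam")
# )
```

The `sleep` parameter is only there so the waiting can be swapped out when testing the helper. If the error is not transient, this will just fail more slowly, so I would still like to know the actual cause.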