See "Is there a timeout (max runtime) for spaces?" (post #2 by Epoching) or the Gradio docs. For longer inference, you need to set `enable_queue` to `True`:
- enable_queue (bool) - if True, inference requests will be served through a queue instead of with parallel threads. Required for longer inference times (> 1min) to prevent timeout.
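
As a rough sketch of what that looks like in practice, assuming an older Gradio release where `launch()` still accepts `enable_queue`: `slow_inference` and `launch_demo` are hypothetical names used only for illustration, and the 90-second delay just stands in for a slow model call.

```python
import time


def slow_inference(text: str, delay_s: float = 90.0) -> str:
    """Hypothetical stand-in for a model call that takes longer than a minute."""
    time.sleep(delay_s)
    return text.upper()


def launch_demo() -> None:
    # Imported here so slow_inference can be used without Gradio installed.
    import gradio as gr

    demo = gr.Interface(fn=slow_inference, inputs="text", outputs="text")
    # Serve requests through the queue, as the docs entry above describes.
    # Assumption: on newer Gradio releases, where `enable_queue` is deprecated
    # or removed, the equivalent call is `demo.queue().launch()`.
    demo.launch(enable_queue=True)
```

Calling `launch_demo()` starts the app. If your installed version rejects the `enable_queue` argument, check which Gradio version the Space is pinned to before changing anything else.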