Hugging Face Inference Endpoint 504 error

Hello! I set up an inference endpoint for a private model to transcribe mp3 files with diarization. It worked for 15-minute files when called from Python with the requests library, but today it no longer works for the same files and returns a 504 error. With a smaller file (2 minutes long), the request returns the correct answer. When I look at the endpoint logs, it seems the whole pipeline runs to completion. Do you have any idea how to fix this?
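For context, this is a minimal sketch of how I'm calling the endpoint; the URL, token, and file name below are placeholders, not the real values:

```python
import requests

# Hypothetical endpoint URL and token -- replace with your own values.
ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

def transcribe(path: str) -> dict:
    """Send an mp3 file to the endpoint and return the JSON response."""
    with open(path, "rb") as f:
        audio = f.read()
    response = requests.post(
        ENDPOINT_URL,
        headers={
            "Authorization": f"Bearer {HF_TOKEN}",
            "Content-Type": "audio/mpeg",
        },
        data=audio,
    )
    response.raise_for_status()  # surfaces HTTP errors such as the 504 above
    return response.json()

print(transcribe("meeting_15min.mp3"))
```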

Hi @jccscop, thanks for reporting. We've just applied a fix, and this should now be working as expected. Please let us know if you continue to see an error. Thanks again!

Hello! Thanks for your help! It now works for the 15-minute file, but it seems to fail for 40-minute files. When I run the code with Python requests, the client hangs forever, whereas the endpoint logs show that the pipeline ran to the end and completed the POST request. The runs themselves are not long: usually about 2 minutes for the 15-minute audio file and 4-5 minutes for the 40-minute files.
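In case it helps, adding an explicit client-side timeout at least keeps the script from hanging indefinitely when the response never arrives; the values below are illustrative and not a fix for the underlying issue, and the endpoint URL and token are the same placeholders as in the sketch above:

```python
import requests

# Same hypothetical endpoint and token as in the earlier sketch.
ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."

with open("meeting_40min.mp3", "rb") as f:
    audio = f.read()

# An explicit (connect, read) timeout makes requests raise instead of
# waiting forever; the 15-minute read timeout here is only an example.
response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}", "Content-Type": "audio/mpeg"},
    data=audio,
    timeout=(10, 900),
)
print(response.json())
```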

Hi @jccscop, thanks for reporting. We've applied a fix; please let us know if you continue to see an issue. Thanks!