We have this Space with a Gradio app that was running perfectly last week on the free plan:
Since Monday this week we have been getting a runtime error saying:
# Runtime error
## Memory limit exceeded (16Gi)
The build logs were fine; the error appears when the container starts up. We decided to move to a paid plan, and after upgrading, the error we now get is:
# Runtime error
## Memory limit exceeded (32Gi)
Could someone from infra please give us more information about what is going on here and how to solve it? We need to present a poster at a conference next week, where we want to share this dashboard. @radames @michellehbn
Many thanks
Rosa
P.S.: I checked the forum, but the similar posts were not helpful, since we have no information about the cause of the problem.
The app works perfectly fine locally, so we cannot reproduce the issue. Do you have any other container logs that could tell us why the app works locally but not on HF?
Best
P.S.: My local machine is a Mac-Pro M3 with 18 GB of RAM. I monitored the execution and could not detect any error.
I recommend logging the memory usage from inside your app and comparing the values in the logs both locally and on Spaces.
As for logs on Spaces, the Logs tab is all we can access as well.
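For the memory logging, a minimal sketch using only the Python standard library could look like the following. The helper names `peak_rss_mib` and `log_memory` are made up for illustration, and the `resource` module is only available on Unix-like systems (which covers both macOS and the Linux containers on Spaces).

```python
import logging
import resource
import sys

# Send INFO messages to stderr so they show up in the container logs.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("memory")


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(label: str) -> float:
    """Log the peak memory seen so far, tagged with a label (hypothetical helper)."""
    mib = peak_rss_mib()
    logger.info("[%s] peak RSS: %.1f MiB", label, mib)
    return mib
```

Calling something like `log_memory("after loading data")` after each heavy step (reading files, building plots, launching the app) should make it possible to compare the numbers in the local terminal with those in the Logs tab on Spaces and see which step grows the most.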
Let me try running it here locally. I'll let you know.
It would be helpful if someone could kill the container on HF and delete the old image to force a new one to be built. In my own experience with Docker, which other colleagues confirm, rebuilding without removing the old image sometimes does not work.
Many thanks for your help. Looking forward to your own local check.
Click this button to trigger a factory rebuild of your Space. This will invalidate Docker layer caches and rebuild your space from scratch, reinstalling all dependencies.
Many thanks, @radames. May I ask which tool you used to identify the source of the trouble?
I will work on refactoring the app and try my luck again next week.
In the end, the issue was in the way we were computing errors from one Parquet file: the app was parsing the data incorrectly. Once that was fixed, the app started running again. I also spotted some possible performance improvements, and we will be working on those soon.
Many thanks for your helpful support.