Llama 2 70B Inference API stuck on "currently loading"

Up until this morning, I was using the Inference API for the llama-2-70b-chat-hf model, and now I only get the following error repeatedly:
{'error': 'Model meta-llama/Llama-2-70b-chat-hf is currently loading', 'estimated_time': 5518.13232421875}
The estimated time does not change as the error keeps coming back. I retried periodically, a few hours apart, throughout the day, with no progress. llama2-7b and llama2-13b work fine, but my research project requires the 70b model. Is anyone else facing this problem? Any help would be highly appreciated.


P.S. I have a Hugging Face Pro subscription.
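
For anyone scripting around this error, here is a minimal Python sketch of a capped retry loop. It assumes the error body has the shape quoted above (an `error` string containing "currently loading" plus an `estimated_time`, presumably in seconds) and that the model is served at the usual `api-inference.huggingface.co/models/...` URL. The helper names (`loading_wait`, `post_json`, `query_with_retry`) and the 60-second cap are illustrative choices, not part of any Hugging Face library. A retry loop only helps when the model actually finishes loading, so it would not have fixed the problem described here, but it does fail cleanly instead of polling forever.

```python
import json
import time
import urllib.error
import urllib.request

# Assumed endpoint for the hosted Inference API.
API_URL = "https://api-inference.huggingface.co/models/meta-llama/Llama-2-70b-chat-hf"


def loading_wait(body, cap=60.0):
    """Return seconds to wait if `body` is a 'currently loading' error, else None."""
    if not isinstance(body, dict):
        return None  # successful responses are assumed not to be error dicts
    if "currently loading" not in str(body.get("error", "")):
        return None
    try:
        estimate = float(body.get("estimated_time", cap))
    except (TypeError, ValueError):
        estimate = cap
    # Never sleep longer than `cap`, even if the estimate is ~5500 s.
    return max(0.0, min(estimate, cap))


def post_json(payload, token):
    """Send one request with the standard library and return the decoded JSON body."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as err:
        # Error responses usually carry a JSON body too; fall back to the status text.
        try:
            return json.loads(err.read())
        except ValueError:
            return {"error": str(err)}


def query_with_retry(send, payload, max_attempts=5, cap=60.0, sleep=time.sleep):
    """Call `send(payload)` until the model stops reporting that it is loading."""
    for attempt in range(max_attempts):
        body = send(payload)
        wait = loading_wait(body, cap)
        if wait is None:
            return body  # either a real result or a different error
        if attempt < max_attempts - 1:
            sleep(wait)
    raise TimeoutError(f"model still loading after {max_attempts} attempts: {body}")
```

Usage would look like `query_with_retry(lambda p: post_json(p, token), {"inputs": "Hello"})`, where `token` is your own access token. If the call raises `TimeoutError` while smaller models respond normally, the problem is most likely on the service side rather than in your code.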


I have the same issue.

@prapti19 the service is back :)

Thanks for the update @eboraks !