Llama 3 70B in the Chat UI Is Super Slow and Nearly Unusable

I’ve been using Llama 3 since you guys put it up. Within the last week or two, it has been unusable. Sometimes I can get an answer, but most of the time I’m not going to sit there and wait for the response. Just a heads up. Is there a place to open tickets, or a forum topic about Hugging Face chat?

I’m running Firefox on Linux on a Cortex-A72 board. The browser can become unresponsive (unusable) for 30 seconds to a minute after generation appears to be complete, with CPU usage pegged the whole time. What unholy code is running here?

Try the Inference Playground. It’s an official HF tool, so it’s probably fast.