I couldn’t find any real info anywhere, so I subscribed to PRO to test whether I could use Llama 3.3 70B for my (small) app, since the API seemed fine with Mistral Nemo.
Unfortunately, I get garbage responses about half the time.
Example:
07 every 08: this is low10 an08: this is not a good 07/1.0780780:00 to 1.irectional 07: this is 08:0780780: this is boot1: this is 0780:0000: this is 07:00:0780780780: this is 07: this is 07:00: this is 08: this is 08: this is 1: this is 07: this is 07:00: this is 07: this is 07:0000: this is 07:00: this is 01 i 078: this is 08011: 08:000000:08080780:0000 is 081: 0000: this is 08:0780780: this is 01: this is 07:00:00:0780780780: this is 0780: this is 07:0:10000: this is 01:079079000: this is 07:1:0780:00:00000780: this is 07:0780:0:00:00: this is 07:00: this is 08: this is0780: this is 01: this is 00:00: this is 07:00:00: this is 07:00:078
I’ve tried every single parameter, both set and unset. I tested other models too: Qwen 72B is broken in the same way, while small models work fine… Again, it’s not an issue with my code, since the exact same code works great SOME of the time (minimal sketch of the call at the end of this post).
Oh, and a few models like Gemma straight-up never work from the API (always “model too busy”), even though they answer instantly in the playground.
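For reference, here’s roughly the shape of the call I’m making. The endpoint URL, model ID, and token below are placeholders (I’m assuming an OpenAI-compatible chat completions route for this sketch), but my real code is essentially this:

```python
import requests

# Placeholder endpoint/model/token -- my real values differ, but the shape
# of the request is the same: a plain OpenAI-compatible chat completions call.
API_URL = "https://api.example.com/v1/chat/completions"
API_TOKEN = "xxx"  # redacted

payload = {
    "model": "meta-llama/Llama-3.3-70B-Instruct",
    "messages": [
        {"role": "user", "content": "Say hello in one short sentence."}
    ],
    "max_tokens": 128,
    "temperature": 0.7,  # I've tried with and without every sampling parameter
}

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
    timeout=60,
)
resp.raise_for_status()

# About half the time this prints a normal answer; the other half it's
# the kind of token soup shown in the example above.
print(resp.json()["choices"][0]["message"]["content"])
```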