API inference limit changed?

Hi,

I had the same experience. I had been using LLaMa-3.3-70B for several months through a PRO subscription. Each day I compare summarization results on news stories (40-70 stories/day, 700-2,500 tokens each) across different models/APIs: GPT-4o, Gemini, LLaMa-3.3-70B, etc.

When I got rate-limited, I opened a second account to see what the “shadow charge” on PRO users was. Over two days, I used up the $2 credit after around 80 stories. The equivalent charge from OpenAI was ~$0.40.
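For anyone curious, here's the back-of-envelope math behind those numbers (the per-story figures are just my total spend divided by my ~80 stories, not official pricing):

```python
# Rough cost-per-story comparison from my own usage numbers above.
# These are averages over ~80 news stories (700-2,500 tokens each),
# not published API rates.

def per_story_cost(total_spend: float, stories: int) -> float:
    """Average cost per summarized story."""
    return total_spend / stories

llama_cost = per_story_cost(2.00, 80)   # $2 credit burned over ~80 stories
openai_cost = per_story_cost(0.40, 80)  # equivalent OpenAI charge

print(f"LLaMa-3.3-70B via PRO: ${llama_cost:.4f}/story")
print(f"OpenAI:                ${openai_cost:.4f}/story")
print(f"Ratio: {llama_cost / openai_cost:.1f}x")
# → roughly $0.0250/story vs $0.0050/story, a 5x difference
```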

Is the coming PRO “pay as you go” likely to be that high?

Thanks, -Charlie Dolan

PS Really miss using LLaMa-3.3-70B because it was very often right on the mark summarizing long, discursive news analysis and blog posts when GPT-4o, Sonnet 3.5, and Gemini 2.0 all whiffed.