I have spun up an Inference Endpoint for Llama 3.1 8B Instruct, but for some reason, when I interact with it through the “playground” on the Inference Endpoint page, I get dramatically better responses than when I call it through the API. Does anyone have any idea why that might be?
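For reference, here is a rough sketch of the kinds of calls I’ve been trying against the endpoint with huggingface_hub’s InferenceClient (the endpoint URL and token are placeholders, and the generation parameters are just illustrative defaults, not necessarily what the playground uses):

```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    model="https://<your-endpoint>.endpoints.huggingface.cloud",  # placeholder endpoint URL
    token="hf_xxx",  # placeholder token
)

# Route 1: chat completion -- the client formats the messages with the
# model's chat template before sending them to the endpoint.
chat = client.chat_completion(
    messages=[{"role": "user", "content": "Explain KV caching in one paragraph."}],
    max_tokens=256,
    temperature=0.7,
)
print(chat.choices[0].message.content)

# Route 2: raw text generation -- the prompt string is sent exactly as
# written, with no chat template applied.
raw = client.text_generation(
    "Explain KV caching in one paragraph.",
    max_new_tokens=256,
    temperature=0.7,
)
print(raw)
```

The playground is a chat UI, so I assume it formats the conversation with the model’s chat template and applies its own sampling defaults; I’m not sure whether my raw calls match that, which may be part of what I’m seeing.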
I’ve never used the Inference Endpoints API directly, but even before getting to that, I couldn’t find any “playground” matching that description when I searched for it…
I did find the following article.