I just posted this in the Discord as well, but figured I'd post over here too for those who only check one or the other.
Hi all, I have been reading a lot of questions about what to do if the examples using the HfApiModel
fail, or if you run out of credits. I was in a similar situation and initially went down the path of running locally with the MLXModel
class and Qwen2.5-Coder-32B, but that led to very long waits even on my maxed-out M4 Max. So I wanted to share another solution here for anyone with lower-end hardware or anyone looking for a faster endpoint.
Mistral released Codestral 25.01 in January, and it works great. I switch between using it through Continue.dev in VS Code and Cursor, and it really gives Cursor a run for its money. Alongside the release, they created a free, Codestral-specific API endpoint. I believe all you need to do is create an account on their La Plateforme, and you can get a free API key specifically for Codestral chat and completions.
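One small gotcha: the snippet below reads the key from an environment variable, so export it in your shell first (the name CODESTRAL_API_KEY is just what I use; pick whatever you like). A minimal sketch of a guard that warns you up front instead of letting you hit a confusing 401 later:

```python
import os

# CODESTRAL_API_KEY is the variable name assumed by the snippet below;
# warn early if it is missing rather than failing mid-request.
api_key = os.getenv("CODESTRAL_API_KEY")
if api_key is None:
    print("CODESTRAL_API_KEY is not set; export it before running the agent.")
```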
import os

from smolagents import CodeAgent, OpenAIServerModel

# Point smolagents' OpenAI-compatible model class at the free Codestral endpoint
api_key = os.getenv("CODESTRAL_API_KEY")
model = OpenAIServerModel(
    model_id="codestral-latest",
    api_base="https://codestral.mistral.ai/v1/",
    api_key=api_key,
)
agent = CodeAgent(tools=[], model=model, add_base_tools=True)
agent.run(
    "Could you give me the 40th number in the Fibonacci sequence?",
)
It's very snappy. Just wanted to share here for those who are struggling.