I'm trying to learn the basics of smolagents and I've run into a big problem - please help! I'm getting a message saying I've run out of the free tier for HfApiModel and need to buy the paid tier. How can I run my CodeAgent in smolagents with a local model instead?
There are several options, but you can simply use TransformersModel instead of HfApiModel and it will work. Be careful with large models, though: they require powerful GPUs. The SmolLM model below should run with about 1 GB of VRAM…
Also, Ollama is faster and uses less VRAM, but I think it's a bit harder to set up (compared to TransformersModel…).
from smolagents import TransformersModel
model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
I tried to run the following code:
from smolagents import TransformersModel
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
agent = CodeAgent(tools=[], model=model, additional_authorized_imports=['datetime'])
agent.run(
"""
Alfred needs to prepare for the party. Here are the tasks:
1. Prepare the drinks - 30 minutes
2. Decorate the mansion - 60 minutes
3. Set up the menu - 45 minutes
4. Prepare the music and playlist - 45 minutes
If we start right now, at what time will the party be ready?
"""
)
-
I'm using my Mac M3's CPU, and it doesn't produce any responses. There is certainly a bug in the code here. Maybe the model is too big, and yes, I don't have any GPU here. Is the code fine?
-
Another question: let's say I buy the PRO subscription. Can I also use the Hugging Face API with LlamaIndex and LangGraph? I think I saw that the API works with LlamaIndex, so the bigger question is: does the API also work with LangGraph?
Thank you in advance!
from smolagents import CodeAgent, TransformersModel
model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
agent = CodeAgent(tools=[], model=model, additional_authorized_imports=['datetime'], add_base_tools=True)
agent.run(
"""
Alfred needs to prepare for the party. Here are the tasks:
1. Prepare the drinks - 30 minutes
2. Decorate the mansion - 60 minutes
3. Set up the menu - 45 minutes
4. Prepare the music and playlist - 45 minutes
If we start right now, at what time will the party be ready?
"""
)
I think your code is correct, but the name of HfApiModel has changed in the latest version of smolagents (it's mainly just a rename), so I made some minor corrections there.
Also, it's a bit slow on the CPU… but it should still work. It's just incredibly slow.
On a Mac, using MLX or Ollama might speed things up a bit…
Uff, OK - CodeAgent does not work well with the "HuggingFaceTB/SmolLM-135M-Instruct" model. As far as I understand, the reason is that the model's context window is too small for CodeAgent's huge default prompt. So I tried another kind of agent, ToolCallingAgent:
from smolagents import DuckDuckGoSearchTool, ToolCallingAgent
from smolagents import MLXModel
model = MLXModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
agent = ToolCallingAgent(tools=[DuckDuckGoSearchTool()], model=model)
agent.run("Search for the best music recommendations for a party at the Wayne's mansion.")
agent.run(
"""
what color is the sky?
"""
)
Here, I've also loaded the model through the MLXModel class, which works well on my Mac, at least.
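As a rough sanity check on the context-window theory: SmolLM-135M-Instruct's model card lists a 2048-token context, while CodeAgent's default system prompt runs to many thousands of characters. Even a crude ~4-characters-per-token heuristic (an approximation I'm assuming here, not an exact tokenizer count) shows the mismatch:

```python
# Crude heuristic: roughly 4 characters per token for English text.
def rough_token_count(text: str) -> int:
    return len(text) // 4

def fits_context(prompt: str, context_window: int, reserve_for_output: int = 256) -> bool:
    """True if the prompt, plus some room for the reply, fits the window."""
    return rough_token_count(prompt) + reserve_for_output <= context_window

# A CodeAgent-sized system prompt (thousands of characters) vs SmolLM's 2048-token window:
print(fits_context("x" * 20_000, context_window=2048))             # False: ~5000 tokens
print(fits_context("What color is the sky?", context_window=2048))  # True
```

So a tiny question fits easily, but once the agent's own scaffolding prompt is prepended, the 135M model has effectively no room left.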
And this is the output:
Something needs to be adjusted, but I'm not sure what exactly. Maybe the prompt isn't helping the model? I don't know. Does the code run in your environment, by any chance?
My local environment is Python 3.9, so smolagents doesn't run locally for me…
Anyway, I had forgotten about that bug - or rather, that feature…
from smolagents import DuckDuckGoSearchTool, ToolCallingAgent
from smolagents import MLXModel
model = MLXModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct", max_new_tokens=4096)
agent = ToolCallingAgent(tools=[DuckDuckGoSearchTool()], model=model)
agent.run("Search for the best music recommendations for a party at the Wayne's mansion.")
agent.run(
"""
what color is the sky?
"""
)