There are several methods, but the simplest is to use TransformersModel instead of HfApiModel and it will work. Note that large models require powerful GPUs, so be careful. That said, the SmolLM model below should run with about 1 GB of VRAM.
Ollama is faster and uses less VRAM, but I think it is a bit harder to set up (compared to TransformersModel, anyway).
from smolagents import TransformersModel

# Runs the model locally via transformers; weights are downloaded on first use
model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
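If you do go the Ollama route mentioned above, smolagents can talk to it through its LiteLLMModel class. This is just a connection sketch, assuming you have a local Ollama server running on its default port and have already pulled a model (the tag "smollm:135m" here is an example, not something from the original answer):

from smolagents import LiteLLMModel

model = LiteLLMModel(
    model_id="ollama_chat/smollm:135m",  # assumes you ran `ollama pull smollm:135m`
    api_base="http://localhost:11434",   # default local Ollama endpoint
)

Either model object can then be passed to an agent in place of HfApiModel.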