Avoiding the use of HfApiModel and using a local model - `smolagents`

There are several approaches, but the simplest is to use TransformersModel instead of HfApiModel. Be careful: large models need a powerful GPU. That said, the SmolLM example below should run with roughly 1GB of VRAM…

Alternatively, Ollama is faster and uses less VRAM, but it is a bit more involved to set up (compared to TransformersModel…).

```python
from smolagents import TransformersModel

# Loads the model locally via transformers instead of calling the HF Inference API
model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
```