I am a newbie. I have downloaded Ollama and can run gemma:2b on my laptop. I want to fine-tune this model, but I could not find a GGUF file under C:/Users/lihuacat/.ollama/models. I found a manifest at C:/Users/lihuacat/.ollama/models/manifests/registry.ollama.ai/library/gemma/2b, but when I pass that path to the loader, it reports that the file cannot be found.
There is also this blob, C:/Users/lihuacat/.ollama/models/blobs/sha256-c1864a5eb19305c40519da12cc543519e48a0697ecd30e15d5ac228644957d12, but when I pass that file name in, an error appears saying the config file cannot be found. Is it true that Ollama can only be used for local inference? Do the model files need to be downloaded from Hugging Face instead?
OSError: Unable to load weights from pytorch checkpoint file for 'C:/Users/lihuacat/.ollama/models/manifests/registry.ollama.ai/library/gemma/2b' at 'C:/Users/lihuacat/.ollama/models/manifests/registry.ollama.ai/library/gemma/2b'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
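As a side note on the error above: the files under manifests/ are JSON metadata, not model weights, and the actual model data lives as content-addressed files under blobs/. A GGUF file can be recognized by the 4-byte magic number "GGUF" at the start of the file. A minimal sketch to check which blob (if any) is a GGUF file (the directory path below is just the one from the question):

```python
# Sketch: identify GGUF blobs by their 4-byte magic number b"GGUF".
from pathlib import Path

GGUF_MAGIC = b"GGUF"

def is_gguf(path) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == GGUF_MAGIC
    except OSError:
        return False

# Path taken from the question above; adjust for your machine.
blobs = Path("C:/Users/lihuacat/.ollama/models/blobs")
if blobs.is_dir():
    for blob in sorted(blobs.glob("sha256-*")):
        print(blob.name, "->", "GGUF" if is_gguf(blob) else "other")
```

Blobs that print "other" are typically manifest layers or license/template text, not weights.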
Hi,
File formats like GGUF are typically meant for inference on local hardware, see ggml/docs/gguf.md at master · ggerganov/ggml · GitHub.
For fine-tuning models, one typically uses one of the following libraries (in combination with GPU hardware):
- Transformers, TRL, PEFT: libraries developed by Hugging Face that make it easy to fine-tune open-source models on your custom data.
- Unsloth: GitHub - unslothai/unsloth: Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory. Claims to fine-tune models faster than the Transformers library.
- Axolotl: GitHub - axolotl-ai-cloud/axolotl: Go ahead and axolotl questions. It works with a YAML-based configuration file that defines your entire fine-tuning run. See e.g. axolotl/examples/llama-3 at main · axolotl-ai-cloud/axolotl · GitHub for Llama-3 examples.
- Torchtune: https://github.com/pytorch/torchtune: a new library developed by Meta for LLM fine-tuning.
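All of these libraries lean on parameter-efficient methods such as LoRA (which PEFT implements) to make fine-tuning feasible on modest hardware. As a rough NumPy illustration of the core idea only (this is not the PEFT API): the pretrained weight W stays frozen and only a low-rank update B @ A is trained, so the effective weight is W + (alpha / r) * B @ A.

```python
# Hand-rolled illustration of the LoRA idea in plain NumPy (not the PEFT API).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 8, 4, 8   # toy dimensions; r is the LoRA rank

W = rng.normal(size=(d_out, d_in))           # frozen pretrained weight
A = rng.normal(scale=0.01, size=(r, d_in))   # trainable, low-rank
B = np.zeros((d_out, r))                     # trainable, initialized to zero

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A. Because B starts at
    # zero, the adapted model initially matches the pretrained one exactly.
    return x @ (W + (alpha / r) * (B @ A)).T

x = rng.normal(size=(2, d_in))
assert np.allclose(lora_forward(x), x @ W.T)  # identical at initialization

# During fine-tuning only A and B get gradient updates:
# r*d_in + d_out*r parameters instead of d_out*d_in.
```

The memory savings the libraries advertise come from exactly this: gradients and optimizer state are only kept for the small A and B matrices.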
After fine-tuning, the weights can be converted to the GGUF format, which allows local inference with Ollama and llama.cpp. See How to convert any HuggingFace Model to gguf file format? - GeeksforGeeks for a tutorial.
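In practice this conversion usually means running llama.cpp's convert_hf_to_gguf.py script on the fine-tuned checkpoint directory (the exact script name and flags depend on your llama.cpp version), and then pointing Ollama at the resulting file via a Modelfile. The Modelfile itself is just a small config; a sketch, where the path and model name are placeholders:

```
FROM ./model.gguf
PARAMETER temperature 0.7
```

You would then register and run it with `ollama create my-gemma -f Modelfile` followed by `ollama run my-gemma` (my-gemma being whatever name you choose).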
Additionally, there's Apple's MLX library, which allows you to fine-tune LLMs on a MacBook. See mlx-examples/llms at main · ml-explore/mlx-examples · GitHub for fine-tuning LLMs.
Thank you, now I know how to get my job done.
So what exactly is causing this? I also ran into related problems when calling llama3 from VS Code. Do I need to go to Hugging Face and download the tokenizer files and so on first?
@lihuacat Did you find any way to fine-tune the model that is available in Ollama?
I am facing a lot of issues fine-tuning Ollama models, so I tried with HF models instead; after that I loaded those models into Ollama, but I got nonsense answers.