How to Configure LLaMA-3:8B on HuggingFace to Generate Responses Similar to Ollama?

The issue of getting different output from pipelines than from running the model yourself, whether locally or in Spaces, comes up from time to time.
While pipelines are easy to work with, they seem to fill in a lot of information on their own (the chat template, generation defaults, and so on), which makes it hard to pinpoint what is actually going wrong.
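If you want to see exactly what the model receives, one option is to apply the chat template and call `generate` yourself. Here is a minimal sketch, assuming the `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint; the sampling values are illustrative and roughly match Ollama's documented defaults (temperature 0.8, top-p 0.9):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Why is the sky blue?"},
]

# Building the prompt explicitly makes the exact input visible,
# instead of letting the pipeline assemble it behind the scenes.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,  # illustrative; also Ollama's documented default
    top_p=0.9,
    # Llama 3 ends an assistant turn with <|eot_id|>; passing it explicitly
    # avoids depending on the checkpoint's generation config.
    eos_token_id=[
        tokenizer.eos_token_id,
        tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ],
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```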

I'm not sure about Ollama's internals, but for setting chatbot-style parameters I think it is better to use something other than pipelines.
`chat_completion`, for example, is often used; a sketch follows.
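With `huggingface_hub`'s `InferenceClient`, a `chat_completion` call looks roughly like this (assuming a token is configured, e.g. via the `HF_TOKEN` environment variable, and that the Inference API serves this model):

```python
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Meta-Llama-3-8B-Instruct")

response = client.chat_completion(
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    max_tokens=200,
    temperature=0.8,  # illustrative value
)
print(response.choices[0].message.content)
```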
Also, I've heard that Llama 3 had some kind of bug (reportedly around the `<|eot_id|>` end-of-turn token not stopping generation), though I'm not sure whether it's related to this issue.