Hi,
I am totally new to using Hugging Face's Transformers library.
To get started, I created this code with the help of ChatGPT:
import torch
from transformers import pipeline
from huggingface_hub import login

hf_token = "[mytoken]"  # I have already created one and it works
login(token=hf_token)  # works

generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B")

question = "What is the capital of Germany?"
response = generator(
    question,
    max_length=50,
    num_return_sequences=5,
    temperature=0.5,  # lower temperature for more precise answers
    top_k=100,        # higher top-k value
    top_p=0.85,       # lower top-p value
    do_sample=True,
    truncation=True
)
print(response)
This is my response:
[{'generated_text': 'What is the capital of Germany? A. Savannah B. Augusta C. Berlin D. Atlanta\nAnswer: A\nExplanation: Savannah is the capital of Georgia.'}, {'generated_text': 'What is the capital of Germany? A. Columbus, Ohio B. Tallahassee, Florida C. Atlanta, Georgia D. Raleigh, North Carolina\nWhat is the capital of Germany? A. Columbus, Ohio B. Tallahassee,'}, {'generated_text': 'What is the capital of Germany? A. Nuremberg B. Atlanta C. Tallahassee D. Columbus\nAnswer: A\nExplanation: Nuremberg is the capital of Germany.'}, {'generated_text': 'What is the capital of Germany? A. Frankfurt B. Atlanta C. Columbus D. Boston\nA. Frankfurt\nB. Atlanta\nC. Columbus\nD. Boston\nAnswer: A\nExplanation: Frankfurt is the capital of Germany'}, {'generated_text': 'What is the capital of Germany? A. Berlin B. Munich C. Frankfurt D. Stuttgart\nA. Berlin\nB. Munich\nC. Frankfurt\nD. Stuttgart\nAnswer: A'}]
I expected a simple answer to a simple question.
What am I doing wrong?
Is the model I am using a bad one?
Or is the "text-generation" pipeline not the correct one to use for such cases?
Thank you for your support.
Best wishes
Daniel
Why don’t you change it like this?
response = generator(
    question,
    max_length=50,
    temperature=0.5,  # lower temperature for more precise answers
    top_k=100,        # higher top-k value
    top_p=0.85,       # lower top-p value
    do_sample=True,
    return_dict_in_generate=False,
    truncation=True
)
Hi,
I changed it, but I get another unsatisfying answer:
[{'generated_text': 'What is the capital of Germany? A. Columbus, Georgia B. Atlanta C. New York D. Denver\nA. Columbus, Georgia\nB. Atlanta\nC. New York\nD. Denver\nAnswer: A\nExplanation: '}]
Why does the answer always begin with the question itself?
The capital is Berlin, which is not even included among the options.
I think the model I am using, "meta-llama/Llama-3.2-1B", should know the capital, shouldn't it?
Well, it's a long explanation, but when you call a pipeline, various functions are combined and invoked internally on their own.
It would be too much trouble to call them all manually!
The standard setup is designed so that most people probably won't run into trouble with it.
But if you're not satisfied with the default output, you'll have to tweak the options. How about this?
I've tried to remove ambiguity and keep the question out of the returned text.
response = generator(
    question,
    max_length=50,
    do_sample=False,
    return_full_text=False,
)
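If you're curious, here is a rough, minimal sketch of roughly what the pipeline does internally (tokenize, generate, decode). This is only an illustration, not the actual pipeline code; the slicing at the end is what return_full_text=False corresponds to:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

inputs = tokenizer("What is the capital of Germany?", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=50, do_sample=False)

# Drop the prompt tokens so only the newly generated continuation remains
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))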
Hi John,
I changed it as you suggested.
Now the output is:
[{'generated_text': ' A. Frankfurt B. Columbus C. Atlanta D. Tallahassee\nWhat is the capital of Germany?\nA. Frankfurt\nB. Columbus\nC. Atlanta\nD. Tallahassee\nAnswer:'}]
I see: there is a lot of work / learning ahead to get good results.
I will start…
Ah. I tried to have only the newly added text returned, but now it's just returning the question again, in reverse order.
Also, maybe the settings are too strict and the LLM is having trouble answering.
How about something simple? That means maybe temperature=0.7 and do_sample=True.
response = generator(
    question,
)
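If you'd rather set those sampling options explicitly, a call might look like this (just an illustration of one possible combination, reusing return_full_text=False from above):

response = generator(
    question,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    return_full_text=False,
)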
By the way, I think these pipelines tend to return more text than you strictly need, because it is easy to trim it afterwards by manipulating lists, dictionaries, and strings in Python. If you want to do that, I can show you how.
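For example, here is a minimal sketch of that kind of post-processing, assuming the list-of-dicts output shown above: take the first result and keep only the first line of the continuation as the answer.

results = generator(question, max_length=50, return_full_text=False)
# results is a list of dicts like [{"generated_text": "..."}]
text = results[0]["generated_text"].strip()

# Keep only the first line of the continuation as the "answer"
answer = text.split("\n")[0]
print(answer)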
That said, I think some models return only the answer out of the box, but I don't know if there is a setting for it somewhere. With pipelines, it's hard to tell which option is affecting the results…
Maybe you could also try a different model. That would tell us whether the program is the problem or whether the model settings are off.
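For instance (just an illustration, and assuming your token also has access to it), the instruction-tuned variant meta-llama/Llama-3.2-1B-Instruct is usually better at answering direct questions; with a recent transformers version, the text-generation pipeline also accepts chat-style messages:

chat_generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct")

messages = [{"role": "user", "content": "What is the capital of Germany?"}]
out = chat_generator(messages, max_new_tokens=50)

# With chat-style input, generated_text is the message list including the new reply
print(out[0]["generated_text"][-1]["content"])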