Hi,
I am totally new to using Hugging Face's Transformers library.
To get started, I created this code with the help of ChatGPT:
import torch
from transformers import pipeline
from huggingface_hub import login

hf_token = "[mytoken]"  # I have already created one and it works
login(token=hf_token)  # works

generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B")

question = "What is the capital of Germany?"
response = generator(
    question,
    max_length=50,
    num_return_sequences=5,
    temperature=0.5,  # lower temperature for more precise answers
    top_k=100,        # higher top-k value
    top_p=0.85,       # lower top-p value
    do_sample=True,
    truncation=True
)
print(response)
This is my response:
[{'generated_text': 'What is the capital of Germany? A. Savannah B. Augusta C. Berlin D. Atlanta\nAnswer: A\nExplanation: Savannah is the capital of Georgia.'}, {'generated_text': 'What is the capital of Germany? A. Columbus, Ohio B. Tallahassee, Florida C. Atlanta, Georgia D. Raleigh, North Carolina\nWhat is the capital of Germany? A. Columbus, Ohio B. Tallahassee,'}, {'generated_text': 'What is the capital of Germany? A. Nuremberg B. Atlanta C. Tallahassee D. Columbus\nAnswer: A\nExplanation: Nuremberg is the capital of Germany.'}, {'generated_text': 'What is the capital of Germany? A. Frankfurt B. Atlanta C. Columbus D. Boston\nA. Frankfurt\nB. Atlanta\nC. Columbus\nD. Boston\nAnswer: A\nExplanation: Frankfurt is the capital of Germany'}, {'generated_text': 'What is the capital of Germany? A. Berlin B. Munich C. Frankfurt D. Stuttgart\nA. Berlin\nB. Munich\nC. Frankfurt\nD. Stuttgart\nAnswer: A'}]
I expected a simple answer to a simple question.
What am I doing wrong?
Is the model I am using a bad one?
Or is the "text-generation" pipeline not the correct one to use for such cases?
Thank you for your support.
Best wishes
Daniel
Why don’t you change it like this?
response = generator(
    question,
    max_length=50,
    temperature=0.5,  # lower temperature for more precise answers
    top_k=100,        # higher top-k value
    top_p=0.85,       # lower top-p value
    do_sample=True,
    return_dict_in_generate=False,
    truncation=True
)
Hi,
I changed it, but I get another unsatisfying answer:
[{'generated_text': 'What is the capital of Germany? A. Columbus, Georgia B. Atlanta C. New York D. Denver\nA. Columbus, Georgia\nB. Atlanta\nC. New York\nD. Denver\nAnswer: A\nExplanation: '}]
Why does the answer always begin with the question itself?
The capital is Berlin, which is not even included among the options.
I think the model I am using, "meta-llama/Llama-3.2-1B", should know the capital, shouldn't it?
Well, it's a long explanation, but when you call a pipeline, various functions are combined and invoked internally on their own.
It would be too much trouble to call them all manually!
The standard setup is designed so that most people probably won't run into trouble with it.
But if you're not satisfied with the default output, you'll have to tweak the options. How about this?
I've tried to remove ambiguity and keep the question out of the returned text.
response = generator(
    question,
    max_length=50,
    do_sample=False,
    return_full_text=False,
)
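If you're curious, here is a rough, minimal sketch of roughly what the pipeline does internally (tokenize, generate, decode). This is only an illustration, not the actual pipeline code; the slicing at the end is what return_full_text=False corresponds to:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

inputs = tokenizer("What is the capital of Germany?", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=50, do_sample=False)

# Drop the prompt tokens so only the newly generated continuation remains
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))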
Hi John,
I changed it as you suggested.
Now the output is:
[{'generated_text': ' A. Frankfurt B. Columbus C. Atlanta D. Tallahassee\nWhat is the capital of Germany?\nA. Frankfurt\nB. Columbus\nC. Atlanta\nD. Tallahassee\nAnswer:'}]
I see: there is a lot of work / learning ahead to get good results.
I will start…
Ah. I tried to have only the newly added text returned, but now it's just returning the question again, in reverse order.
Also, maybe the settings are too strict and the LLM is having trouble answering.
How about something simple? That means maybe temperature=0.7 and do_sample=True.
response = generator(
    question,
)
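If you'd rather set those sampling options explicitly, a call might look like this (just an illustration of one possible combination, reusing return_full_text=False from above):

response = generator(
    question,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    return_full_text=False,
)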
By the way, I think these pipelines tend to return more text than you strictly need, because it is easy to trim it afterwards by manipulating lists, dictionaries, and strings in Python. If you want to do that, I can show you how.
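For example, here is a minimal sketch of that kind of post-processing, assuming the list-of-dicts output shown above: take the first result and keep only the first line of the continuation as the answer.

results = generator(question, max_length=50, return_full_text=False)
# results is a list of dicts like [{"generated_text": "..."}]
text = results[0]["generated_text"].strip()

# Keep only the first line of the continuation as the "answer"
answer = text.split("\n")[0]
print(answer)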
That said, I think some models return only the answer out of the box, but I don't know if there is a setting for it somewhere. With pipelines, it's hard to tell which option is affecting the results…
Maybe you could also try a different model. That would tell us whether the program is the problem or whether the model settings are off.
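For instance (just an illustration, and assuming your token also has access to it), the instruction-tuned variant meta-llama/Llama-3.2-1B-Instruct is usually better at answering direct questions; with a recent transformers version, the text-generation pipeline also accepts chat-style messages:

chat_generator = pipeline("text-generation", model="meta-llama/Llama-3.2-1B-Instruct")

messages = [{"role": "user", "content": "What is the capital of Germany?"}]
out = chat_generator(messages, max_new_tokens=50)

# With chat-style input, generated_text is the message list including the new reply
print(out[0]["generated_text"][-1]["content"])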