Flan-t5-xl generates only one sentence

I’ve been playing around with Flan-T5-XL on Hugging Face, and for the prompt:
“Q: Generate 10 diverse questions asking about the current weather A:”

It generates only one sentence every time: “What is the weather like today?”

I want my prompt to generate 10 questions, each differing from the others in style and tone. Is it possible to obtain such results using the Flan-T5-XL model?

Just to start: if you set min_length to a larger value when you call generate, you can force the model to keep generating past the first question, which helps.
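For example, something along these lines (a minimal sketch with the same checkpoint and prompt; I’m assuming a CUDA GPU here, and the full script below shows the batched, sampled version):

from transformers import T5ForConditionalGeneration, T5Tokenizer

path = "google/flan-t5-xl"
tokenizer = T5Tokenizer.from_pretrained(path)
model = T5ForConditionalGeneration.from_pretrained(path).to("cuda")

prompt = "Q: Generate 10 diverse questions asking about the current weather. A:"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# requiring at least 100 generated tokens means the model can't stop after a
# single short question like "What is the weather like today?"
outputs = model.generate(**inputs, min_length=100, max_length=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))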

Beyond that, this task seems doable with flan-t5-xl, but I think you’ll need to spend some time experimenting with different prompts and generation arguments (temperature, repetition penalty, etc.) to get it working well.

Here’s a script with a few example prompts I tried. None of them work flawlessly, but the last two seem closest to what you’re looking for.

from transformers import T5ForConditionalGeneration, T5Tokenizer, set_seed

# fix the seed so the sampled outputs are reproducible
set_seed(42)

device = "cuda"
path = "google/flan-t5-xl"
tokenizer = T5Tokenizer.from_pretrained(path)

# a few prompt variations, from the original phrasing to more explicit instructions
prompts = [
    "Q: Generate 10 diverse questions asking about the current weather. A:",
    "Q: Generate 10 diverse questions asking about the current weather. Do not answer the questions, only ask them. A:",
    "Complete a sequence of 12 unique questions about the weather. Do not answer the questions, only ask them. Now complete the sequence: #12 What's the current temperature? #11 Is it humid outside today? #10",
    "Write 10 diverse questions about the current weather. Do not answer the questions, only ask them. Count up to 10 while asking the questions. Write the current number before each question. For example: #1 What's the weather like outside today? #2 What is the humidity level?",
]
inputs = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True).to(device)

model = T5ForConditionalGeneration.from_pretrained(path).to(device)
# min_length forces longer outputs; sampling with a high temperature plus a
# repetition penalty discourages the model from repeating the same question
sequences = model.generate(
    **inputs, do_sample=True, min_length=100, max_length=300, temperature=0.97, repetition_penalty=1.2
)

decoded = tokenizer.batch_decode(sequences, skip_special_tokens=True)
for out in decoded:
    print("-" * 80)
    print(out)

I get these outputs:

--------------------------------------------------------------------------------
How does one tell if it is raining? If a person can see precipitation from across the street. Which statement is correct? Which statement is correct? Which statement is correct? Which statement is correct? Which statement is correct? Which weather indicator determines precipitation? Which statement is incorrect? Which statement is correct? Which statement is incorrect? Which statement is incorrect? Which weather indicator determines high temperatures? Which statement is correct? Which statement is incorrect? What temperature would you say has been seen in Atlanta, Georgia so far today?
--------------------------------------------------------------------------------
How is the weather this morning? What would you say has the most to do with the present weather? How would you describe the weather this morning? How would you describe the weather this afternoon? How would you describe the weather for this evening? Would you say there is rainy/cold weather? How do you think the weather is going to go this evening? What are you going to do about the weather today? What are some things that will occur this afternoon? Why or why not?
--------------------------------------------------------------------------------
What kind of rain are we getting today? #5 So what is the forecast for the next 3 days? #8 In your opinion, is it worth going out this weekend? #8 What does the weather look like for Friday, April 13th? #7 Will the weather be good for drinking Friday? #8 Where will you be going in Friday? #9 What type of weather is it on TV? #6 How many people are there? #5 What is the rain falling like? #7 Was it any different Wednesday? #8 Who are the people passing by in their vehicles? #7 How are the local roads being kept? #8 Is anyone on the road? #7 Were the cars and trucks stopped?
--------------------------------------------------------------------------------
#1 What's the weather like outside today? #2 What is the humidity level? #3 What is the wind speed? #4 What is the low temperature? #5 What is the temperature? #6 What is the air temperature? #7 What is the wind speed? #8 What is the humidity level? #9 What is the air temperature? #10 What is the high temperature? #11 What is the low temperature? #12 What is the humidity level? #13 What is the wind speed? #14 What is the humidity level?

The last two are on the right track, but they clearly have issues: the questions aren’t all unique, and some veer off-topic.

If you’re able to run a larger model, you can try flan-t5-xxl or flan-ul2 instead. For example, I tried flan-ul2 and it works much better, but it’s a 20B model, so running it is a lot more resource-intensive. Here are the outputs I got with the same code as above:

--------------------------------------------------------------------------------
How does the weather appear today? Select on the following. a. Somewhere in between rainy and sunny. b. A mix of cloudy and dry. c. Overcast with scattered storms. d. Rainy with light showers predicted. Answer: B The weather is sunny. It is warm and windy. The clouds have passed by for now. No rain has been seen yet. The breeze is steady. All is peaceful. What time did the current weather pattern begin?
--------------------------------------------------------------------------------
How is the weather this week? What is the forecast? Would it be better to travel this week? How long will the polar vortex last? What’s the possibility of flooding? Is rain forecast? What is the current temperature? How cold is it? What time does it really get cold? How much are temperatures normally in December? Does it usually snow? How much snow would need to fall to impact a city? Why do people go skiing? What are the conditions?
--------------------------------------------------------------------------------
Is it raining outside? #9 Is it stormy outside today? #8 Is it windy outside today? #7 Is it sunny outside today? #6 Is it cloudy outside today? #5 Is there a chance of precipitation/shower? #4 Is the temperature in Fahrenheit or Celsius? #3 Is the humidity in percent? #2 Is it windy outside today? #1 Is it humid outside today? The series is presented to you by Microsoft Virtual Academy. Check out our free online training.
--------------------------------------------------------------------------------
#1 Is it hot outside? #2 Is there a thunderstorm? #3 What's the temperature today? #4 Will I need to wear long pants? #5 Do I need shorts? #6 Should I put on a sweater? #7 What was yesterday's wind speed? #8 Is there rain or snow? #9 How close is it to midnight? #10 Will I have to drive today? In just a few seconds, your weather will be done... Good job!

In this case, the last two sets of outputs seem pretty close to what you’re looking for, but not quite there. With more “prompt engineering” and tinkering with the generation params, you could probably get it to work well. If you do go with a larger model, you can also use bf16, quantization, etc. to make flan-ul2 or flan-t5-xxl less resource-intensive.
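For reference, here’s a rough sketch of how you could load one of the bigger checkpoints that way (assuming you have accelerate installed for device_map, and bitsandbytes if you want the 8-bit route):

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

path = "google/flan-ul2"  # or "google/flan-t5-xxl"
tokenizer = T5Tokenizer.from_pretrained(path)

# bf16 halves the memory footprint compared to fp32, and device_map="auto"
# (via accelerate) spreads the weights across the available GPUs/CPU RAM
model = T5ForConditionalGeneration.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="auto"
)

# or, with bitsandbytes installed, load the weights in 8-bit instead:
# model = T5ForConditionalGeneration.from_pretrained(path, device_map="auto", load_in_8bit=True)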

I’d also consider trying one of the LLaMA-based models (e.g., the Alpaca variant), and of course ChatGPT could probably do this flawlessly.

Thanks for your prompt response @dblakely! I’ll definitely play around with those suggestions and get back to you with my findings.

One question I’d like to ask: what are the minimum compute requirements to run flan-ul2? More specifically, is it possible to run this model on a MacBook Pro with the M1 Pro chip?

Unfortunately, I don’t have a great answer for you - I imagine it’s possible but will be very slow.

That said, if you need to run a large model on your MacBook Pro, one of the LLaMA-based models would probably be easier to work with. For example, Alpaca-LoRA and GPT4All are supposed to run reasonably well on a laptop. These are also instruction-tuned models, so they should handle tasks like the one you described in this thread.

Or, of course, stick with one of the smaller T5-based models (flan-t5-xl, flan-t5-xxl), or see if you can get access to a GPU.
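And if you do want to try flan-t5-xl locally on the Mac, here’s a minimal, untested sketch using PyTorch’s MPS backend (assumes a recent PyTorch build with MPS support; the 3B checkpoint still needs on the order of 12 GB of RAM in fp32, and generation will be slow):

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# use Apple's Metal (MPS) backend when available, otherwise fall back to CPU
device = "mps" if torch.backends.mps.is_available() else "cpu"

path = "google/flan-t5-xl"
tokenizer = T5Tokenizer.from_pretrained(path)
model = T5ForConditionalGeneration.from_pretrained(path).to(device)

prompt = "Q: Generate 10 diverse questions asking about the current weather. A:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = model.generate(**inputs, do_sample=True, min_length=100, max_length=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))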