I read through the Gemma 3 blog post and created a Python script based on its Shakespeare example.
The example from the blog works for creative writing tasks, where some rambling is acceptable, but it generates random (and often repetitive) unwanted output when given a technical task such as writing SQL.
To clarify, the script successfully answers the question, but then keeps producing unwanted output until `max_new_tokens` is reached.
The code below shows my latest attempt to reduce the unwanted output using a repetition penalty.
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, TextIteratorStreamer, set_seed
from threading import Thread

set_seed(1)
ckpt = r"E:\Work\LLM\Laptop\gemma\gemma3_1b_it_model"
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="cpu"
)
tokenizer = AutoTokenizer.from_pretrained(ckpt)
messages = [
    [
        {
            "role": "system",
            "content": [{"type": "text", "text": "You are a helpful assistant who is an expert in ANSI SQL"}],
        },
        {
            "role": "user",
            "content": [{"type": "text", "text": "Write an SQL query to return all rows from the table called sales. Do not respond with anything other than the SQL query."}],
        },
    ],
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=False, tokenize=True,
    return_dict=True, return_tensors="pt"
).to("cpu")
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=False)
thread = Thread(
    target=model.generate,
    kwargs={
        "input_ids": inputs["input_ids"],
        "streamer": streamer,
        "max_new_tokens": 64,
        "repetition_penalty": 0.8,
        "do_sample": False,
    },
)
thread.start()
for new_text in streamer:
    print(new_text, end="")
thread.join()
```
This has helped a lot, but it still prints a random word or two after it has already generated the correct response.
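One thing worth checking: as far as I can tell, `repetition_penalty` in `transformers` divides the positive logits of already-generated tokens by the penalty and multiplies the negative ones. Under that rule, a value below 1.0 such as 0.8 would make repeats more likely, not less. The snippet below is a simplified scalar version of the rule; the function name is my own and not a library API.

```python
def apply_repetition_penalty(logit: float, penalty: float) -> float:
    """Rescale the logit of a token that has already been generated.

    Simplified scalar sketch of the rule described above (assumption:
    positive logits are divided by the penalty, negative ones multiplied).
    """
    return logit * penalty if logit < 0 else logit / penalty


if __name__ == "__main__":
    # Compare how a previously seen token is treated under each setting.
    for penalty in (0.8, 1.0, 1.2):
        boosted_pos = apply_repetition_penalty(2.0, penalty)
        boosted_neg = apply_repetition_penalty(-2.0, penalty)
        print(f"penalty={penalty}: +2.0 -> {boosted_pos:.3f}, -2.0 -> {boosted_neg:.3f}")
```

If that reading is right, a value slightly above 1.0 (for example 1.1 to 1.2) is the setting that actually discourages repetition.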
If the `stop_strings` parameter could be used to stop on `<end_of_turn>`, that might resolve the issue, but I haven’t found a way to make that work. Any other ideas would be greatly appreciated!
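A possibly simpler route than `stop_strings` is to tell `generate` that `<end_of_turn>` is an end-of-sequence token through the `eos_token_id` argument, which accepts a list of ids. This is a minimal sketch, assuming the tokenizer maps the literal string `<end_of_turn>` to a single token id; `build_generate_kwargs` is a hypothetical helper name, not part of `transformers`.

```python
def build_generate_kwargs(tokenizer, inputs, streamer, max_new_tokens=64):
    """Assemble kwargs for model.generate that stop at the turn marker.

    `tokenizer` is expected to expose `convert_tokens_to_ids` and
    `eos_token_id`; `inputs` is the dict returned by `apply_chat_template`
    with `return_dict=True`.
    """
    # Assumption: "<end_of_turn>" is a single special token in the vocabulary.
    end_of_turn_id = tokenizer.convert_tokens_to_ids("<end_of_turn>")
    stop_ids = [end_of_turn_id]
    # Keep the tokenizer's regular EOS token as a stop condition as well.
    if tokenizer.eos_token_id is not None and tokenizer.eos_token_id != end_of_turn_id:
        stop_ids.append(tokenizer.eos_token_id)
    return {
        "input_ids": inputs["input_ids"],
        # Passing the mask explicitly avoids relying on it being inferred.
        "attention_mask": inputs["attention_mask"],
        "streamer": streamer,
        "max_new_tokens": max_new_tokens,
        "do_sample": False,
        # Generation ends as soon as any of these ids is produced.
        "eos_token_id": stop_ids,
    }
```

It would be used as `Thread(target=model.generate, kwargs=build_generate_kwargs(tokenizer, inputs, streamer))`. It may also be worth trying `add_generation_prompt=True` in `apply_chat_template`, so that the prompt ends with the start of the model turn.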
Here is an example of the output:
python sql.py
SELECT *
FROM sales;
```<end_of_turn><end_of_turn><end_of_turn> Alley
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
<end_of_turn>
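Output like the above could also be trimmed on the printing side. The sketch below is a plain-Python filter (`stream_until` is a hypothetical name) that passes streamed chunks through until the first `<end_of_turn>`, including the case where the marker is split across two chunks. It only hides the extra text; generation itself would still run until `max_new_tokens`.

```python
def stream_until(chunks, stop="<end_of_turn>"):
    """Yield text from `chunks` up to, but not including, the first `stop`."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        idx = buffer.find(stop)
        if idx != -1:
            # Marker found: emit what precedes it and stop for good.
            if idx:
                yield buffer[:idx]
            return
        # Hold back a tail that could be the beginning of a split marker.
        safe = len(buffer) - (len(stop) - 1)
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    # No marker was ever seen: flush whatever is left.
    if buffer:
        yield buffer


if __name__ == "__main__":
    demo = ["SELECT *\nFROM sales;\n", "<end_", "of_turn>", " Alley"]
    for text in stream_until(demo):
        print(text, end="")
```

In the script it would wrap the streamer, as in `for new_text in stream_until(streamer): print(new_text, end="")`.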