After fine-tuning a Llama-3-Instruct model, I use this code to generate text:
example = prompt.format("You are mimicking a linux server. Respond with what the terminal would respond when a command given. I want you to only reply with the terminal outputs inside one unique code block and nothing else. Do not write any explanations. Do not type any commands unless I instruct you to do so.", "echo 'Hellow world'","")
input_ids = tokenizer.encode(example, return_tensors="pt")
output = trainer.model.generate(
    input_ids,
    max_length=128,
    num_return_sequences=1,  # Adjust for multiple generations per input if needed
    no_repeat_ngram_size=2,  # To prevent repetition
    top_k=50,                # Top-k sampling
    top_p=0.95,              # Nucleus sampling
    do_sample=True,          # Sampling for diversity
    temperature=0.7          # Control creativity
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)[len(example):]
print('text ', example)
print('generated text', generated_text)
Generated text output:
generated text ```
Helloworld
### End. ###
### New Command:
###? ###
Please enter a new command. You can give a general command like `ls`, `cd`, or `mkdir`, a system command `echo`, etc. or a
How do I remove those unnecessary lines from the generated text?
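One possible post-processing workaround (a minimal sketch, assuming the extra lines always start with a ### marker like the '### End. ###' in the sample above and that the real answer sits inside the leading ``` fence; clean_output is just an illustrative helper name, not something from the code above):

import re

def clean_output(text):
    # Cut everything from the first "###" marker onwards (e.g. "### End. ###")
    text = re.split(r"\n?###", text, maxsplit=1)[0]
    # Strip any ``` fence characters the model emits around the output
    return text.replace("```", "").strip()

print('cleaned text:', clean_output(generated_text))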
I received the following warning: 'The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:128001 for open-end generation.'
I fixed it by changing the code to:
import re
example = prompt.format("You are mimicking a linux server. Respond with what the terminal would respond when a command given. I want you to only reply with the terminal outputs inside one unique code block and nothing else. Do not write any explanations. Do not type any commands unless I instruct you to do so.", "echo 'Hellow world'","")
inputs = tokenizer(
    example,
    return_tensors="pt",
    padding=True,
    truncation=True,
    max_length=256  # Adjust max_length as needed
)
input_ids = inputs["input_ids"]
attention_mask = inputs["attention_mask"]
# Generate output
output = trainer.model.generate(
    input_ids,
    attention_mask=attention_mask,       # Pass attention mask
    max_length=256,                      # Adjust max length as required
    num_return_sequences=1,              # Adjust for multiple generations per input if needed
    no_repeat_ngram_size=2,              # To prevent repetition
    top_k=50,                            # Top-k sampling
    top_p=0.95,                          # Nucleus sampling
    do_sample=True,                      # Sampling for diversity
    temperature=0.7,                     # Control creativity
    pad_token_id=tokenizer.eos_token_id  # Explicitly set the pad token ID
)
# Decode and print the result
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)[len(example):]
print('text:', example)
print('generated text:', generated_text)
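A note on the decode step: slicing with [len(example):] assumes the decoded string starts with the prompt character-for-character, which a tokenize/decode round trip doesn't always guarantee. A minimal sketch of an alternative that drops the prompt by token count instead (assuming, as with this decoder-only model, that output contains the input tokens followed by the newly generated ones):

# Decode only the newly generated tokens, skipping the prompt tokens
prompt_length = input_ids.shape[-1]
generated_text = tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
print('generated text:', generated_text)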