I tried fine-tuning GPT-2 for search query autocomplete, but it is not producing completions from the training data I provided. I used nearly 70k search strings separated by the <|endoftext|> token and fine-tuned with the default parameters. Currently it just generates random text. What should I do? Should I use a different approach, or am I missing something?
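For reference, the training file is just the queries concatenated with <|endoftext|> between them, and the fine-tuning was roughly along the lines of the sketch below (the file path, block size, and training arguments shown are placeholders; I kept the library defaults):

from transformers import (
    AutoModelWithLMHead,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    TextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelWithLMHead.from_pretrained("gpt2")

# train.txt: ~70k search queries joined with the <|endoftext|> token (placeholder path)
train_dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="train.txt",
    block_size=128,
)
# causal language modeling, so no masking
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="output1/",
    overwrite_output_dir=True,
    num_train_epochs=3,               # library default
    per_device_train_batch_size=8,    # library default
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("output1/")
tokenizer.save_pretrained("output1/")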
I am using the code below for query autocomplete:
from transformers import AutoModelWithLMHead, AutoTokenizer

# load the fine-tuned model and tokenizer from the fine-tuning output directory
tokenizer = AutoTokenizer.from_pretrained("output1/")
model = AutoModelWithLMHead.from_pretrained("output1/")
input_ids = tokenizer.encode('Vegetative reproduction of Agave', return_tensors='pt')
# set num_return_sequences > 1 to get multiple completions
beam_outputs = model.generate(
    input_ids,
    max_length=50,
    num_beams=10,
    no_repeat_ngram_size=2,
    num_return_sequences=10,
    early_stopping=True
)
# now we have 10 output sequences
print("Output:\n" + 100 * '-')
for i, beam_output in enumerate(beam_outputs):
    print("{}: {}".format(i, tokenizer.decode(beam_output, skip_special_tokens=False)))