Getting wrong responses after fine-tuning the google/flan-t5-small model?

I need to build a generative question-answering system. I selected the flan-t5-small model and fine-tuned it on a dataset of 433 question-answer pairs. However, when I run inference on some questions after fine-tuning, the fine-tuned model generates questions as its output instead of answers.
Also, please suggest the best model for this problem.
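
One thing that may be worth checking first is whether the training targets really are the answers and not the questions. The following is only a minimal sketch for that sanity check, assuming the dataset can be loaded as a list of (question, answer) string pairs. The prompt prefix, the field names `input_text` and `target_text`, and both helper functions are hypothetical and are not taken from my actual training script.

```python
from typing import Dict, List, Tuple

# Hypothetical prompt prefix (an assumption, not part of the original setup).
PROMPT_PREFIX = "Answer the question: "


def build_examples(pairs: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Turn (question, answer) pairs into source/target text for seq2seq training."""
    examples = []
    for question, answer in pairs:
        examples.append({
            "input_text": PROMPT_PREFIX + question.strip(),  # what the encoder sees
            "target_text": answer.strip(),                   # what the decoder should learn
        })
    return examples


def find_suspect_targets(examples: List[Dict[str, str]]) -> List[int]:
    """Return indices of examples whose target is the question itself."""
    suspects = []
    for i, ex in enumerate(examples):
        question = ex["input_text"][len(PROMPT_PREFIX):]
        if ex["target_text"].lower() == question.lower():
            suspects.append(i)
    return suspects


if __name__ == "__main__":
    # Toy data for illustration only; the second pair is deliberately wrong.
    pairs = [
        ("What is T5?", "A text-to-text transformer."),
        ("Who released it?", "Who released it?"),
    ]
    examples = build_examples(pairs)
    print(find_suspect_targets(examples))
```

If this check flags many rows, the labels were probably built from the question column, which would match the behaviour described above. If it flags none, the cause is likely elsewhere (for example, tokenization of the labels or the generation settings).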
