I recently started learning NLP and using the HuggingFace (HF) library. I am working on a problem and would appreciate some guidance.
I am trying to fine-tune a GPT-2 model to generate course titles from course summaries. To measure the model's performance, it was suggested that I add new rows to my data in which the course summary and its title do not match (false pairs). I would then add a label column to the dataframe, with 1 for a correct summary and title pair and 0 for a false pair, and pass this column as the label to a custom HF Trainer.
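To make the data preparation concrete, here is a minimal sketch of how the false pairs could be built. It uses plain Python lists instead of a dataframe, and the function name `add_false_pairs`, the seed, and the toy rows are all made up for illustration. It assumes that pairing a summary with a different course's title is an acceptable way to create a false pair.

```python
import random


def add_false_pairs(pairs, seed=0):
    """Return (summary, title, label) rows: original pairs get label 1,
    and each summary is also paired with a wrong title and given label 0."""
    rng = random.Random(seed)
    titles = [title for _, title in pairs]

    # Keep every original pair as a positive example.
    rows = [(summary, title, 1) for summary, title in pairs]

    for summary, true_title in pairs:
        # Candidate wrong titles: any title that differs from the true one.
        wrong = [t for t in titles if t != true_title]
        if wrong:  # skip when no different title exists
            rows.append((summary, rng.choice(wrong), 0))
    return rows


# Hypothetical toy data, not from my real dataset.
pairs = [
    ("Covers tokenization, embeddings and transformers.", "Intro to NLP"),
    ("Covers derivatives, integrals and limits.", "Calculus I"),
    ("Covers SQL queries, joins and indexing.", "Databases 101"),
]
rows = add_false_pairs(pairs)
for summary, title, label in rows:
    print(label, "|", title, "|", summary)
```

The resulting rows can be loaded into a dataframe with the three columns described above. Sampling one false title per summary keeps the two classes balanced; that ratio is a choice, not a requirement.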
This new data (three columns: course summary, title, and a 0/1 label for the pair) is passed to the custom Trainer, and the label is used inside ‘compute_loss’, so that the model is trained as a classification task. The purpose is to teach the model which summary and title pairs are correct, so that when it generates a title for a given course summary, it generates a correct one. After training, we can evaluate the model on a hold-out set with EvalPrediction to check whether the generated titles are correct for their course summaries.
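As a rough sketch of the evaluation step, this is how a metric function for the Trainer's `compute_metrics` argument could compute ROC AUC. It assumes the model returns two logits per pair (false, correct), which is an assumption about the classification head and not something I have confirmed for my setup. `FakeEvalPrediction` is a hypothetical stand-in that only mimics the `predictions` and `label_ids` attributes of `EvalPrediction`, and the AUC is written out in plain Python so the example runs without extra packages; in practice `sklearn.metrics.roc_auc_score` would be the usual choice.

```python
from collections import namedtuple

# Hypothetical stand-in exposing the same two attributes as
# transformers.EvalPrediction, so the sketch runs without transformers.
FakeEvalPrediction = namedtuple("FakeEvalPrediction", ["predictions", "label_ids"])


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def compute_metrics(eval_pred):
    """Metric function in the shape the Trainer expects: takes the
    predictions object and returns a dict of named metrics."""
    # Assumption: each prediction row is [logit_false, logit_correct].
    # The logit difference ranks pairs the same way as the softmax
    # probability of the "correct" class.
    scores = [float(row[1]) - float(row[0]) for row in eval_pred.predictions]
    labels = [int(y) for y in eval_pred.label_ids]
    return {"roc_auc": roc_auc(scores, labels)}


# Made-up logits and labels for four hold-out pairs.
example = FakeEvalPrediction(
    predictions=[[0.2, 1.5], [1.0, -0.3], [0.1, 0.4], [0.9, 0.8]],
    label_ids=[1, 0, 1, 0],
)
metrics = compute_metrics(example)
print(metrics)
```

Note that a metric computed this way scores how well the model separates correct pairs from false pairs; it does not by itself score titles that the model generates.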
Please let me know if this is the right way to generate text and measure the model's effectiveness (using ROC AUC on the classification task). If not, please guide me through the correct steps. Thank you.