I am trying to evaluate a model that uses the HF API on the SuperGLUE tasks, but I haven't been able to figure out how to properly preprocess the data for my model. From reading a few papers (most specifically RoBERTa), the authors format each example as two candidate sequences: sentence + "because" + choice1, and sentence + "because" + choice2. That part seems straightforward. What I'm not sure about is writing the custom loss function to work with my Trainer() loop. I can't wrap my head around how to select the candidate sequence that yields the highest probability and then use that for the loss, and I have the same question for the other tasks that also involve sentence pairs and choices.
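To make my current thinking concrete, here is a rough sketch of the per-choice cross-entropy I had in mind. The scores here are hard-coded stand-ins; in the real setup each score would be the model's logit for one sentence + "because" + choice sequence, and the names (`multiple_choice_loss`, `choice_scores`) are just my own placeholders:

```python
import math

def multiple_choice_loss(choice_scores, labels):
    """Cross-entropy over choices, treating the per-sequence scores
    as class logits, so no hard argmax is needed during training.

    choice_scores: list of per-example score lists,
                   e.g. [score(sent+because+choice1), score(sent+because+choice2)]
    labels:        list of correct-choice indices, one per example
    """
    total = 0.0
    for scores, y in zip(choice_scores, labels):
        # log-sum-exp with max subtraction for numerical stability
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[y]  # -log p(correct choice)
    return total / len(labels)

# Toy batch: two examples, two choices each; hard-coded scores
scores = [[2.0, 0.5], [0.1, 1.5]]
labels = [0, 1]
loss = multiple_choice_loss(scores, labels)
preds = [max(range(len(s)), key=lambda i: s[i]) for s in scores]
```

At prediction time, picking the choice with the highest score is just an argmax over the same per-choice scores, so the "choose the best choice" step only happens at evaluation, not inside the loss.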
Has anyone come across this before and knows the answer? Much appreciated!