What Are the Latest Methods to Evaluate Instruction-Tuned Models on a Custom Test Set?

supercoolaj · November 17, 2023, 4:23am

Hello everyone, I’m wondering whether there has been any progress on automatically evaluating instruction-tuned models on custom datasets, i.e. on comparing how well different models follow instructions. I’m currently fine-tuning a model and want to evaluate it on the Self-Instruct test set.

The only approaches I’m vaguely aware of use either a human or a stronger model as the judge, but I wonder whether there are newer developments. References to papers are certainly welcome. Thank you.
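To make the judge-based setup concrete, here is a rough sketch of what I have in mind: a pairwise win rate between two models over the same test instructions. Everything in it is an assumption for illustration. The `judge` callable is a hypothetical placeholder for a human rater or a stronger model, and `longer_answer_judge` is only a toy stand-in so the example runs. Each pair is judged twice with the answer order swapped, which is one simple way to cancel out a judge that favors whichever answer it sees first.

```python
from typing import Callable, Sequence

# Hypothetical judge interface: given an instruction and two candidate
# answers, return "A", "B", or "tie". In practice this would wrap a
# human rater or a call to a stronger model.
Judge = Callable[[str, str, str], str]


def pairwise_win_rate(
    instructions: Sequence[str],
    outputs_x: Sequence[str],
    outputs_y: Sequence[str],
    judge: Judge,
) -> float:
    """Return the fraction of instructions on which model X beats model Y.

    Ties (including split verdicts across the two orderings) count as 0.5.
    """
    if not (len(instructions) == len(outputs_x) == len(outputs_y)):
        raise ValueError("all inputs must have the same length")
    if not instructions:
        raise ValueError("need at least one instruction")

    score = 0.0
    for inst, x, y in zip(instructions, outputs_x, outputs_y):
        first = judge(inst, x, y)   # X is shown in position A
        second = judge(inst, y, x)  # X is shown in position B
        x_wins = (first == "A") + (second == "B")
        y_wins = (first == "B") + (second == "A")
        if x_wins > y_wins:
            score += 1.0
        elif x_wins == y_wins:
            score += 0.5
    return score / len(instructions)


def longer_answer_judge(instruction: str, a: str, b: str) -> str:
    """Toy judge (assumption, for demonstration only): prefers the longer answer."""
    if len(a) > len(b):
        return "A"
    if len(b) > len(a):
        return "B"
    return "tie"


if __name__ == "__main__":
    prompts = ["Explain overfitting.", "Name a prime number."]
    model_x = ["Overfitting is when a model memorizes noise.", "7"]
    model_y = ["It is bad.", "The number 7 is prime."]
    print(pairwise_win_rate(prompts, model_x, model_y, longer_answer_judge))
```

The resulting number is only as trustworthy as the judge, so swapping in a real judge and checking its agreement with a few human labels would be the next step.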
