Hi, I was wondering how to select a model with the best accuracy vs. training time tradeoff, so I collected results for a few of the most popular models and measured their training time for 1 epoch on a couple of different tasks.
Results are here: Google Sheets (open it in Excel, not in Google Sheets)
My idea for using these results is as follows:
- Pick a baseline model
- Look at the related plots to see if there are models achieving comparable accuracy that are faster
- Stay with the baseline or pick the better model
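To make those steps concrete, here is a rough sketch of the filtering I have in mind. The model names, numbers, the `ModelResult` class, the `faster_alternatives` helper, and the 0.01 accuracy tolerance are all made up for illustration; they are not taken from the spreadsheet.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    name: str
    accuracy: float       # task metric, higher is better
    epoch_seconds: float  # measured training time for one epoch


def faster_alternatives(results, baseline_name, tolerance=0.01):
    """Return models that are faster per epoch than the baseline and
    within `tolerance` of its accuracy, fastest first."""
    baseline = next(r for r in results if r.name == baseline_name)
    candidates = [
        r for r in results
        if r.name != baseline_name
        and r.accuracy >= baseline.accuracy - tolerance
        and r.epoch_seconds < baseline.epoch_seconds
    ]
    return sorted(candidates, key=lambda r: r.epoch_seconds)


# Hypothetical numbers, only to show how the selection works.
results = [
    ModelResult("model_a", 0.90, 120.0),   # baseline
    ModelResult("model_b", 0.895, 60.0),   # comparable accuracy, faster
    ModelResult("model_c", 0.80, 30.0),    # faster but much less accurate
    ModelResult("model_d", 0.91, 200.0),   # more accurate but slower
]

picks = faster_alternatives(results, "model_a")
print([r.name for r in picks])
```

If the list comes back empty, I would stay with the baseline; otherwise the first entry is the fastest model with comparable accuracy.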
My concern is that 1 epoch of training for model A may be twice as fast as 1 epoch for model B, but model A may need 3 times as many epochs to reach model B's results. Papers usually don't specify how many epochs were used for fine-tuning, so comparing GLUE or SQuAD results against 1-epoch training time may be misleading.
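Here is the arithmetic behind that concern as a tiny sketch. The epoch times and epoch counts are invented to match the "twice as fast, 3 times the epochs" case, and `total_training_time` is just a helper name I made up.

```python
def total_training_time(epoch_seconds, epochs):
    """Total training cost is time per epoch times number of epochs."""
    return epoch_seconds * epochs


# Hypothetical numbers: model A is twice as fast per epoch as model B,
# but needs three times as many epochs to reach the same result.
time_a = total_training_time(epoch_seconds=50.0, epochs=9)
time_b = total_training_time(epoch_seconds=100.0, epochs=3)

print(time_a, time_b, time_a / time_b)
```

With these numbers model A ends up taking 1.5 times as long in total, even though it looks like the faster model when only 1 epoch is measured.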
What do you guys think, is my idea good or not? How do you select models for your tasks?