T5-base results are worse than T5-small

Hi everyone,

I pretrained T5-small, -base, and -large on the PrivaSeer corpus with a span-corruption MLM objective, and call the resulting models PrivaT5. I then fine-tuned both PrivaT5 and the original T5 checkpoints (small, base, and large) on several tasks from the PrivacyGLUE benchmark. You can see the results in these plots:

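By span-corruption MLM I mean the standard T5 pretraining objective. Here is a minimal sketch of the input/target format using T5's sentinel tokens; the example sentence and the choice of masked spans are illustrative, not taken from the actual corpus:

```python
# Minimal sketch of T5-style span corruption with sentinel tokens.
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")

text = "We collect your email address to send you updates."
# Suppose the spans "your email address" and "updates" are sampled for corruption.
# Each span is replaced by a unique sentinel in the encoder input, and the decoder
# target lists the sentinels followed by the dropped-out spans.
inputs  = "We collect <extra_id_0> to send you <extra_id_1>."
targets = "<extra_id_0> your email address <extra_id_1> updates <extra_id_2>"

batch  = tokenizer(inputs, return_tensors="pt")
labels = tokenizer(targets, return_tensors="pt").input_ids
# During pretraining, the model learns to generate `labels` given `batch`.
```
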
For all model sizes I used the same hyperparameters, except for the batch size, which I changed so that each model would fit on the TPU. Example:

--model_name_or_path="t5-base"
--hub_save_name_or_path="t5-base"
--model_type="t5-base"
--config_name="t5-base"
--tokenizer_name="t5-base"
--max_seq_length="512"
--per_device_train_batch_size="16"
--per_device_eval_batch_size="16"
--adafactor
--learning_rate="0.001"
--weight_decay="0.0"
--warmup_steps="0"
--overwrite_output_dir
--logging_steps="500"
--save_steps="50"
--eval_steps="50"
--num_train_epochs="100"

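For reference, here is a rough sketch of how these flags would map onto Hugging Face `Seq2SeqTrainingArguments` (not the exact script I ran). Note that `gradient_accumulation_steps` is not a flag I actually set; I'm including it only to show one way to keep the effective batch size identical across model sizes when the per-device batch size has to shrink:

```python
# Hedged sketch: assumes a standard Hugging Face fine-tuning script;
# output_dir is a hypothetical path, gradient_accumulation_steps is illustrative.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="privat5-base-finetune",  # hypothetical output path
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,       # e.g. 16 x 2 = 32 effective batch size
    learning_rate=1e-3,
    weight_decay=0.0,
    warmup_steps=0,
    optim="adafactor",
    evaluation_strategy="steps",
    eval_steps=50,
    save_steps=50,
    logging_steps=500,
    num_train_epochs=100,
    overwrite_output_dir=True,
)
```
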
Could anyone suggest possible reasons why PrivaT5-base performance unexpectedly drops on the OPP-115 and Policy-Detection tasks compared to PrivaT5-small? (These are multi-label text classification and binary text classification, respectively.)

Thank you!
