Hi everyone, I’m the author of tasksource. I was frustrated by dataset alignment/harmonization for multitask learning, so I created a concise “language” to represent the harmonization preprocessing steps.
With a pip install and three lines of code, you can load hundreds of datasets that can be used interchangeably. I also trained a deberta-base-tasksource multitask model on all tasksource tasks and obtained very good results; I would say it’s the best base-size model on Hugging Face for classification/NLI/multiple-choice (it beats all of them on the model-recycling evaluation).
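To give a feel for the “three lines of code”, here is a minimal sketch of how loading a harmonized task could look. It assumes the `tasksource` package exposes `list_tasks` and `load_task` helpers and that `glue/rte` is one of the available task identifiers; check the tasksource README for the exact current API.

```python
# pip install tasksource
# Hypothetical usage sketch: list_tasks / load_task and the
# 'glue/rte' identifier are assumptions based on the tasksource docs.
from tasksource import list_tasks, load_task

tasks = list_tasks()          # dataframe-like listing of available tasks
dataset = load_task("glue/rte")  # harmonized dataset, interchangeable with others
print(dataset)
```

Because every task goes through the same harmonization preprocessing, any identifier returned by `list_tasks` should load into the same unified schema, which is what makes the datasets interchangeable for multitask training.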
I’m open to contributions and suggestions. Thanks!