Tasksource - Dataset harmonization for frictionless multi-task learning/evaluation

Hi everyone, I'm the author of tasksource. I was annoyed by how tedious dataset alignment/harmonization is for multitask learning, so I created a concise "language" to represent the harmonization preprocessing steps.
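To give a rough idea of what a declarative harmonization "language" can look like, here is a minimal sketch in plain Python. It is not the tasksource API: the `ClassificationSpec` class, its `harmonize` method, and the toy rows are all hypothetical names and data made up for illustration. The only point is that each dataset gets a one-line mapping from its own column names to a shared schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassificationSpec:
    """Hypothetical mapping from a dataset's column names to a shared schema.

    This is an illustration only, not the tasksource implementation.
    """
    sentence1: str
    labels: str
    sentence2: Optional[str] = None  # single-sentence tasks leave this unset

    def harmonize(self, rows):
        """Rename the columns of each row to the shared field names."""
        out = []
        for row in rows:
            item = {"sentence1": row[self.sentence1], "labels": row[self.labels]}
            if self.sentence2 is not None:
                item["sentence2"] = row[self.sentence2]
            out.append(item)
        return out


# Two toy datasets with different column names (made-up rows).
nli_rows = [{"premise": "A man sleeps.", "hypothesis": "A person rests.", "gold": 0}]
rte_rows = [{"text": "It rains.", "claim": "It is dry.", "label": 1}]

# One declarative line per dataset describes how to align it.
nli_spec = ClassificationSpec(sentence1="premise", sentence2="hypothesis", labels="gold")
rte_spec = ClassificationSpec(sentence1="text", sentence2="claim", labels="label")

# After harmonization, rows from both datasets share the same keys
# and can be concatenated or swapped freely.
harmonized = nli_spec.harmonize(nli_rows) + rte_spec.harmonize(rte_rows)
```

Once every dataset is described this way, the training or evaluation code only ever sees the shared field names, which is what makes the datasets interchangeable.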

With a pip install and 3 lines of code, you can load hundreds of datasets that can be used interchangeably. I also trained a deberta-base-tasksource multitask model on all tasksource tasks and obtained very good results. I would say that it's the best "base-size" model on huggingface for classification/NLI/multiple-choice (it beats all of them on the model-recycling evaluation).

I'm open to contributions and suggestions, thanks!
