Hi everyone, I’m the author of tasksource. I was frustrated by dataset alignment/harmonization for multitask learning, so I created a concise “language” to represent the harmonization preprocessing steps.
With a pip install and three lines of code, you can load hundreds of datasets that can be used interchangeably. I also trained a deberta-base-tasksource multitask model on all tasksource tasks and obtained very good results; I would say it’s the best base-size model on Hugging Face for classification/NLI/multiple-choice (it beats all of them on the model-recycling evaluation).
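To give a feel for the “three lines of code”, here is a minimal sketch of how loading a harmonized task could look. It assumes the `tasksource` package exposes `list_tasks` and `load_task` helpers and that `glue/rte` is one of the available task identifiers; check the tasksource README for the exact current API.

```python
# pip install tasksource
# Hypothetical usage sketch: list_tasks / load_task and the
# 'glue/rte' identifier are assumptions based on the tasksource docs.
from tasksource import list_tasks, load_task

tasks = list_tasks()          # dataframe-like listing of available tasks
dataset = load_task("glue/rte")  # harmonized dataset, interchangeable with others
print(dataset)
```

Because every task goes through the same harmonization preprocessing, any identifier returned by `list_tasks` should load into the same unified schema, which is what makes the datasets interchangeable for multitask training.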
I’m open to contributions and suggestions. Thanks!