Hi, I want to fine-tune a model for Text generation purposes using multiple datasets, i.e.
Suppose i have a dataset
- sciq
- metaeval/ScienceQA_text_only
- GAIR/lima
- Open-Orca/OpenOrca
- openbookqa
and I want to train my model using all this data, so I am not able to figure out how to do the preprocessing part. and train my model using multiple datasets.