Fine-tune a text generation model using different types of data

Hi, I want to fine-tune a model for text generation using multiple datasets. For example, suppose I have these datasets:

  • sciq
  • metaeval/ScienceQA_text_only
  • GAIR/lima
  • Open-Orca/OpenOrca
  • openbookqa
    and I want to train my model on all of this data together, but I can't figure out how to do the preprocessing so that I can train on multiple datasets at once.
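One common approach (just a sketch, not the only way) is to convert every dataset into a single shared schema, e.g. a `prompt`/`response` pair, by writing one small formatting function per dataset, then concatenating the formatted examples and shuffling before tokenization. The field names below (`question`, `support`, `correct_answer`, `system_prompt`, …) are my recollection of the sciq and OpenOrca schemas, so double-check them against the actual dataset cards:

```python
import random

# One formatter per dataset: each maps a raw example into the shared
# {"prompt": ..., "response": ...} schema used for training.
def format_sciq(ex):
    # sciq rows (assumed schema): "question", "support" (context), "correct_answer"
    return {"prompt": f"{ex['support']}\n\nQuestion: {ex['question']}",
            "response": ex["correct_answer"]}

def format_openorca(ex):
    # OpenOrca rows (assumed schema): "system_prompt", "question", "response"
    return {"prompt": f"{ex['system_prompt']}\n\n{ex['question']}",
            "response": ex["response"]}

FORMATTERS = {"sciq": format_sciq, "openorca": format_openorca}

def build_mixture(named_datasets, seed=42):
    """named_datasets: {name: list of raw examples}. Returns one shuffled list
    of {"prompt", "response"} dicts drawn from all datasets."""
    mixed = []
    for name, rows in named_datasets.items():
        fmt = FORMATTERS[name]
        mixed.extend(fmt(ex) for ex in rows)
    random.Random(seed).shuffle(mixed)  # fixed seed for reproducibility
    return mixed

# Tiny in-memory stand-ins for the real datasets:
sciq_rows = [{"question": "What gas do plants absorb?",
              "support": "Plants use photosynthesis.",
              "correct_answer": "carbon dioxide"}]
orca_rows = [{"system_prompt": "You are a helpful assistant.",
              "question": "Name a prime number.",
              "response": "2"}]

mixture = build_mixture({"sciq": sciq_rows, "openorca": orca_rows})
```

With the Hugging Face `datasets` library you can do the same thing without the in-memory lists: `load_dataset(...)` each corpus, apply the matching formatter with `.map(...)` (dropping the original columns), then combine with `datasets.concatenate_datasets([...]).shuffle(seed=42)` before tokenizing. After that, every dataset looks identical to the `Trainer`, so training proceeds as with a single dataset.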