Creating a T5 model for the Tamil language

I'm trying to build a QA system for the Tamil language.
I have prepared a dataset for building a language model (5 GB of raw text),
and also a dataset for the downstream question-answering task (format: context, question, answer).
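
To make the QA data format concrete, here is a minimal sketch of how I assume each record would be turned into a text-to-text pair for T5. The field names, the helper `to_t5_example`, the sample record, and the `question: ... context: ...` input layout are assumptions for illustration, not something I have confirmed is required.

```python
def to_t5_example(record):
    """Turn one (context, question, answer) record into a text-to-text pair.

    Assumed layout: the question and context are joined into one input
    string, and the answer is the target string.
    """
    source = (
        f"question: {record['question'].strip()} "
        f"context: {record['context'].strip()}"
    )
    target = record["answer"].strip()
    return {"input_text": source, "target_text": target}


# Hypothetical record (in English for readability; real data would be Tamil).
sample = {
    "context": "Chennai is the capital of Tamil Nadu.",
    "question": "What is the capital of Tamil Nadu?",
    "answer": "Chennai",
}

pair = to_t5_example(sample)
print(pair["input_text"])
print(pair["target_text"])
```

The same function could be mapped over the whole QA dataset before tokenization.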

Now, what steps should I follow to create the QA system?