I was wondering if I can fine-tune a model such as Falcon-7B to perform two tasks at once, i.e. answer and translate. Let’s assume I have a dataset with questions in English and answers in Urdu: will fine-tuning be enough, or will I need to train from scratch? The second option would require me to collect a huge dataset, which I really want to avoid.
Thanks for any help and guidance.
BTW, I’m limited to an RTX 4090 for now, but a test run of this is working.
It’s definitely possible to fine-tune an LLM like Falcon-7B on several tasks at the same time (also called multi-task fine-tuning). However, looking at the model card (tiiuae/falcon-7b · Hugging Face), this model was pre-trained on English and French only, so it probably doesn’t know a lot of Urdu. It might be better to look for an LLM that has Urdu in its pre-training data. You can filter the HF models by “text generation” and “language=urdu”: Models - Hugging Face.
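If you prefer to do that search programmatically, here is a minimal sketch using the `huggingface_hub` client (assuming a reasonably recent version of the library); the exact models returned will of course depend on what is currently on the Hub:

```python
# Minimal sketch: list Urdu-capable text-generation models on the Hub,
# equivalent to the website filters mentioned above.
# Requires: pip install huggingface_hub
from huggingface_hub import HfApi

api = HfApi()

# Hub tags can be passed as filters: "text-generation" (task) and "ur" (language).
models = api.list_models(
    filter=["text-generation", "ur"],
    sort="downloads",
    direction=-1,
    limit=10,
)

for m in models:
    print(m.id)
```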
Alternatively, you could also take a look at existing models primarily targeted at machine translation, such as Meta’s NLLB model series: NLLB. This one can translate between 200 languages with a single model. Hence you could further fine-tune it if you have additional translation pairs.
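As a quick illustration (not a full fine-tuning recipe), here is a minimal sketch of English-to-Urdu inference with the distilled 600M NLLB checkpoint, which should also fit comfortably on a 24 GB card; the example sentence is just a placeholder:

```python
# Minimal sketch: English -> Urdu translation with a distilled NLLB checkpoint.
# Requires: pip install transformers sentencepiece torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/nllb-200-distilled-600M"  # small enough for a 24 GB GPU
tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

text = "How far is the Andromeda galaxy from Earth?"
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start generating Urdu (FLORES-200 code "urd_Arab").
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("urd_Arab"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Further fine-tuning on your own translation pairs would then follow the usual seq2seq training setup (e.g. with the Trainer), starting from this same checkpoint.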
Thank you for the response, nielsr. The English-to-Urdu task will just be a test case; eventually we will be training the model to translate some astronomical data into a brief English summary (fingers crossed for that one). I will try to find a model that has been pretrained on multiple languages and is small enough to fit on the GPU I have. I should have a few A100s in a couple of months, which will allow me to train/fine-tune bigger models.