Has anyone fine-tuned a Llama 3.2 1B model for a multilingual translation task? I am trying to fine-tune the 1B base model for an English, Japanese, Chinese, and Korean multilingual translation task and comparing checkpoints using BLEU scores across the 12 different language pairs… it seems impossible to get past a BLEU of 30.
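To make the evaluation concrete, here is a stripped-down sketch of the kind of scoring I mean, using sacrebleu. The file names, the pair list, and the tokenizer choice for the CJK targets are placeholders, not my exact script:

```python
# Minimal sketch of the BLEU evaluation; paths and tokenizer choices are
# placeholders, not my actual setup.
import sacrebleu

# 4 languages -> 12 ordered translation directions
pairs = ["en-ja", "en-zh", "en-ko", "ja-en", "zh-en", "ko-en",
         "ja-zh", "ja-ko", "zh-ja", "zh-ko", "ko-ja", "ko-zh"]

for pair in pairs:
    # one hypothesis / reference per line, same order (hypothetical file names)
    with open(f"hyp.{pair}.txt", encoding="utf-8") as f:
        hyps = [line.strip() for line in f]
    with open(f"ref.{pair}.txt", encoding="utf-8") as f:
        refs = [line.strip() for line in f]

    # assumption: character-level tokenization for ja/zh/ko targets,
    # sacrebleu's default "13a" for English
    tgt = pair.split("-")[1]
    tok = "char" if tgt in {"ja", "zh", "ko"} else "13a"
    bleu = sacrebleu.corpus_bleu(hyps, [refs], tokenize=tok)
    print(pair, round(bleu.score, 2))
```

Note that BLEU on Japanese/Chinese/Korean targets depends heavily on the tokenizer, so the exact number a "30" corresponds to shifts with that choice.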
I have tried continued pretraining (CPT) of the base model on a parallel corpus of these four languages, followed by fine-tuning on an instruction dataset.
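Roughly the shape of the CPT stage, as a sketch with Hugging Face transformers; the dataset path, hyperparameters, and data format are placeholders rather than my exact configuration, and the instruction-tuning stage reuses the same loop with the instruction data:

```python
# Sketch of the CPT stage: continue causal-LM training of the 1B base model
# on the parallel corpus. Dataset path and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# parallel corpus as JSONL with a "text" column (hypothetical file name)
raw = load_dataset("json", data_files="parallel_en_ja_zh_ko.jsonl", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

train = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama32-1b-cpt",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=train,
    # mlm=False -> standard next-token (causal LM) objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```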
Maybe the model is just too small for this task? Any insights would be helpful, thanks.