For an experiment of mine, I am trying to train a CausalLM from scratch, in particular Qwen/Qwen2.5-0.5B-Instruct, for a machine translation task. Since this is an experiment and I am aware that achieving good performance would require a great amount of both time and resources, I decided to use as …
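
For context, one common way to frame translation for a causal LM is to concatenate the source and target tokens into a single sequence and compute the loss only on the target part. The sketch below is a minimal illustration of that idea, not the setup from the post: `build_example` is a hypothetical helper, the token ids are made up, and it assumes the trainer ignores label value `-100` (the default `ignore_index` of PyTorch's `CrossEntropyLoss`).

```python
# Label value that PyTorch's CrossEntropyLoss skips by default (ignore_index=-100).
IGNORE_INDEX = -100


def build_example(source_ids, target_ids, eos_id):
    """Hypothetical helper: join source and target ids, mask the source in the labels.

    The model sees the full sequence as input, but only the target tokens
    (plus the end-of-sequence token) contribute to the loss.
    """
    input_ids = list(source_ids) + list(target_ids) + [eos_id]
    labels = [IGNORE_INDEX] * len(source_ids) + list(target_ids) + [eos_id]
    return {"input_ids": input_ids, "labels": labels}


# Made-up token ids, for illustration only.
example = build_example(source_ids=[11, 12, 13], target_ids=[21, 22], eos_id=2)
print(example)
```

In a real pipeline the ids would come from the model's tokenizer, and the source part would usually include an instruction or language tag so the model knows which direction to translate.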

Train a CausalLM for machine translation

John6666 January 1, 2025, 11:22pm 2

This may be an unresolved issue, or it may not have been filed as an issue on the transformers GitHub yet.
