Train a CausalLM for machine translation

This may be an unresolved issue, or it may simply never have been filed as an issue on the transformers GitHub repository.