Is it possible to train a translation model from scratch that translates English into infrastructure-as-code files?

Hi friends, I’d like to train my own model from scratch (custom tokenizer and custom model) to translate natural language (English) into infrastructure-as-code files. If this is possible, can anyone point me to introductory documentation on “from scratch” training? I have attached a link to my dataset for reference.