SFT - training on generations only

@ybelkada @lvwerra Firstly, thank you very much for all the wonderful work that y’all do! :hugs:

I am trying to understand the difference between training on generations only vs. training on prompt + generation (which I guess is the default).
If I am building a codegen model with very particular instructions, which route would you suggest I go with? Thank you very much for taking the time to read this. I can’t wait to hear back from you.
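
To make sure I am describing the two options correctly, here is a minimal sketch of how I picture them at the label level. It is plain Python with made-up token IDs, and `build_labels` is a hypothetical helper I wrote for illustration, not a function from any library. The only thing I am assuming is the usual convention that a label of `-100` is ignored by the cross-entropy loss (the default `ignore_index` in PyTorch).

```python
# Label value that PyTorch's CrossEntropyLoss skips by default (ignore_index=-100).
IGNORE_INDEX = -100


def build_labels(prompt_ids, completion_ids, completion_only):
    """Hypothetical helper: return (input_ids, labels) for one training example."""
    input_ids = list(prompt_ids) + list(completion_ids)
    if completion_only:
        # Generations only: prompt positions are masked, so they add nothing to the loss.
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    else:
        # Prompt + generation: every token is a prediction target.
        labels = list(input_ids)
    return input_ids, labels


# Made-up token IDs, purely for illustration.
prompt_ids = [11, 12, 13]
completion_ids = [21, 22]

full_ids, full_labels = build_labels(prompt_ids, completion_ids, completion_only=False)
gen_ids, gen_labels = build_labels(prompt_ids, completion_ids, completion_only=True)

print(full_labels)
print(gen_labels)
```

In both cases the model still sees the full prompt as input; as far as I understand, the only difference is which positions contribute to the loss.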