[Data processing] How to design a training loop for custom data with a GPT-2 model

Hello,
I am a newbie, and I am currently implementing a generation tool that produces an instruction or explanation for a piece of text.
For example, my data looks like this:
[BOS] text 1[MOS] text 2[EOS]

My goal is to train the model so that, given “text 1”, it generates “text 2”.
Could you help with the following:

  • How should I design the training loop, and which parameters should I use?
  • Please share some key points for building a custom model from the data above.

Thank you.

I recommend looking at this: Question answering
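
To make this concrete for the [BOS] text 1[MOS] text 2[EOS] format, here is a minimal sketch of a training loop, not a definitive implementation. It assumes the goal is to generate text 2 from text 1, that PyTorch and the Hugging Face transformers library are installed, and that the "gpt2" checkpoint is the starting point. The helper names (encode_pair, pad_batch, train), the [PAD] token, and the hyperparameter values are illustrative choices of mine, not tuned recommendations. The key idea is to set the labels for the text 1 part to -100, so the loss is computed only on text 2.

```python
# Special tokens from the question; "[PAD]" is an added assumption for batching.
SPECIAL_TOKENS = {
    "bos_token": "[BOS]",
    "eos_token": "[EOS]",
    "pad_token": "[PAD]",
    "additional_special_tokens": ["[MOS]"],
}
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips


def encode_pair(encode, text1, text2, max_len=128):
    """Return (input_ids, labels) for '[BOS]text1[MOS]text2[EOS]'.

    `encode` is any callable that maps a string to a list of token ids.
    Labels for the prompt part are masked, so the loss covers only text 2.
    """
    prompt = encode("[BOS]" + text1 + "[MOS]")
    target = encode(text2 + "[EOS]")
    input_ids = (prompt + target)[:max_len]
    labels = ([IGNORE_INDEX] * len(prompt) + target)[:max_len]
    return input_ids, labels


def pad_batch(examples, pad_id):
    """Right-pad a list of (input_ids, labels) pairs to a common length."""
    width = max(len(ids) for ids, _ in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ids, labels in examples:
        gap = width - len(ids)
        batch["input_ids"].append(ids + [pad_id] * gap)
        batch["attention_mask"].append([1] * len(ids) + [0] * gap)
        batch["labels"].append(labels + [IGNORE_INDEX] * gap)
    return batch


def train(pairs, epochs=3, batch_size=2, lr=5e-5):
    """Fine-tune GPT-2 on (text1, text2) pairs; hyperparameters are placeholders."""
    # Heavy imports live here so the helpers above can be used without them.
    import torch
    from torch.utils.data import DataLoader
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.add_special_tokens(SPECIAL_TOKENS)
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.resize_token_embeddings(len(tokenizer))  # make room for the new tokens

    examples = [encode_pair(tokenizer.encode, a, b) for a, b in pairs]

    def collate(items):
        batch = pad_batch(items, tokenizer.pad_token_id)
        return {key: torch.tensor(value) for key, value in batch.items()}

    loader = DataLoader(examples, batch_size=batch_size, shuffle=True,
                        collate_fn=collate)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    model.train()
    for epoch in range(epochs):
        for batch in loader:
            loss = model(**batch).loss  # the model shifts the labels internally
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
        print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
    return model, tokenizer
```

Calling train with a list of your own (text 1, text 2) pairs returns the fine-tuned model and tokenizer. To generate afterwards, you would encode "[BOS]" + text 1 + "[MOS]" and pass the ids to model.generate with eos_token_id=tokenizer.eos_token_id, so that decoding stops at [EOS].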
