How to train a transformer from scratch

Hi all!

I am trying to train a transformer from scratch using either Hugging Face Transformers or the nanoGPT repository. The task is to turn declarative sentences into questions (e.g., "The dog does bark" → "Does the dog bark?").

My questions are:

  1. Where can I find a good, straightforward guide to training a transformer from scratch?
  2. How do I structure the dataset? I was thinking of something like this:
    The dog does bark \n Does the dog bark?[EOS]
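
For reference, here is a minimal sketch of that layout in plain Python. It assumes the separator is a real newline character and that `[EOS]` is the end-of-sequence marker. The names `SEP`, `EOS`, `format_example`, and `split_prompt_and_target` are hypothetical helpers made up for illustration; they are not part of Hugging Face or nanoGPT.

```python
# Assumed markers: swap in whatever separator / end-of-sequence token
# your tokenizer actually defines.
SEP = "\n"
EOS = "[EOS]"


def format_example(declarative, question, sep=SEP, eos=EOS):
    """Join one (declarative, question) pair into a single training string."""
    return f"{declarative.strip()}{sep}{question.strip()}{eos}"


def split_prompt_and_target(example, sep=SEP):
    """Split a formatted example into the prompt (including the separator)
    and the target (the question plus the end-of-sequence marker)."""
    prompt, target = example.split(sep, 1)
    return prompt + sep, target


# Toy data, invented for illustration only.
pairs = [
    ("The dog does bark", "Does the dog bark?"),
    ("The cats can swim", "Can the cats swim?"),
]

examples = [format_example(d, q) for d, q in pairs]

for ex in examples:
    prompt, target = split_prompt_and_target(ex)
    print(repr(prompt), "->", repr(target))
```

At generation time you would feed the model only the prompt part (the declarative sentence plus the separator) and stop decoding once the end-of-sequence marker appears.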

Thank you all! :hugs:

I found The Annotated Transformer useful when I was trying to understand how transformers work:
https://nlp.seas.harvard.edu/annotated-transformer/#encoder-and-decoder-stacks