How to train a transformer from scratch

IParraMartin · May 2, 2024, 9:41pm

Hi all!

I am trying to train a transformer from scratch using either HuggingFace or the NanoGPT repository. The task will be to generate questions from declarative sentences (e.g., The dog does bark → Does the dog bark?).

My problems are:

1. Where can I find a good and straightforward way to train a transformer from scratch?
2. How do I structure the dataset? I was thinking of something like this:
   The dog does bark \n Does the dog bark?[EOS]
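
To make the format above concrete, here is a minimal sketch of how such training strings could be built and encoded at character level (similar in spirit to NanoGPT's character-level example). Everything here is an assumption for illustration: the helper names (format_example, make_prompt, build_char_vocab, encode) are hypothetical, the separator is a bare newline without surrounding spaces, and [EOS] is treated as one extra token rather than five characters.

```python
EOS = "[EOS]"  # assumed end-of-sequence marker, written as in the post
SEP = "\n"     # assumed separator between the declarative sentence and the question


def format_example(declarative: str, question: str) -> str:
    """Join one pair into a single training string: source, separator, target, EOS."""
    return f"{declarative.strip()}{SEP}{question.strip()}{EOS}"


def make_prompt(declarative: str) -> str:
    """At inference time, give the model only the source plus the separator."""
    return f"{declarative.strip()}{SEP}"


def build_char_vocab(examples):
    """Character-level vocabulary, with EOS added as a single extra token."""
    chars = sorted({ch for ex in examples for ch in ex.replace(EOS, "")})
    stoi = {ch: i for i, ch in enumerate(chars)}
    stoi[EOS] = len(stoi)
    return stoi


def encode(text, stoi):
    """Map a training string to token ids, keeping a trailing EOS as one id."""
    has_eos = text.endswith(EOS)
    body = text[: -len(EOS)] if has_eos else text
    ids = [stoi[ch] for ch in body]
    if has_eos:
        ids.append(stoi[EOS])
    return ids


# Toy pairs for illustration only; a real dataset would need many more.
pairs = [
    ("The dog does bark", "Does the dog bark?"),
    ("The cat does sleep", "Does the cat sleep?"),
]
examples = [format_example(d, q) for d, q in pairs]
stoi = build_char_vocab(examples)
encoded = [encode(ex, stoi) for ex in examples]
```

With a layout like this, generation would start from make_prompt(...) and stop once the model emits the EOS id. Whether to compute the loss on the whole string or only on the question part is a separate design choice that this sketch does not settle.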

Thank you all!

nimatov · May 2, 2024, 9:59pm

I found The Annotated Transformer useful when I was trying to understand how transformers work.
https://nlp.seas.harvard.edu/annotated-transformer/#encoder-and-decoder-stacks
