Hello,
Before I explain, please understand that I am not from an English-speaking country and my English may not be the best.
I’m preparing an experiment to reproduce the “Attention Is All You Need” paper.
Since “Attention Is All You Need” is the original Transformer paper, I want to implement the vanilla Transformer with Hugging Face.
Implementing it with Hugging Face is convenient because the library provides a generate() function with many decoding options.
So, how can I implement an encoder-decoder Transformer with the same architecture as in “Attention Is All You Need”?
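My current guess is to build a randomly initialized EncoderDecoderModel out of two BERT-style stacks configured with the paper’s base hyperparameters (d_model=512, 6 layers, 8 heads, d_ff=2048, dropout 0.1). The vocab_size of 32000 below is just a placeholder, and I know BERT blocks use learned rather than sinusoidal positional embeddings, so this is only an approximation:

```python
# Sketch: approximating the "Attention Is All You Need" base model with
# Hugging Face's EncoderDecoderModel. Hyperparameters follow the paper:
# d_model=512, 6 layers, 8 attention heads, d_ff=2048, dropout=0.1.
# Caveat: BERT-style blocks use *learned* positional embeddings, not the
# paper's sinusoidal encodings, so this is not an exact replication.
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

vocab_size = 32000  # placeholder; the paper uses a shared BPE/word-piece vocab

enc_cfg = BertConfig(
    vocab_size=vocab_size, hidden_size=512, num_hidden_layers=6,
    num_attention_heads=8, intermediate_size=2048,
    hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1,
)
dec_cfg = BertConfig(
    vocab_size=vocab_size, hidden_size=512, num_hidden_layers=6,
    num_attention_heads=8, intermediate_size=2048,
    hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1,
    is_decoder=True, add_cross_attention=True,  # decoder attends to encoder
)

config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=config)  # random init, no pretrained weights
```

After training, generate() should then be usable for decoding, though the special-token ids (decoder_start_token_id, pad_token_id) would still need to be set on the config to match the tokenizer. Is this the right direction?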
Thanks.