How to build and evaluate a vanilla transformer?

EncoderDecoderModel is supported via the Hugging Face API, though it isn’t possible to evaluate one as an AutoModel: #28721
How is it possible to build and evaluate a vanilla transformer with an encoder, cross-attention, and a decoder in Hugging Face?
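For reference, here is a minimal sketch of one way to build such a model with `transformers.EncoderDecoderModel`, constructing a small randomly initialized encoder and decoder from configs (no pretrained weights, no downloads). All the config sizes below are illustrative assumptions, not recommended values:

```python
import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

# Small, randomly initialized encoder and decoder configs.
# The sizes are arbitrary placeholders chosen to keep the example fast.
enc_cfg = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=4, intermediate_size=128)
dec_cfg = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=4, intermediate_size=128,
                     is_decoder=True, add_cross_attention=True)

config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=config)

# Needed so the model can shift labels into decoder inputs and so
# generation knows where to start.
model.config.decoder_start_token_id = 0
model.config.pad_token_id = 0

# Dummy batch: 2 source sequences of length 8, 2 target sequences of length 6.
input_ids = torch.randint(0, 1000, (2, 8))
labels = torch.randint(0, 1000, (2, 6))

out = model(input_ids=input_ids, labels=labels)
print(out.logits.shape)  # torch.Size([2, 6, 1000])
print(out.loss is not None)  # True
```

The decoder config must set `is_decoder=True` and `add_cross_attention=True`, which is what gives the model the cross-attention blocks between encoder and decoder that the question asks about.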

Model description

“Attention Is All You Need” is a landmark 2017 research paper authored by eight scientists at Google. It expanded the 2014 attention mechanism proposed by Bahdanau et al. into a new deep learning architecture, the transformer, consisting of an encoder, cross-attention, and a decoder.