Vanilla Transformer

Hi all,

Are the Transformer model and tokenizer used in the paper ‘Attention Is All You Need’ available on Hugging Face (HF)?

I want to reproduce the result in the paper.
(‘We use Transformer (Vaswani et al., 2017) as the basic model structure’)
They report a BLEU score of 28.4 using the basic Transformer model on the En-De task.

Any help will be appreciated!


Hi @Onlydrinkwater,

Were you able to find an implementation of the model and tokenizer?
If so, could you please share it with me?
Also, if you were able to replicate the results, could you share any tips and tricks for doing so?