Hi there,
In the Hugging Face transformers library, is there an implementation of the original Transformer model described in the paper "Attention Is All You Need"?
I browsed the documentation site and saw implementations like BERT, T5, etc., but I didn't find the original model.
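For context, I mean the plain encoder–decoder architecture from the paper, e.g. what PyTorch exposes as `torch.nn.Transformer` (shown here only to illustrate what I'm after, not as a transformers-library API):

```python
import torch
import torch.nn as nn

# PyTorch's built-in module implements the original encoder-decoder
# architecture; its defaults match the paper (d_model=512, nhead=8,
# 6 encoder layers, 6 decoder layers).
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6)

src = torch.rand(10, 32, 512)  # (source seq len, batch, d_model)
tgt = torch.rand(20, 32, 512)  # (target seq len, batch, d_model)
out = model(src, tgt)
print(out.shape)  # torch.Size([20, 32, 512])
```

Is there an equivalent model (or checkpoint) in the transformers library itself?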
Thanks for the clarification.
Kind Regards