I recently completed Chapter 2 of the HuggingFace course and decided to write a post explaining how transformers work at a high level, without any mathematics, so that readers can get a general idea of what is going on inside them.
You can read it here:
If you have any comments, questions, suggestions, feedback, criticisms, or corrections, please let me know!