My PyData Global 2025 presentation “I Built a Transformer from Scratch So You Don’t Have To” is now available

Hi everyone,
I’m happy to share that the recording of my PyData Global 2025 presentation,
“I Built a Transformer from Scratch So You Don’t Have To”, is now available on YouTube.

This presentation covers the following topics and is based on the “Implementing Transformer from Scratch” tutorial on the Hugging Face Hub:

• How the original Transformer architecture works

• How to translate each component into PyTorch

• Key ideas: attention, masking, positional encoding, FFN

• A decoder-only forward pass, step-by-step

• Common implementation bugs — and how to debug them

• Where to go next (code, tutorials, training references)

Every line of code in the tutorial was manually verified and tested against the equations in the original 2017 paper. I hope you find the presentation and/or the tutorial helpful.
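As a small taste of what the tutorial walks through, here is a minimal sketch of scaled dot-product attention with a causal mask in PyTorch. This is my own illustrative snippet, not code copied from the tutorial, so names and shapes are simplified:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per the 2017 paper
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Positions where mask is False cannot be attended to
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

# Causal mask: each position attends only to itself and earlier positions
seq_len, d_model = 4, 8
causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
x = torch.randn(seq_len, d_model)
out, attn = scaled_dot_product_attention(x, x, x, mask=causal)
print(out.shape)  # torch.Size([4, 8])
```

The tutorial builds this up component by component and shows where the subtle bugs (wrong mask shape, missing scaling) tend to creep in.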

:microphone: You can find the recording here
:link: Repo: bird-of-paradise/transformer-from-scratch-tutorial · Datasets at Hugging Face

Feel free to like :+1: , share :hugs: , and spread the open science :nerd_face: !

– Jen :hibiscus:
