[Nov 15th Event] Jay Alammar: A gentle visual intro to Transformers models

Use this topic to ask your questions to Jay Alammar during his talk: A gentle visual intro to Transformers models.

You can watch it on YouTube or on Twitch at 8:45 AM PST.


Can we get the URL Jay was sharing at the start?


Here is the link Jay shared at the beginning of his talk: jalammar.github.io/Simple_Transformer_Language_Model.ipynb at master · jalammar/jalammar.github.io · GitHub


Do you see a future in symbolic learning rather than probabilistic approaches?

Hi, will these great video presentations be shared offline, so that one can watch them later?

The live stream can be viewed at any time on YouTube, and we will also edit the recording to share each talk as a separate YouTube video 🙂


That’s great. Thank you so much.

For T5, roughly three approaches were used to fine-tune the pre-trained model on downstream tasks: (1) fine-tuning all the pre-trained layers (all parameters), (2) freezing all the pre-trained layers and adding adapter layers, and (3) gradually unfreezing the pre-trained layers over time. This is clear in the paper (the tables show GLUE scores for each of them). My question is about BERT: which of these approaches is used when fine-tuning BERT?
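To make the three approaches concrete, here is a minimal PyTorch / Hugging Face sketch of what I mean (using `bert-base-uncased` purely for illustration; this is not the exact setup from the T5 paper, and a real adapter approach would insert small extra modules inside each layer rather than just training the head):

```python
from transformers import AutoModelForSequenceClassification

# Illustrative model; any BERT-style checkpoint would work the same way.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# (1) Fine-tune everything: all parameters stay trainable (the default).
for param in model.parameters():
    param.requires_grad = True

# (2) Freeze the pre-trained encoder and train only the added layers
#     (here just the classification head; adapters would be additional
#     small modules inserted into each encoder layer).
for param in model.bert.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True

# (3) Gradual unfreezing: start frozen, then unfreeze one more encoder
#     layer per epoch, working from the top layer downwards.
def unfreeze_top_layers(model, n_layers):
    layers = model.bert.encoder.layer
    for layer in layers[len(layers) - n_layers:]:
        for param in layer.parameters():
            param.requires_grad = True

for epoch in range(3):
    unfreeze_top_layers(model, n_layers=epoch + 1)
    # ... run one epoch of training here ...
```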

How do we know how much we should fine-tune a pre-trained model? Can you please share some strategies for producing a good fine-tuned model, keeping in mind that some bias will also carry over from the pre-trained model?

This question is answered at 1:10:35 in the main stream.

This is answered at 1:11:44 in the main stream.

What’s the easiest way to determine whether a text generation model is racist or sexist? And how do we address this quickly without losing too much of the training data?

Could you move this question to the right topic? Thanks in advance 🙂
