[Nov 15th Event] Jay Alammar: A gentle visual intro to Transformers models

Use this topic to ask your questions to Jay Alammar during his talk: A gentle visual intro to Transformers models.

You can watch it on YouTube or on Twitch at 8:45 AM PST.


Can we get the URL Jay was sharing at the start?


Here is the link Jay shared at the beginning of his talk: jalammar.github.io/Simple_Transformer_Language_Model.ipynb at master · jalammar/jalammar.github.io · GitHub


Do you see a future in symbolic learning rather than probabilistic approaches?

Hi, will these great video presentations be shared offline, so that one can watch them later?

The live stream can be viewed at any time on YouTube, and we will also edit the recording to share each talk as a separate YouTube video 🙂


That’s great. Thank you so much.

For T5, roughly three approaches were used to fine-tune the pre-trained model on downstream tasks: (1) fine-tuning all the pre-trained layers (all parameters), (2) freezing all the pre-trained layers and adding adapter layers, and (3) gradually unfreezing the pre-trained layers over time. This is clear in the paper (the tables show GLUE scores for each of them). My question is about BERT: which of these approaches is used when fine-tuning BERT?
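To make the three approaches concrete, here is a minimal PyTorch / Hugging Face sketch of what I mean (using `bert-base-uncased` purely for illustration; this is not the exact setup from the T5 paper, and a real adapter approach would insert small extra modules inside each layer rather than just training the head):

```python
from transformers import AutoModelForSequenceClassification

# Illustrative model; any BERT-style checkpoint would work the same way.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# (1) Fine-tune everything: all parameters stay trainable (the default).
for param in model.parameters():
    param.requires_grad = True

# (2) Freeze the pre-trained encoder and train only the added layers
#     (here just the classification head; adapters would be additional
#     small modules inserted into each encoder layer).
for param in model.bert.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True

# (3) Gradual unfreezing: start frozen, then unfreeze one more encoder
#     layer per epoch, working from the top layer downwards.
def unfreeze_top_layers(model, n_layers):
    layers = model.bert.encoder.layer
    for layer in layers[len(layers) - n_layers:]:
        for param in layer.parameters():
            param.requires_grad = True

for epoch in range(3):
    unfreeze_top_layers(model, n_layers=epoch + 1)
    # ... run one epoch of training here ...
```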

How do we know how much we should fine-tune a pre-trained model? Can you please share some strategies for producing a good fine-tuned model, keeping in mind that some bias will also carry over from the pre-trained model?

This question is answered at 1:10:35 in the main stream.

This is answered at 1:11:44 in the main stream.

What’s the easiest way to determine whether a text generation model is racist or sexist? And how do we address this quickly without losing too much of the training data?

Could you move this question to the right topic? Thanks in advance 🙂
